
issues: 396285440


id: 396285440
node_id: MDU6SXNzdWUzOTYyODU0NDA=
number: 2656
title: dataset info in .json format
user: 1197350
state: closed
locked: 0
comments: 9
created_at: 2019-01-06T19:13:34Z
updated_at: 2020-01-08T22:43:25Z
closed_at: 2019-01-21T23:25:56Z
author_association: MEMBER

I am exploring the world of SpatioTemporal Asset Catalogs (STAC), in which all datasets are described using JSON/GeoJSON:

> The STAC specification aims to standardize the way geospatial assets are exposed online and queried.

I am thinking about how to put the sort of datasets that xarray deals with into STAC items (see https://github.com/radiantearth/stac-spec). This would be particularly valuable in the context of Pangeo and the zarr-based datasets we have been putting in cloud storage.

For this purpose, it would be very useful to have a concise summary of an xarray dataset's contents (minus the actual data) in .json format. I'm talking about the kind of info we currently get from the `.info()` method, which is designed to mirror the CDL output of `ncdump -h`.

For example:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({'foo': ('x', np.ones(10, 'f8'), {'units': 'm s-1'})},
                {'x': ('x', np.arange(10), {'units': 'm'})},
                {'conventions': 'made up'})
ds.info()
```

```
xarray.Dataset {
dimensions:
    x = 10 ;

variables:
    float64 foo(x) ;
        foo:units = m s-1 ;
    int64 x(x) ;
        x:units = m ;

// global attributes:
    :conventions = made up ;
}
```

I would like to be able to do `ds.info(format='json')` and see something like this:

    {
      "coords": {
        "x": {
          "dims": [
            "x"
          ],
          "attrs": {
            "units": "m"
          }
        }
      },
      "attrs": {
        "conventions": "made up"
      },
      "dims": {
        "x": 10
      },
      "data_vars": {
        "foo": {
          "dims": [
            "x"
          ],
          "attrs": {
            "units": "m s-1"
          }
        }
      }
    }

This is what I get by running `print(json.dumps(ds.to_dict(), indent=2))` and manually stripping out all the data fields. So an alternative API might be something like `ds.to_dict(data=False)`.
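To make the manual stripping step concrete, here is a minimal sketch. The helper name `strip_data` is hypothetical, not part of xarray, and the `full` dictionary is a hand-written stand-in for what `ds.to_dict()` returns for the example dataset. It assumes each variable entry stores its values under a `"data"` key.

```python
import json


def strip_data(dataset_dict):
    """Return a copy of a ds.to_dict()-style mapping without the 'data' entries.

    Hypothetical helper for illustration only; it is not an xarray function.
    """
    summary = {
        "attrs": dict(dataset_dict.get("attrs", {})),
        "dims": dict(dataset_dict.get("dims", {})),
    }
    for group in ("coords", "data_vars"):
        # Keep every per-variable field (dims, attrs, ...) except the values.
        summary[group] = {
            name: {key: value for key, value in var.items() if key != "data"}
            for name, var in dataset_dict.get(group, {}).items()
        }
    return summary


# Hand-written stand-in for ds.to_dict() on the example dataset (assumed layout).
full = {
    "coords": {
        "x": {"dims": ("x",), "attrs": {"units": "m"}, "data": list(range(10))}
    },
    "attrs": {"conventions": "made up"},
    "dims": {"x": 10},
    "data_vars": {
        "foo": {"dims": ("x",), "attrs": {"units": "m s-1"}, "data": [1.0] * 10}
    },
}

summary = strip_data(full)
print(json.dumps(summary, indent=2))
```

With a real dataset, the same helper could be applied as `strip_data(ds.to_dict())`. A built-in `data=False` option would make this post-processing unnecessary and would avoid loading the array values in the first place.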

If anyone is aware of an existing spec for expressing Common Data Language in json, we should probably use that instead of inventing our own. But I think some version of this would be a very useful addition to xarray.

reactions:

    {
        "url": "https://api.github.com/repos/pydata/xarray/issues/2656/reactions",
        "total_count": 0,
        "+1": 0,
        "-1": 0,
        "laugh": 0,
        "hooray": 0,
        "confused": 0,
        "heart": 0,
        "rocket": 0,
        "eyes": 0
    }

state_reason: completed
repo: 13221727
type: issue
