
issues


2 rows where "closed_at" is on date 2019-06-25 and user = 35968931 sorted by updated_at descending

Issue 2159: Concatenate across multiple dimensions with open_mfdataset

  • id: 324350248 · node_id: MDU6SXNzdWUzMjQzNTAyNDg=
  • user: TomNicholas (35968931) · author_association: MEMBER
  • state: closed · locked: 0 · comments: 27
  • created_at: 2018-05-18T10:10:49Z · updated_at: 2019-09-16T18:54:39Z · closed_at: 2019-06-25T15:50:33Z

Code Sample

```python
import numpy as np
import xarray as xr

# Create 4 datasets containing sections of contiguous (x, y) data
for i, x in enumerate([1, 3]):
    for j, y in enumerate([10, 40]):
        ds = xr.Dataset({'foo': (('x', 'y'), np.ones((2, 3)))},
                        coords={'x': [x, x + 1], 'y': [y, y + 10, y + 20]})
        ds.to_netcdf('ds.' + str(i) + str(j) + '.nc')

# Try to open them all in one go
ds_read = xr.open_mfdataset('ds.*.nc')
print(ds_read)
```

Problem description

Currently xr.open_mfdataset will detect a single common dimension and concatenate Datasets along that dimension. However, a common use case is a set of netCDF files that need to be concatenated along two or more common dimensions simultaneously (for example, when collecting the output of any large-scale simulation that parallelizes across more than one dimension). For the behaviour of xr.open_mfdataset to be n-dimensional, it should automatically recognise and concatenate along all common dimensions.

Expected Output

```
<xarray.Dataset>
Dimensions:  (x: 4, y: 6)
Coordinates:
  * x        (x) int64 1 2 3 4
  * y        (y) int64 10 20 30 40 50 60
Data variables:
    foo      (x, y) float64 dask.array<shape=(4, 6), chunksize=(2, 3)>
```

Current output of xr.open_mfdataset()

```
<xarray.Dataset>
Dimensions:  (x: 4, y: 12)
Coordinates:
  * x        (x) int64 1 2 3 4
  * y        (y) int64 10 20 30 40 50 60 10 20 30 40 50 60
Data variables:
    foo      (x, y) float64 dask.array<shape=(4, 12), chunksize=(4, 3)>
```
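
For reference, a minimal sketch of how this request is served in released xarray (0.12.2 and later, which added the combine keyword to open_mfdataset), assuming the four files written by the code sample above:

```python
import xarray as xr

# Order the pieces automatically from their coordinate values:
ds_read = xr.open_mfdataset('ds.*.nc', combine='by_coords')

# Or specify the layout explicitly: the nesting of the list mirrors the
# 2x2 grid, and concat_dim names the dimension of each nesting level.
ds_read = xr.open_mfdataset(
    [['ds.00.nc', 'ds.01.nc'], ['ds.10.nc', 'ds.11.nc']],
    combine='nested',
    concat_dim=['x', 'y'],
)
```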

reactions:

```json
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2159/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
```
state_reason: completed · repo: xarray (13221727) · type: issue

Pull request 2616: API for N-dimensional combine

  • id: 391865060 · node_id: MDExOlB1bGxSZXF1ZXN0MjM5MjYzOTU5
  • user: TomNicholas (35968931) · author_association: MEMBER
  • state: closed · comments: 25 · draft: 0 · pull_request: pydata/xarray/pulls/2616
  • created_at: 2018-12-17T19:51:32Z · updated_at: 2019-06-25T16:18:29Z · closed_at: 2019-06-25T15:14:34Z

Continues the discussion from #2553 about how the API for loading and combining data from multiple datasets should work. (Ultimately part of the solution to #2159)

@shoyer this is for you to see how I envisaged the API would look, based on our discussion in #2553. For now you can ignore all the changes except the ones to the docstrings of auto_combine here, manual_combine here and open_mfdataset here.

Feedback from anyone else is also encouraged, as really the point of this is to make the API as clear as possible to someone who hasn't delved into the code behind auto_combine and open_mfdataset.

It makes sense to first work out the API, then change the internal implementation to match, using the internal functions developed in #2553. Therefore the tasks include:

  • [x] Decide on API for 'auto_combine' and 'open_mfdataset'
  • [x] Appropriate documentation
  • [x] Write internal implementation of manual_combine
  • [x] Write internal implementation of auto-combine
  • [x] Update open_mfdataset to match
  • [x] Write and reorganise tests
  • [x] Automatic ordering of string and datetime coords
  • [x] What's new explaining changes
  • [x] Make sure auto_combine and manual_combine appear on the API page of the docs
  • [x] PEP8 compliance
  • [x] Python 3.5 compatibility
  • [x] AirSpeedVelocity tests for auto_combine
  • [x] Finish all TODOs
  • [x] Backwards-compatible API to start deprecation cycle
  • [x] Add examples from docstrings to main documentation pages
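
For reference, a sketch of the combine API as it eventually shipped (in xarray 0.12.2, where the manual_combine and auto_combine discussed here were released as combine_nested and combine_by_coords); the dataset construction mirrors the example from #2159:

```python
import numpy as np
import xarray as xr

# Build the same 2x2 grid of datasets used in #2159, in memory this time.
datasets = [
    [xr.Dataset({'foo': (('x', 'y'), np.ones((2, 3)))},
                coords={'x': [x, x + 1], 'y': [y, y + 10, y + 20]})
     for y in [10, 40]]
    for x in [1, 3]
]

# "Manual" combine: the nesting of the list encodes the grid layout,
# and concat_dim names the dimension of each nesting level.
combined = xr.combine_nested(datasets, concat_dim=['x', 'y'])

# "Auto" combine: infer the ordering from the coordinate values alone.
flat = [ds for row in datasets for ds in row]
combined = xr.combine_by_coords(flat)
```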

reactions:

```json
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2616/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
```
repo: xarray (13221727) · type: pull


Table schema:

```sql
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
```
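
Given that schema, the query behind this page ("closed_at" on 2019-06-25, user 35968931, sorted by updated_at descending) can be reproduced against a local copy of the database. A sketch using Python's sqlite3 module; github.db is a hypothetical filename for that local copy:

```python
import sqlite3

# 'github.db' is a hypothetical local copy of the database behind this page.
conn = sqlite3.connect('github.db')
rows = conn.execute(
    """
    SELECT number, title, type, closed_at
    FROM issues
    WHERE date(closed_at) = '2019-06-25'
      AND [user] = 35968931
    ORDER BY updated_at DESC
    """
).fetchall()
for number, title, type_, closed_at in rows:
    print(number, title, type_, closed_at)
```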