issues


1 row where repo = 13221727, state = "closed" and user = 3922329 sorted by updated_at descending

id: 546562676
node_id: MDU6SXNzdWU1NDY1NjI2NTY2NzY=
number: 3668
title: open_mfdataset: support for multiple zarr datasets
user: dmedv (3922329)
state: closed
locked: 0
comments: 14
created_at: 2020-01-07T23:29:41Z
updated_at: 2020-09-22T05:40:31Z
closed_at: 2020-09-22T05:40:31Z
author_association: NONE
assignee, milestone, active_lock_reason, draft, pull_request, performed_via_github_app: (empty)

body:

I am running calculations on a remote Dask cluster. Some of the data is only available on the workers, not on the client. It is already possible to have an xarray dataset that "points" to a remote NetCDF data collection by using the parallel=True option of xarray.open_mfdataset(), like this:

```python
from dask.distributed import Client
import xarray as xr

client = Client('<dask_scheduler_ip>:<port>')
ds = xr.open_mfdataset(remote_nc_file_paths, combine='by_coords', parallel=True)
```

open_mfdataset() will then use dask.delayed and, for example, the following simple mean calculation will be distributed across the workers, with the result returned to the client:

```python
ds['Temp'].mean().compute()
```

Unfortunately, I cannot do the same thing with zarr: open_mfdataset() does not support it, and open_zarr() has no option to use dask.delayed. Would it be possible to add dask.delayed support to the zarr backend? Or perhaps I am missing something and there is a better way to work with zarr data on a remote Dask cluster?
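For illustration only (using hypothetical store paths, and ignoring the fact that in my setup the stores are only reachable from the workers, not the client), this is roughly the manual, client-side combination I would like open_mfdataset() to handle for zarr:

```python
import xarray as xr

# Hypothetical zarr store paths, just for illustration; in my case they
# are only visible to the Dask workers, not to the client.
zarr_store_paths = ['/data/part1.zarr', '/data/part2.zarr']

# Open each store lazily (dask-backed) and combine along shared coordinates,
# roughly what open_mfdataset(..., combine='by_coords') does for NetCDF files.
datasets = [xr.open_zarr(path) for path in zarr_store_paths]
ds = xr.combine_by_coords(datasets)
```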

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS

commit: None
python: 3.6.7 |Anaconda custom (64-bit)| (default, Oct 23 2018, 19:16:44) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-862.2.3.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
libhdf5: 1.10.4
libnetcdf: 4.6.3

xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.3
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.8.0
Nio: None
zarr: 2.3.2
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.9.1
distributed: 2.9.1
matplotlib: 3.1.2
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 40.4.3
pip: 18.1
conda: 4.8.0
pytest: 3.8.2
IPython: 7.0.1
sphinx: 1.8.1
```

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3668/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: xarray (13221727)
type: issue

Table schema:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
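As a rough sketch (assuming the data lives in a local SQLite file, here hypothetically named github.db), the filter shown at the top of this page corresponds to a query along these lines:

```python
import sqlite3

# Hypothetical path to the SQLite database backing this page.
conn = sqlite3.connect("github.db")
conn.row_factory = sqlite3.Row

# Equivalent of: repo = 13221727, state = "closed", user = 3922329,
# sorted by updated_at descending.
rows = conn.execute(
    """
    SELECT id, number, title, state, updated_at
    FROM issues
    WHERE repo = ? AND state = ? AND user = ?
    ORDER BY updated_at DESC
    """,
    (13221727, "closed", 3922329),
).fetchall()

for row in rows:
    print(dict(row))
```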