issues

1 row where state = "closed" and user = 20627856 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
766826777 MDU6SXNzdWU3NjY4MjY3Nzc= 4691 Non-HTTPS remote URLs no longer work as input for open_zarr charlesbluca 20627856 closed 0     5 2020-12-14T19:00:19Z 2021-02-28T04:48:52Z 2021-02-28T04:48:52Z NONE      

What happened:

On 0.16.2 and later, passing a non-HTTPS remote URL path (e.g. gs://...) as input to open_zarr() results in a KeyError or GroupNotFoundError:

```python
import xarray as xr

xr.open_zarr("gs://cmip6/AerChemMIP/AS-RCEC/TaiESM1/histSST/r1i1p1f1/AERmon/od550aer/gn/", consolidated=True)
# KeyError: '.zmetadata'

xr.open_zarr("gs://cmip6/AerChemMIP/AS-RCEC/TaiESM1/histSST/r1i1p1f1/AERmon/od550aer/gn/", consolidated=False)
# GroupNotFoundError: group not found at path ''
```

What you expected to happen:

With versions 0.16.1 and earlier, passing a non-HTTPS remote URL path to open_zarr() as input would successfully open the remote store, provided that a package to handle the specific filesystem was available in the environment and the proper storage options were supplied.

Minimal Complete Verifiable Example:

Same as above, but with decode_times=False to circumvent a cftime dependency:

```python
import xarray as xr

xr.open_zarr(
    "gs://cmip6/AerChemMIP/AS-RCEC/TaiESM1/histSST/r1i1p1f1/AERmon/od550aer/gn/",
    consolidated=True,
    decode_times=False,
)
```

Anything else we need to know?:

From a brief debug of the code, it looks like this error is a result of open_zarr() now calling open_dataset(engine="zarr") to open the Zarr store.

In this function, the remote URL path is now passed through _normalize_path() where it is not recognized as a remote URL (this check is done by is_remote_uri() which only checks for HTTPS) and is instead interpreted as a relative path in the local filesystem, where it does not exist.
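
The failure mode described above can be reproduced without xarray. The sketch below is a simplified stand-in, not xarray's actual code: the helper names `is_remote_uri_http_only` and `normalize_path` and the regular expression are assumptions made for illustration. It shows how a check that only recognizes HTTP(S) URLs lets a gs:// URL fall through to local path expansion.

```python
import os
import re


def is_remote_uri_http_only(path: str) -> bool:
    # Hypothetical stand-in: only http:// and https:// count as remote.
    return bool(re.search(r"^https?://", path))


def normalize_path(path: str) -> str:
    # Hypothetical stand-in: recognized remote URLs pass through unchanged;
    # anything else is expanded to an absolute path on the local filesystem.
    if is_remote_uri_http_only(path):
        return path
    return os.path.abspath(os.path.expanduser(path))


url = "gs://cmip6/AerChemMIP/AS-RCEC/TaiESM1/histSST/r1i1p1f1/AERmon/od550aer/gn/"

# The gs:// URL is not recognized, so it is rewritten relative to the
# current working directory and no longer points at the remote store.
print(normalize_path(url))
```

The printed path depends on the working directory and the operating system, but in every case it is a local path rather than the original URL.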

I'm not sure whether this is meant to be expected behavior, as the documentation on reading datasets in the cloud does not show an example using a URL path as input and only suggests using a MutableMapping. However, this use case worked before 0.16.2 and no longer does.
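
For reference, the MutableMapping route mentioned above might look like the following sketch. It assumes fsspec and a matching filesystem package (gcsfs for gs:// URLs) are installed; the wrapper name `open_remote_zarr` is hypothetical, and whether this sidesteps the problem on 0.16.2 has not been verified here.

```python
def open_remote_zarr(url, **storage_options):
    """Open a remote Zarr store by handing xarray a mapping, not a URL string."""
    # Imports are deferred so that defining the function has no side effects.
    import fsspec
    import xarray as xr

    # fsspec.get_mapper returns a MutableMapping backed by the remote store;
    # storage_options are forwarded to the underlying filesystem.
    store = fsspec.get_mapper(url, **storage_options)
    return xr.open_zarr(store, consolidated=True, decode_times=False)


# Example call (requires network access; token="anon" is a gcsfs option and
# assumes the bucket allows anonymous reads):
# ds = open_remote_zarr(
#     "gs://cmip6/AerChemMIP/AS-RCEC/TaiESM1/histSST/r1i1p1f1/AERmon/od550aer/gn/",
#     token="anon",
# )
```

Because the store is passed as a mapping rather than a string, it should not go through the string path normalization described earlier.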

I think this could be resolved by expanding is_remote_uri() to check for other common remote URIs (e.g. gs:, s3:, etc.).
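
A minimal sketch of what such an expanded check could look like is shown below. This is an illustration of the idea, not the fix that xarray adopted: the function body and the regular expression are assumptions, and the pattern simply treats any `scheme://` prefix as remote.

```python
import re

# Assumed pattern: a URI scheme (letter, then letters/digits/+/./-) followed by "://".
_SCHEME_PATTERN = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.\-]*://")


def is_remote_uri(path: str) -> bool:
    # Hypothetical replacement check: true for gs://, s3://, https://, etc.
    return bool(_SCHEME_PATTERN.match(path))


for candidate in ["gs://bucket/store", "s3://bucket/store", "data/store.zarr"]:
    print(candidate, is_remote_uri(candidate))
```

Requiring the `://` separator keeps Windows drive paths such as `C:\data` from being mistaken for a scheme. One caveat of this sketch is that `file://` URLs would also be classified as remote, so a real fix might need to treat that scheme separately.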

Environment:

Output of xr.show_versions():

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.1 | packaged by conda-forge | (default, Dec 9 2020, 01:07:06) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: English_United States.1252
libhdf5: None
libnetcdf: None

xarray: 0.16.2
pandas: 1.1.5
numpy: 1.19.4
scipy: None
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.6.1
cftime: 1.3.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 51.0.0.post20201207
pip: 20.3.1
conda: None
pytest: None
IPython: 7.19.0
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4691/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);