issues

2 rows where state = "open" and user = 12237157 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1214290591 I_kwDOAMm_X85IYJqf 6510 Feature request: raise more informative error message for `xr.open_dataset(list_of_paths)` aaronspring 12237157 open 0     4 2022-04-25T10:22:25Z 2022-04-29T16:47:56Z   CONTRIBUTOR      

Is your feature request related to a problem?

I sometimes use `xr.open_dataset` instead of `xr.open_mfdataset` on multiple paths by mistake. I propose raising a more informative error message than the current one: ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'h5netcdf', 'scipy', 'cfgrib']. Consider explicitly selecting one of the installed engines via the ``engine`` parameter, or installing additional IO dependencies, see: https://docs.xarray.dev/en/stable/getting-started-guide/installing.html https://docs.xarray.dev/en/stable/user-guide/io.html.

```python
import xarray as xr

xr.__version__  # '2022.3.0'

ds = xr.tutorial.load_dataset("air_temperature")

# Split the dataset along time into two files
ds.isel(time=slice(None, 1500)).to_netcdf("file1.nc")
ds.isel(time=slice(1500, None)).to_netcdf("file2.nc")

xr.open_mfdataset(["file1.nc", "file2.nc"])  # works
xr.open_mfdataset("file?.nc")  # works

# I understand what I need to do here
xr.open_dataset("file?.nc")  # fails
# FileNotFoundError: No such file or directory: b'/dir/file?.nc'

# I don't here; I also first try to check whether one of these files is corrupt
xr.open_dataset(["file1.nc", "file2.nc"])  # fails
# ValueError: did not find a match in any of xarray's currently installed IO
# backends ['netcdf4', 'h5netcdf', 'scipy', 'cfgrib']. Consider explicitly
# selecting one of the installed engines via the engine parameter, or
# installing additional IO dependencies, see: links
```

Describe the solution you'd like

Direct the user towards the solution, e.g. "found path as list, please use open_mfdataset".
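
A minimal sketch of such a check, written as a standalone helper and not as xarray's actual code. The name `ensure_single_path` and the exact message wording are assumptions for illustration; where a check like this would live inside xarray is not settled in this issue.

```python
def ensure_single_path(filename_or_obj):
    """Hypothetical pre-check: reject a list/tuple of paths with a hint."""
    if isinstance(filename_or_obj, (list, tuple)):
        raise ValueError(
            "found path as list, please use open_mfdataset "
            f"(got {len(filename_or_obj)} paths)"
        )
    return filename_or_obj


# A single path passes through unchanged.
print(ensure_single_path("file1.nc"))

# A list of paths fails early with a message that names the fix.
try:
    ensure_single_path(["file1.nc", "file2.nc"])
except ValueError as err:
    print(err)
```

Failing before any backend is tried would keep the message about the actual mistake (the argument type) and not about missing IO engines.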

Describe alternatives you've considered

No response

Additional context

No response

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6510/reactions",
    "total_count": 6,
    "+1": 6,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue
1120583442 I_kwDOAMm_X85Cyr8S 6230 [PERFORMANCE]: `isin` on `CFTimeIndex`-backed `Coordinate` slow aaronspring 12237157 open 0     5 2022-02-01T12:04:02Z 2022-02-07T23:40:48Z   CONTRIBUTOR      

Is your feature request related to a problem?

I want to compute `coord1.isin(coord2)`, and it is quite slow when the coordinates are large and backed by a `CFTimeIndex` (object dtype).

```python
import xarray as xr
import numpy as np

n = 1000
coord1 = xr.cftime_range(start='2000', freq='MS', periods=n)
coord2 = xr.cftime_range(start='2000', freq='3MS', periods=n)

# cftimeindex: very fast
%timeit coord1.isin(coord2)  # 743 µs ± 1.33 µs

# np.isin on index.asi8
%timeit np.isin(coord1.asi8, coord2.asi8)  # 7.83 ms ± 14.1 µs

da = xr.DataArray(
    np.random.random((n, n)),
    dims=['a', 'b'],
    coords={'a': coord1, 'b': coord2},
)

# when xr.DataArray coordinate: slow
%timeit da.a.isin(da.b)  # 94.9 ms ± 959 µs

# when converting xr.DataArray coordinate back to index: slow
%timeit np.isin(da.a.to_index(), da.b.to_index())  # 97.4 ms ± 819 µs

# when converting xr.DataArray coordinate back to index asi8
%timeit np.isin(da.a.to_index().asi8, da.b.to_index().asi8)  # 7.89 ms ± 15.2 µs
```

Describe the solution you'd like

Faster `coord1.isin(coord2)` by default. Could we re-route here, e.g. to the `asi8`-based alternative?

Conversion from coordinate via `to_index()` is costly, I guess.

Describe alternatives you've considered

`np.isin(coord1.to_index().asi8, coord2.to_index().asi8)` brings me nice speedups in https://github.com/pangeo-data/climpred/pull/724
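
To illustrate the idea behind the `asi8` route, here is a small standard-library sketch. It uses `datetime.datetime` as a stand-in for cftime objects, and the helper names `to_int_microseconds` and `isin_via_ints` are hypothetical. It assumes an encoding of whole microseconds since 1970-01-01; with NumPy, the last step would be `np.isin` on the two integer arrays, as in the snippet above.

```python
import datetime as dt

EPOCH = dt.datetime(1970, 1, 1)  # assumed reference point for the encoding


def to_int_microseconds(times):
    """Encode each datetime as whole microseconds since EPOCH."""
    one_us = dt.timedelta(microseconds=1)
    return [(t - EPOCH) // one_us for t in times]


def isin_via_ints(times1, times2):
    """Membership of times1 in times2, decided on the integer encodings."""
    lookup = set(to_int_microseconds(times2))
    return [value in lookup for value in to_int_microseconds(times1)]


# Month-start dates for one year, plus every third month of that year,
# loosely mirroring freq='MS' and freq='3MS'.
monthly = [dt.datetime(2000, month, 1) for month in range(1, 13)]
quarterly = [dt.datetime(2000, month, 1) for month in (1, 4, 7, 10)]

mask = isin_via_ints(monthly, quarterly)
print(mask)
```

Once both sides are plain integers, the membership test no longer has to compare date objects element by element, which is consistent with the timings reported above.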

Additional context

Unsure whether this issue should go here or in cftime.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6230/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 issue


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
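
The filter shown at the top of this page (state = "open", user = 12237157, sorted by updated_at descending) can be reproduced against this schema. The sketch below uses Python's built-in `sqlite3` with a trimmed copy of the table (only five of the columns, no foreign keys) and the two rows listed above; the third, closed row is hypothetical and is there only to show the filter excluding something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed copy of the [issues] table: only the columns the query touches.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT
    )
    """
)

rows = [
    (1214290591, 6510, 12237157, "open", "2022-04-29T16:47:56Z"),
    (1120583442, 6230, 12237157, "open", "2022-02-07T23:40:48Z"),
    # Hypothetical closed issue, not part of the real data.
    (1, 1, 12237157, "closed", "2022-05-01T00:00:00Z"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?)", rows)

query = """
    SELECT [number] FROM [issues]
    WHERE [state] = 'open' AND [user] = 12237157
    ORDER BY [updated_at] DESC
"""
open_numbers = [number for (number,) in conn.execute(query)]
print(open_numbers)  # [6510, 6230]
```

Because `updated_at` is stored as ISO 8601 text in a single time zone, ordering the strings also orders the timestamps.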