issue_comments

6 rows where issue = 1120276279 sorted by updated_at descending

user 4

  • spencerkclark 2
  • antarcticrainforest 2
  • aidanheerdegen 1
  • mathause 1

author_association 2

  • CONTRIBUTOR 3
  • MEMBER 3

issue 1

  • open_mfdataset fails with cftime index when using parallel and dask delayed client · 6 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1026812348 https://github.com/pydata/xarray/issues/6226#issuecomment-1026812348 https://api.github.com/repos/pydata/xarray/issues/6226 IC_kwDOAMm_X849M-m8 spencerkclark 6628425 2022-02-01T12:55:34Z 2022-02-01T12:55:34Z MEMBER

Awesome, thanks @antarcticrainforest -- yes, it will always have an object dtype.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset fails with cftime index when using parallel and dask delayed client 1120276279
1026754743 https://github.com/pydata/xarray/issues/6226#issuecomment-1026754743 https://api.github.com/repos/pydata/xarray/issues/6226 IC_kwDOAMm_X849Mwi3 antarcticrainforest 10580038 2022-02-01T11:43:21Z 2022-02-01T11:44:32Z CONTRIBUTOR

Are we expecting the CFTimeIndex object to always have "O" as dtype? If so, the solution would be straightforward, which means I can create a PR.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset fails with cftime index when using parallel and dask delayed client 1120276279
1026701337 https://github.com/pydata/xarray/issues/6226#issuecomment-1026701337 https://api.github.com/repos/pydata/xarray/issues/6226 IC_kwDOAMm_X849MjgZ spencerkclark 6628425 2022-02-01T10:39:28Z 2022-02-01T10:39:28Z MEMBER

Thanks @antarcticrainforest -- I think that's exactly what @mathause is getting at. It seems fairly safe to add a new keyword argument to CFTimeIndex.__new__, and the example @mathause uses would make a nice test. Would either of you be up to make a PR?
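A regression test along those lines might look roughly like the sketch below. The helper name `check_pickle_roundtrip` and the test name are made up for illustration, and the comparison by element list is an assumption about what "survives the round trip" should mean; the real test in a PR could well look different.

```python
import pickle


def check_pickle_roundtrip(index):
    """Return True if `index` comes back from pickle with the same type and values."""
    restored = pickle.loads(pickle.dumps(index))
    return type(restored) is type(index) and list(restored) == list(index)


def test_cftimeindex_pickle_roundtrip():
    # Imported lazily so the helper above stays usable without pytest;
    # the test is skipped when xarray or cftime is not installed.
    import pytest

    xr = pytest.importorskip("xarray")
    pytest.importorskip("cftime")

    # Same index as in the smaller repro from this thread.
    index = xr.cftime_range("20010101", "20010520")
    assert check_pickle_roundtrip(index)
```

With the affected version combination reported in this thread, the test would be expected to fail inside `pickle.loads` rather than at the final assertion.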

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset fails with cftime index when using parallel and dask delayed client 1120276279
1026656699 https://github.com/pydata/xarray/issues/6226#issuecomment-1026656699 https://api.github.com/repos/pydata/xarray/issues/6226 IC_kwDOAMm_X849MYm7 antarcticrainforest 10580038 2022-02-01T09:52:39Z 2022-02-01T09:52:39Z CONTRIBUTOR

I just ran into the very same issue. Are you sure that this is a problem with pandas? I've had a look into the pandas changes between 1.3.X and 1.4.X. Apparently the _new_Index method, which gets involved when serialising the index object, has been changed:

```python
elif "dtype" not in d and "data" in d:
    # Prevent Index.__new__ from conducting inference;
    # "data" key not in RangeIndex
    d["dtype"] = d["data"].dtype
return cls.__new__(cls, **d)
```

The problem is that __new__ doesn't accept a dtype argument. I've tried adding a dtype argument and it works. So I guess, since this class inherits from pd.Index, it needs to be updated?
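To make the mechanism concrete, here is a minimal pure-Python sketch of the failure and the fix. It does not use pandas or xarray: `ToyIndex`, `IndexWithoutDtype`, `IndexWithDtype`, `_new_index` and `rebuild` are all hypothetical stand-ins, and the toy helper simply injects the string "object" as dtype, which is an assumption made for illustration.

```python
def _new_index(cls, d):
    # Toy stand-in for the reconstruction helper quoted above: it adds a
    # "dtype" entry when one is missing, then forwards everything to __new__.
    if "dtype" not in d and "data" in d:
        d["dtype"] = "object"
    return cls.__new__(cls, **d)


class ToyIndex:
    """Hypothetical base class standing in for pd.Index."""

    def __new__(cls, data, name=None, dtype=None):
        obj = object.__new__(cls)
        obj.data = list(data)
        obj.name = name
        return obj

    def __reduce__(self):
        # pickle stores (callable, args) and calls callable(*args) on load.
        return _new_index, (type(self), {"data": self.data, "name": self.name})


class IndexWithoutDtype(ToyIndex):
    # Mirrors the reported situation: __new__ has no dtype parameter.
    def __new__(cls, data, name=None):
        return super().__new__(cls, data, name=name)


class IndexWithDtype(ToyIndex):
    # Sketch of the fix: accept dtype for compatibility and otherwise ignore it.
    def __new__(cls, data, name=None, dtype=None):
        return super().__new__(cls, data, name=name)


def rebuild(index):
    """Replay the reconstruction step that unpickling would perform."""
    func, args = index.__reduce__()
    return func(*args)
```

Calling `rebuild(IndexWithoutDtype(["2001-01-01"]))` raises a `TypeError` about the unexpected `dtype` keyword, while `rebuild(IndexWithDtype(["2001-01-01"]))` returns a new instance with the same data.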

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset fails with cftime index when using parallel and dask delayed client 1120276279
1026641184 https://github.com/pydata/xarray/issues/6226#issuecomment-1026641184 https://api.github.com/repos/pydata/xarray/issues/6226 IC_kwDOAMm_X849MU0g mathause 10194086 2022-02-01T09:34:48Z 2022-02-01T09:34:48Z MEMBER

Smaller repro

```python
import xarray as xr
import pickle

t = xr.cftime_range("20010101", "20010520")
pickle.loads(pickle.dumps(t))
```

Looks like pandas now passes dtype on to __new__, which CFTimeIndex.__new__ does not accept:

https://github.com/pydata/xarray/blob/fe491b14b113c185b5b9a18e4f643e5a73208629/xarray/coding/cftimeindex.py#L313

Might be pandas-dev/pandas#43188. So CFTimeIndex.__new__ might need to accept dtype? @spencerkclark

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset fails with cftime index when using parallel and dask delayed client 1120276279
1026580368 https://github.com/pydata/xarray/issues/6226#issuecomment-1026580368 https://api.github.com/repos/pydata/xarray/issues/6226 IC_kwDOAMm_X849MF-Q aidanheerdegen 6063709 2022-02-01T08:19:46Z 2022-02-01T08:31:17Z CONTRIBUTOR

Update: It is pandas that is the critical package. Pinning distributed<2022.01.0, xarray<0.21.0 and cftime<1.5.2 didn't fix it, but adding pandas<1.4.0 makes the above test pass. Will now try unpinning other packages and confirm it is pandas that is the issue.

Edit: Confirmed it is pandas==1.4.0 that causes this issue. The following version combination does not produce this error:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.10 | packaged by conda-forge | (main, Jan 30 2022, 18:04:04) [GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-348.2.1.el8.nci.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_AU.utf8
LANG: en_AU.ISO8859-1
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.21.0
pandas: 1.3.5
numpy: 1.22.1
scipy: 1.7.3
netCDF4: 1.5.6
pydap: installed
h5netcdf: 0.13.1
h5py: 3.6.0
Nio: None
zarr: 2.10.3
cftime: 1.5.2
nc_time_axis: 1.4.0
PseudoNetCDF: None
rasterio: 1.2.6
cfgrib: 0.9.10.0
iris: 3.1.0
bottleneck: 1.3.2
dask: 2022.01.1
distributed: 2022.01.1
matplotlib: 3.5.1
cartopy: 0.19.0.post1
seaborn: 0.11.2
numbagg: None
fsspec: 2022.01.0
cupy: 10.1.0
pint: 0.18
sparse: 0.13.0
setuptools: 59.8.0
pip: 21.3.1
conda: 4.11.0
pytest: 6.2.5
IPython: 8.0.1
sphinx: 4.4.0
```

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset fails with cftime index when using parallel and dask delayed client 1120276279

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
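
For anyone who wants to reproduce the view above locally, the following is a rough sketch using Python's built-in sqlite3 module. It assumes a trimmed copy of the schema (foreign-key references and the text columns are dropped to keep it self-contained), loads three of the six rows listed on this page, and runs the same filter and sort; the function name `comments_for_issue` is hypothetical.

```python
import sqlite3

# Trimmed version of the issue_comments schema (an assumption for brevity).
SCHEMA = """
CREATE TABLE [issue_comments] (
   [id] INTEGER PRIMARY KEY,
   [user] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [issue] INTEGER
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
"""

# Three of the six comment rows shown on this page.
ROWS = [
    (1026580368, 6063709, "2022-02-01T08:19:46Z", "2022-02-01T08:31:17Z", "CONTRIBUTOR", 1120276279),
    (1026641184, 10194086, "2022-02-01T09:34:48Z", "2022-02-01T09:34:48Z", "MEMBER", 1120276279),
    (1026812348, 6628425, "2022-02-01T12:55:34Z", "2022-02-01T12:55:34Z", "MEMBER", 1120276279),
]


def comments_for_issue(conn, issue_id):
    """Return (id, updated_at) pairs for one issue, newest update first."""
    query = (
        "SELECT id, updated_at FROM issue_comments "
        "WHERE issue = ? ORDER BY updated_at DESC"
    )
    return conn.execute(query, (issue_id,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO issue_comments VALUES (?, ?, ?, ?, ?, ?)", ROWS)
result = comments_for_issue(conn, 1120276279)
```

Because the timestamps are ISO 8601 strings in the same time zone, sorting them as text gives chronological order.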