issue_comments

8 rows where issue = 963688125 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
895624119 https://github.com/pydata/xarray/issues/5686#issuecomment-895624119 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841YiO3 aidanheerdegen 6063709 2021-08-09T23:44:10Z 2021-08-09T23:44:10Z CONTRIBUTOR

Thanks for the super fast fix. I have confirmed this fixes #5677

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
895596762 https://github.com/pydata/xarray/issues/5686#issuecomment-895596762 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841Ybja spencerkclark 6628425 2021-08-09T22:36:53Z 2021-08-09T22:36:53Z MEMBER

@aidanheerdegen I tested your example with my upstream changes (Unidata/cftime#252) and it now works, so this should be fixed with the next release of cftime. I'll go ahead and close this issue. Thanks for your help debugging this!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
895162608 https://github.com/pydata/xarray/issues/5686#issuecomment-895162608 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841Wxjw spencerkclark 6628425 2021-08-09T11:57:39Z 2021-08-09T11:57:39Z MEMBER

> A colleague suggested it might be some sort of pickling issue, passing the generated object back to the main thread, but it was just speculation and I had no idea how to test that.

Yes, it must be something of that sort. Here's perhaps an even more minimal example:

```
>>> import cftime; import distributed
>>> header, frames = distributed.protocol.serialize(cftime.DatetimeNoLeap(2000, 1, 1))
>>> distributed.protocol.deserialize(header, frames)
cftime.datetime(2000, 1, 1, 0, 0, 0, 0, calendar='noleap', has_year_zero=True)
```

Further, removing `distributed` from the mix, we can show this just using `pickle`:

```
>>> import pickle
>>> serialized = pickle.dumps(cftime.DatetimeNoLeap(2000, 1, 1))
>>> deserialized = pickle.loads(serialized)
>>> deserialized
cftime.datetime(2000, 1, 1, 0, 0, 0, 0, calendar='noleap', has_year_zero=True)
```

I'll make an issue in cftime.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
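
The pickle round trip shown in the comment above can be imitated with the standard library alone. The sketch below is a hypothetical illustration of one way a subclass can be lost during unpickling; the classes `Datetime` and `DatetimeNoLeap` are stand-ins defined here, and the `__reduce__` shown is an assumption made for illustration, not cftime's actual implementation.

```python
import pickle


class Datetime:
    """Stand-in base class that stores its calendar as an attribute."""

    def __init__(self, year, month, day, calendar=""):
        self.year, self.month, self.day = year, month, day
        self.calendar = calendar

    def __reduce__(self):
        # Hypothetical bug: reconstruction always names the base class,
        # so instances of subclasses are not restored as themselves.
        return (Datetime, (self.year, self.month, self.day, self.calendar))


class DatetimeNoLeap(Datetime):
    """Stand-in calendar-specific subclass."""

    def __init__(self, year, month, day):
        super().__init__(year, month, day, calendar="noleap")


original = DatetimeNoLeap(2000, 1, 1)
restored = pickle.loads(pickle.dumps(original))

print(type(original).__name__)  # DatetimeNoLeap
print(type(restored).__name__)  # Datetime
print(restored.calendar)        # noleap
```

The calendar attribute survives the round trip while the subclass does not, which matches the shape of the output reported in the thread. Returning `type(self)` instead of the hard-coded base class in `__reduce__` would be one way to preserve the subclass in this toy version.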
895123758 https://github.com/pydata/xarray/issues/5686#issuecomment-895123758 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841WoEu aidanheerdegen 6063709 2021-08-09T10:46:57Z 2021-08-09T10:46:57Z CONTRIBUTOR

A colleague suggested it might be some sort of pickling issue, passing the generated object back to the main thread, but it was just speculation and I had no idea how to test that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
895122645 https://github.com/pydata/xarray/issues/5686#issuecomment-895122645 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841WnzV aidanheerdegen 6063709 2021-08-09T10:44:42Z 2021-08-09T10:44:42Z CONTRIBUTOR

Thanks again for the prompt response @spencerkclark. Yes your MCVE is more (less?) M than mine. Thanks.

Perhaps I shouldn't have started a new issue, but it seemed the specific problem with .sel was just a knock on effect from this cftime issue.

I should have said in #5677 that as far as I could tell I was using cftime=1.5.0.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
895120951 https://github.com/pydata/xarray/issues/5686#issuecomment-895120951 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841WnY3 aidanheerdegen 6063709 2021-08-09T10:41:18Z 2021-08-09T10:41:18Z CONTRIBUTOR

> Thanks for the updated report! Could you kindly share the full error traceback?

Sorry, see below:

Traceback (most recent call last):
  File "/g/data/v45/aph502/helpdesk/fromgithub/20210804-Navid/mcve.py", line 28, in <module>
    assert (index_microseconds == xr.open_mfdataset('2???.nc', parallel=True).xindexes['time'].array.asi8).all()
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/coding/cftimeindex.py", line 683, in asi8
    [
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/coding/cftimeindex.py", line 684, in <listcomp>
    _total_microseconds(exact_cftime_datetime_difference(epoch, date))
  File "/g/data3/hh5/public/apps/miniconda3/envs/analysis3-21.07/lib/python3.9/site-packages/xarray/core/resample_cftime.py", line 358, in exact_cftime_datetime_difference
    seconds = b.replace(microsecond=0) - a.replace(microsecond=0)
  File "src/cftime/_cftime.pyx", line 1369, in cftime._cftime.datetime.__sub__
TypeError: cannot compute the time difference between dates with different calendars

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
895074385 https://github.com/pydata/xarray/issues/5686#issuecomment-895074385 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841WcBR spencerkclark 6628425 2021-08-09T09:24:30Z 2021-08-09T09:24:30Z MEMBER

@aidanheerdegen thanks a lot for the minimal example -- I'm able to reproduce it now -- this is indeed a confusing bug! I think this is a symptom of the same problem I identified in #5677: the dates in your dataset are being decoded to cftime.datetime objects instead of cftime.DatetimeNoLeap objects. This causes problems downstream in a variety of places in xarray, because xarray currently infers the date and calendar type of the index by checking the type of the dates it contains.

The question is: why is this happening? I initially thought this could only happen if you were using cftime version 1.4.0, but that is clearly not true. It is interesting that this only comes up when using a distributed client and parallel=True in open_mfdataset, while with other more basic approaches the dates are decoded properly to cftime.DatetimeNoLeap objects. I think a more minimal example of this issue may be the following:

```
>>> import cftime; import dask; import distributed
>>> cftime.__version__
'1.5.0'
>>> dask.__version__
'2021.07.2'
>>> distributed.__version__
'2021.07.2'
>>> client = distributed.Client()
>>> cftime.num2date([0, 1, 2], units="days since 2000-01-01", calendar="noleap")
array([cftime.DatetimeNoLeap(2000, 1, 1, 0, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2000, 1, 2, 0, 0, 0, 0, has_year_zero=True),
       cftime.DatetimeNoLeap(2000, 1, 3, 0, 0, 0, 0, has_year_zero=True)],
      dtype=object)
>>> delayed_num2date = dask.delayed(cftime.num2date)
>>> delayed_num2date([0, 1, 2], units="days since 2000-01-01", calendar="noleap").compute()
array([cftime.datetime(2000, 1, 1, 0, 0, 0, 0, calendar='noleap', has_year_zero=True),
       cftime.datetime(2000, 1, 2, 0, 0, 0, 0, calendar='noleap', has_year_zero=True),
       cftime.datetime(2000, 1, 3, 0, 0, 0, 0, calendar='noleap', has_year_zero=True)],
      dtype=object)
```

Note that when using `delayed` in conjunction with a `distributed.Client` the dates are decoded to `cftime.datetime` objects instead of `cftime.DatetimeNoLeap` objects. This fairly clearly demonstrates that this is an upstream issue -- likely in cftime -- but I need to dig a little more to determine exactly what the issue is.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125
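
The comment above says xarray infers the calendar of an index from the type of the dates it holds. A minimal sketch of why that breaks when a date arrives as the base class follows. Everything in it is hypothetical: the stand-in classes, the `CALENDAR_BY_TYPE` table, and both helper functions are invented for illustration and are not xarray's or cftime's code. The attribute-based variant is shown only for contrast; the thread itself resolved the problem upstream in cftime.

```python
class Datetime:
    """Stand-in base class that carries its calendar as an attribute."""

    def __init__(self, year, month, day, calendar=""):
        self.year, self.month, self.day = year, month, day
        self.calendar = calendar


class DatetimeNoLeap(Datetime):
    """Stand-in calendar-specific subclass."""

    def __init__(self, year, month, day):
        super().__init__(year, month, day, calendar="noleap")


# Hypothetical lookup table: calendar name keyed by the exact date type.
CALENDAR_BY_TYPE = {DatetimeNoLeap: "noleap"}


def infer_calendar_by_type(dates):
    """Infer the calendar from the exact type of the first date."""
    sample_type = type(dates[0])
    if sample_type not in CALENDAR_BY_TYPE:
        raise TypeError(f"cannot infer calendar from {sample_type.__name__}")
    return CALENDAR_BY_TYPE[sample_type]


def infer_calendar_by_attribute(dates):
    """Infer the calendar from the attribute carried by the first date."""
    return dates[0].calendar


subclass_dates = [DatetimeNoLeap(2000, 1, day) for day in (1, 2, 3)]
base_dates = [Datetime(2000, 1, day, calendar="noleap") for day in (1, 2, 3)]

print(infer_calendar_by_type(subclass_dates))   # noleap
print(infer_calendar_by_attribute(base_dates))  # noleap

try:
    infer_calendar_by_type(base_dates)
except TypeError as error:
    print(error)  # cannot infer calendar from Datetime
```

Both lists describe the same dates in the same calendar, yet only the subclass instances pass the type-based check, which is the kind of downstream failure the comment describes.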
895018706 https://github.com/pydata/xarray/issues/5686#issuecomment-895018706 https://api.github.com/repos/pydata/xarray/issues/5686 IC_kwDOAMm_X841WObS shoyer 1217238 2021-08-09T07:46:39Z 2021-08-09T07:46:39Z MEMBER

Thanks for the updated report! Could you kindly share the full error traceback?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xindexes set incorrectly for mfdataset with dask client and parallel=True 963688125

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
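
As a rough sketch of how the view on this page ("rows where issue = 963688125 sorted by updated_at descending") maps onto the schema above, the following uses Python's built-in `sqlite3` module with an in-memory database. The table is reduced to four of the listed columns and the foreign-key references are omitted for brevity. Three rows reuse ids and timestamps from the comments above; the fourth row (id 1, issue 1) is made up to show the filter working.

```python
import sqlite3

connection = sqlite3.connect(":memory:")

# Reduced version of the issue_comments schema: only the columns needed here.
connection.executescript(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [issue] INTEGER
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    """
)

rows = [
    (895018706, 1217238, "2021-08-09T07:46:39Z", 963688125),
    (895624119, 6063709, "2021-08-09T23:44:10Z", 963688125),
    (895596762, 6628425, "2021-08-09T22:36:53Z", 963688125),
    (1, 1, "2021-01-01T00:00:00Z", 1),  # made-up row for another issue
]
connection.executemany(
    "INSERT INTO issue_comments (id, user, updated_at, issue) VALUES (?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps in one timezone sort correctly as plain text.
query = """
    SELECT id, updated_at
    FROM issue_comments
    WHERE issue = ?
    ORDER BY updated_at DESC
"""
ids = [row[0] for row in connection.execute(query, (963688125,))]
print(ids)  # [895624119, 895596762, 895018706]
```

The index on `issue` is the one the schema already defines; it lets SQLite locate the rows for a single issue without scanning the whole table.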