
issue_comments


12 rows where author_association = "MEMBER" and issue = 99836561 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
474898722 https://github.com/pydata/xarray/issues/521#issuecomment-474898722 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDQ3NDg5ODcyMg== rabernat 1197350 2019-03-20T15:55:15Z 2019-03-20T15:55:15Z MEMBER

@klindsay28 -- thanks for the clarification. You're clearly right about 2, and I was misinformed. The problem is that 3 makes it impossible to follow the CF convention rules to overcome 2 (which xarray would try to do).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
474840796 https://github.com/pydata/xarray/issues/521#issuecomment-474840796 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDQ3NDg0MDc5Ng== rabernat 1197350 2019-03-20T13:58:09Z 2019-03-20T13:58:46Z MEMBER

It's important to be clear that the issues 2 and 3 that @spencerkclark pointed out are objectively errors in the metadata. We have worked very hard over many years to enable xarray to correctly parse CF-compliant dates with non-standard calendars. But xarray cannot and should not be expected to magically fix metadata that is inconsistent or incomplete.

You really need to bring these issues to the attention of whoever generated some_CESM_output_file.nc.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
474822193 https://github.com/pydata/xarray/issues/521#issuecomment-474822193 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDQ3NDgyMjE5Mw== spencerkclark 6628425 2019-03-20T13:12:14Z 2019-03-20T13:12:14Z MEMBER

Now, this may still not work depending on the values in the 'time_bound' variable (i.e. if any are less than 365.), because cftime currently does not support year zero in date objects (even for non-real-world calendars). I think one could make the argument that this is inconsistent with allowing reference dates with year zero for those date types, so it would probably be worth opening an issue there to try and get that fixed upstream.

I opened an issue in cftime regarding this: https://github.com/Unidata/cftime/issues/114.
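As context, a quick probe of whether the locally installed cftime build accepts year zero for a non-real-world calendar; this behaviour has changed across cftime releases, so treat it as a check of your own environment rather than a statement about any particular version:

    import cftime

    # Try to construct a noleap (365_day) date in year zero.
    try:
        d = cftime.DatetimeNoLeap(0, 1, 1)
        print("year zero accepted:", d)
    except ValueError as err:
        print("year zero rejected:", err)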

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
474813983 https://github.com/pydata/xarray/issues/521#issuecomment-474813983 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDQ3NDgxMzk4Mw== spencerkclark 6628425 2019-03-20T12:47:39Z 2019-03-20T12:47:39Z MEMBER

Great, that's helpful, thanks. I see what's happening now. There are a lot of tricky things going on, so bear with me.

Let's examine the output from ds.info() related to the time bounds and time variables:

    float64 time_bound(time, d2) ;
        time_bound:long_name = boundaries for time-averaging interval ;
        time_bound:units = days since 0000-01-01 00:00:00 ;
    float64 time(time) ;
        time:long_name = time ;
        time:units = days since 0-1-1 00:00:00 ;
        time:bounds = time_bnds ;
        time:calendar = 365_day ;
        time:standard_name = time ;
        time:axis = T ;

There are a few important things to note:

1. In both the 'time_bound' and 'time' variables, the units attribute contains a reference date with year zero.
2. 'time' has a calendar attribute of '365_day', while no calendar attribute is specified for 'time_bound'.
3. 'time' has a 'bounds' attribute that points to a variable named 'time_bnds' instead of 'time_bound'.

For non-real-world calendars (e.g. 365_day), reference dates in cftime should allow year zero. This was fixed upstream in https://github.com/Unidata/netcdf4-python/pull/470. That being said, because of (2), the calendar for 'time_bound' is assumed to be a standard calendar; therefore you get this ValueError when decoding the times:

    ValueError: zero not allowed as a reference year, does not exist in Julian or Gregorian calendars

Ultimately though, with https://github.com/pydata/xarray/pull/2571, we try to propagate the time-related attributes from the time coordinate to the associated bounds coordinate (so in normal circumstances we would use a 365_day calendar in this case as well). But, because of (3), this is not possible: the 'bounds' attribute on the 'time' variable points to a variable name that does not exist.

In theory, another possible way to work around this would be to open the dataset with decode_times=False, add the appropriate calendar attribute to 'time_bound', and then decode the times:

    ds = xr.open_dataset('some_CESM_output_file.nc', decode_times=False)
    ds.time_bound.attrs['calendar'] = ds.time.attrs['calendar']
    ds = xr.decode_cf(ds, use_cftime=True)

Now, this may still not work depending on the values in the 'time_bound' variable (i.e. if any are less than 365.), because cftime currently does not support year zero in date objects (even for non-real-world calendars). I think one could make the argument that this is inconsistent with allowing reference dates with year zero for those date types, so it would probably be worth opening an issue there to try and get that fixed upstream.

In conclusion, I'm afraid there is nothing we can do in xarray to automatically fix this situation. Issue (3) in the netCDF file is particularly unfortunate. If it weren't for that, I think all of these issues would be possible to work around, e.g. with https://github.com/pydata/xarray/pull/2571 here, or with fixes upstream.
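A minimal sketch of what patching issues (2) and (3) by hand could look like before decoding; it assumes the variable names shown in the ds.info() output above ('time' and 'time_bound') and the placeholder filename used in this thread, and it can still fail if any encoded value falls in year zero:

    import xarray as xr

    # Open without decoding so the metadata can be repaired first.
    ds = xr.open_dataset('some_CESM_output_file.nc', decode_times=False)

    # (2): give the bounds variable the same calendar as the time coordinate.
    ds.time_bound.attrs['calendar'] = ds.time.attrs['calendar']

    # (3): point the 'bounds' attribute at the variable that actually exists.
    ds.time.attrs['bounds'] = 'time_bound'

    # Decode to cftime date objects.
    ds = xr.decode_cf(ds, use_cftime=True)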

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
474601105 https://github.com/pydata/xarray/issues/521#issuecomment-474601105 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDQ3NDYwMTEwNQ== spencerkclark 6628425 2019-03-19T21:59:08Z 2019-03-19T21:59:08Z MEMBER

Thanks -- in looking at the metadata it seems there is nothing unusual about the 'd2' dimension (in normal circumstances we should be able to decode N-D variables to dates, regardless of their type).

My feeling is that the issue here remains the fact that cftime dates do not support year zero (see the upstream issue @rabernat mentioned earlier: Unidata/netcdf4-python#442). That said, it's surprising that dropping the 'time_bounds' variable seems to be a workaround for this issue, because the 'time' variable (which remains in the dataset) still has units with a reference date of year zero.

If you don't mind, could you provide me with two more things?

  • What the time coordinate looks like after your workaround in https://github.com/pydata/xarray/issues/521#issuecomment-474580481
  • What the traceback looks like if you try to open the file normally, e.g. ds = xr.open_dataset('some_CESM_output_file.nc')?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
474583572 https://github.com/pydata/xarray/issues/521#issuecomment-474583572 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDQ3NDU4MzU3Mg== spencerkclark 6628425 2019-03-19T21:03:37Z 2019-03-19T21:03:37Z MEMBER

Could you provide the output of ncdump -h or ds.info() on an example file?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
129198927 https://github.com/pydata/xarray/issues/521#issuecomment-129198927 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDEyOTE5ODkyNw== jhamman 2443309 2015-08-09T15:29:52Z 2015-08-09T15:29:52Z MEMBER

Perhaps the long term fix is to implement non-standard calendars within numpy itself.

I agree, although that sounds like quite an undertaking. Maybe raise an issue over at numpy and ask if they would be interested in a multi-calendar api? If numpy could make it work, then I'm sure pandas could as well.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
129189753 https://github.com/pydata/xarray/issues/521#issuecomment-129189753 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDEyOTE4OTc1Mw== rabernat 1197350 2015-08-09T14:00:37Z 2015-08-09T14:00:37Z MEMBER

@jhamman Thanks for the clear explanation! One of the main uses for non-standard calendars would be climate model "control runs", which don't occur at any specific point in historical time but still have seasonal cycles, well-defined months, etc. It would be nice to have "group by" functionality for these datasets. But I do see how this is impossible with the current numpy datetime64 datatype. Perhaps the long term fix is to implement non-standard calendars within numpy itself.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
129102650 https://github.com/pydata/xarray/issues/521#issuecomment-129102650 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDEyOTEwMjY1MA== jhamman 2443309 2015-08-09T04:03:34Z 2015-08-09T04:03:34Z MEMBER

We try to cast all the time variables to a pandas time index. This gives xray the ability to use many of the fast and fancy timeseries tools that pandas has. One consequence of that is that non-standard calendars, such as the "noleap" calendar, must have dates inside the valid range of the standard calendars (1678 and 2226).

Does that make sense? Ideally, numpy and pandas would support custom calendars, but they don't, so at this point we're bound to their limits.
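For reference, that window comes from pandas' nanosecond-resolution Timestamp, and you can check the exact bounds of your installed pandas directly (a small illustration, not part of the original comment):

    import pandas as pd

    # Nanosecond-precision timestamps cover roughly 1677-09-21 through
    # 2262-04-11; dates outside this window cannot sit on a DatetimeIndex.
    print(pd.Timestamp.min)
    print(pd.Timestamp.max)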

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
129069915 https://github.com/pydata/xarray/issues/521#issuecomment-129069915 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDEyOTA2OTkxNQ== rabernat 1197350 2015-08-08T23:30:38Z 2015-08-08T23:30:38Z MEMBER

The PR above fixes this issue. However, since my model years are in the range 100-200, I am still getting the warning

RuntimeWarning: Unable to decode time axis into full numpy.datetime64 objects, continuing using dummy netCDF4.datetime objects instead, reason: dates out of range

and eventually when I try to access the time data, an error with a very long stack trace ending with

    pandas/tslib.pyx in pandas.tslib.Timestamp.__new__ (pandas/tslib.c:7638)()
    pandas/tslib.pyx in pandas.tslib.convert_to_tsobject (pandas/tslib.c:21232)()
    pandas/tslib.pyx in pandas.tslib._check_dts_bounds (pandas/tslib.c:23332)()
    OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 100-02-01 00:00:00

I see there is a check in conventions.py that the year has to lie between 1678 and 2226. What is the reason for this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
129059364 https://github.com/pydata/xarray/issues/521#issuecomment-129059364 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDEyOTA1OTM2NA== jhamman 2443309 2015-08-08T22:43:05Z 2015-08-08T22:43:05Z MEMBER

@rabernat -

Yes - this is all coming from the netCDF4.netcdftime module.

The workaround with xray is to use ds = xray.open_dataset(filename, decode_times=False) and then fix up the time variable "manually". You can use xray.decode_cf() or simply assign a new pandas time index to your time variable.

As an aside, I also work with CESM output and this is a common problem with its netCDF output.
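A minimal sketch of that workaround, assuming monthly output, a placeholder filename, and a start date chosen to land inside the pandas-representable range (the library is imported as xarray in current versions rather than xray):

    import pandas as pd
    import xarray as xr

    # Skip time decoding entirely, then overwrite the time coordinate
    # with a pandas index covering a representable span of years.
    ds = xr.open_dataset('some_CESM_output_file.nc', decode_times=False)
    ds['time'] = pd.date_range('2000-01-01', periods=ds.sizes['time'], freq='MS')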

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561
129054598 https://github.com/pydata/xarray/issues/521#issuecomment-129054598 https://api.github.com/repos/pydata/xarray/issues/521 MDEyOklzc3VlQ29tbWVudDEyOTA1NDU5OA== rabernat 1197350 2015-08-08T21:59:54Z 2015-08-08T21:59:54Z MEMBER

In fact I just found a netCDF issue on this topic! Apparently they don't think it should be supported. Unidata/netcdf4-python#442

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  time decoding error with "days since"  99836561


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);