
issue_comments


7 rows where issue = 363326726 and user = 6628425 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
787069302 https://github.com/pydata/xarray/issues/2437#issuecomment-787069302 https://api.github.com/repos/pydata/xarray/issues/2437 MDEyOklzc3VlQ29tbWVudDc4NzA2OTMwMg== spencerkclark 6628425 2021-02-27T12:59:35Z 2021-02-27T12:59:35Z MEMBER

@hafez-ahmad yes, I'm trying to help, but in order to do that I need more information. What does 456852 represent?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray potential inconstistencies with cftime 363326726
787065391 https://github.com/pydata/xarray/issues/2437#issuecomment-787065391 https://api.github.com/repos/pydata/xarray/issues/2437 MDEyOklzc3VlQ29tbWVudDc4NzA2NTM5MQ== spencerkclark 6628425 2021-02-27T12:27:41Z 2021-02-27T12:55:36Z MEMBER

Thanks @keewis.

@hafez-ahmad by Julian date do you mean that the time coordinate represents "days since -4713-01-01T12:00:00" in a Julian calendar?

Once we know the units (expressed as "{time_unit} since {reference_date}") and the calendar of the time coordinate, we can convert it to datetime objects via something like the following:

```python
units = "days since -4713-01-01T12:00:00"
calendar = "julian"
ds["time"] = ds.time.assign_attrs(units=units, calendar=calendar)
ds = xr.decode_cf(ds)
```

I'll admit though, with the values in your dataset, this assumption produces dates like cftime.DatetimeJulian(-4527, 1, 30, 12, 0, 0, 0), which feel unlikely to be correct. Perhaps you are using a different reference date?
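
To make the units convention concrete, here is a minimal standard-library sketch of how a "{time_unit} since {reference_date}" string can be split and applied. The helper name decode_standard is hypothetical, and the sketch only covers the proleptic Gregorian calendar that Python's datetime implements (years 1 to 9999), so it could not decode the Julian-calendar units above; for those, cftime is still needed.

```python
from datetime import datetime, timedelta


def decode_standard(value, units):
    """Decode one numeric time value given CF-style units (illustrative only)."""
    # Split "days since 2000-02-25" into the time unit and the reference date.
    time_unit, _, reference = units.partition(" since ")
    # timedelta accepts keyword names such as "days", "hours", "minutes", "seconds".
    offset = timedelta(**{time_unit: value})
    return datetime.fromisoformat(reference) + offset


# 2000 is a leap year in the standard calendar, so day 4 lands on Feb 29.
leap_day = decode_standard(4, "days since 2000-02-25")
noon_next_day = decode_standard(36, "hours since 2000-02-25")
```

The same numeric value can map to a different date under a different calendar, which is why both the units and the calendar need to be known before decoding.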

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray potential inconstistencies with cftime 363326726
787059252 https://github.com/pydata/xarray/issues/2437#issuecomment-787059252 https://api.github.com/repos/pydata/xarray/issues/2437 MDEyOklzc3VlQ29tbWVudDc4NzA1OTI1Mg== spencerkclark 6628425 2021-02-27T11:34:49Z 2021-02-27T11:34:49Z MEMBER

Could you show me what the output of ds.info() looks like for the dataset?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray potential inconstistencies with cftime 363326726
786561230 https://github.com/pydata/xarray/issues/2437#issuecomment-786561230 https://api.github.com/repos/pydata/xarray/issues/2437 MDEyOklzc3VlQ29tbWVudDc4NjU2MTIzMA== spencerkclark 6628425 2021-02-26T10:32:42Z 2021-02-26T10:32:42Z MEMBER

@hafez-ahmad could you provide more detail about your dataset? Does the "time" coordinate have associated "calendar" and "units" attributes?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray potential inconstistencies with cftime 363326726
461831985 https://github.com/pydata/xarray/issues/2437#issuecomment-461831985 https://api.github.com/repos/pydata/xarray/issues/2437 MDEyOklzc3VlQ29tbWVudDQ2MTgzMTk4NQ== spencerkclark 6628425 2019-02-08T15:05:38Z 2019-02-08T15:05:38Z MEMBER

With #2516 already in released versions of xarray, and #2593 and #2665 recently merged, this situation has been significantly improved. I think it is safe now to close this general issue. @sbiner thanks for starting this conversation; feel free to post other issues related to cftime if they come up.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray potential inconstistencies with cftime 363326726
424469494 https://github.com/pydata/xarray/issues/2437#issuecomment-424469494 https://api.github.com/repos/pydata/xarray/issues/2437 MDEyOklzc3VlQ29tbWVudDQyNDQ2OTQ5NA== spencerkclark 6628425 2018-09-25T19:23:25Z 2018-09-25T19:23:25Z MEMBER

@shoyer I agree that seems like a good idea at this stage. Now that there are a number of functions in xarray that do depend on differences between dates (as @sbiner notes: upsampling with resample, interp, and differentiate), which did not exist in the past, it is perhaps better that things error by default, rather than silently return potentially incorrect results if they have not yet been implemented for dates from non-standard calendars. Users can explicitly opt in to the old workaround if they feel it would be safe in their use cases.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray potential inconstistencies with cftime 363326726
424395224 https://github.com/pydata/xarray/issues/2437#issuecomment-424395224 https://api.github.com/repos/pydata/xarray/issues/2437 MDEyOklzc3VlQ29tbWVudDQyNDM5NTIyNA== spencerkclark 6628425 2018-09-25T15:41:49Z 2018-09-25T15:41:49Z MEMBER

@sbiner these are all reasonable points of confusion. The current behavior in xarray regarding non-standard calendars is complicated, and we are working toward improving the situation. I've tried to provide a recommended solution based on your example as well as some historical/future context. Apologies for the long-winded answer!

Recommendation

For accurate round-tripping of date types, I would recommend that you run your code to open the dataset with the xarray option enable_cftimeindex set to True (by default it is currently set to False; for why this is the case, see the note regarding the default behavior). In the case of your example this would look like:

```
In [1]: import cftime

In [2]: import numpy as np

In [3]: import xarray as xr

In [4]: units = 'days since 2000-02-25'

In [5]: times = cftime.num2date(np.arange(7), units=units, calendar='365_day')

In [6]: da = xr.DataArray(np.arange(7), coords=[times], dims=['time'], name='a')

In [7]: da.to_netcdf('data-noleap.nc')

In [8]: with xr.set_options(enable_cftimeindex=True):
   ...:     cftimeindex_enabled = xr.open_dataset('data-noleap.nc')
   ...:
```

Here we can see that the time index is a `CFTimeIndex`, and that the time coordinate contains instances of `cftime.DatetimeNoLeap` objects (as they were in the original DataArray we created):

```
In [9]: cftimeindex_enabled.indexes['time']
Out[9]:
CFTimeIndex([2000-02-25 00:00:00, 2000-02-26 00:00:00, 2000-02-27 00:00:00,
             2000-02-28 00:00:00, 2000-03-01 00:00:00, 2000-03-02 00:00:00,
             2000-03-03 00:00:00],
            dtype='object', name=u'time')

In [10]: cftimeindex_enabled.time[0]
Out[10]:
<xarray.DataArray 'time' ()>
array(cftime._cftime.DatetimeNoLeap(2000, 2, 25, 0, 0, 0, 0, 6, 56), dtype=object)
Coordinates:
    time     object 2000-02-25 00:00:00
```

Note that `resample` along a CFTimeIndex has not been implemented yet (#2191). Attempting to do so will raise an error. If you are interested in computing something as simple as a time series of annual means using `resample`, then you could work around that in the meantime by using `groupby`, for example:

```
In [11]: cftimeindex_enabled.groupby('time.year').mean('time')
Out[11]:
<xarray.Dataset>
Dimensions:  (year: 1)
Coordinates:
  * year     (year) int64 2000
Data variables:
    a        (year) float64 3.0
```

For more information on what is enabled and what is not enabled when using a CFTimeIndex for indexing, see this section in the documentation.
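
To see why the decoded index jumps from 2000-02-28 straight to 2000-03-01, here is a rough pure-Python sketch of decoding "days since 2000-02-25" in a 365-day calendar. It is only an illustration, assuming every year has exactly 365 days; the names DAYS_IN_MONTH, noleap_ordinal, and noleap_from_ordinal are made up for this example and are not part of cftime or xarray.

```python
# Month lengths in a no-leap (365_day) calendar: February always has 28 days.
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def noleap_ordinal(year, month, day):
    """Days elapsed since 0001-01-01 when every year has 365 days."""
    return (year - 1) * 365 + sum(DAYS_IN_MONTH[: month - 1]) + (day - 1)


def noleap_from_ordinal(ordinal):
    """Inverse of noleap_ordinal: return a (year, month, day) tuple."""
    years_elapsed, day_of_year = divmod(ordinal, 365)
    month = 1
    for length in DAYS_IN_MONTH:
        if day_of_year < length:
            break
        day_of_year -= length
        month += 1
    return years_elapsed + 1, month, day_of_year + 1


# Decode the values 0..6 relative to the reference date 2000-02-25.
reference = noleap_ordinal(2000, 2, 25)
decoded = [noleap_from_ordinal(reference + n) for n in range(7)]
```

The fifth entry of decoded is (2000, 3, 1): there is no February 29 to land on, even though 2000 is a leap year in the standard calendar.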

Default behavior

The default behavior can be traced back to the early days of xarray (see the original discussion in #118, #121, and #126). It boils down to coercing any dates decoded into cftime.datetime objects (formerly netCDF4.datetime) into np.datetime64[ns] objects whenever possible. If this coercion is not possible (e.g. a date in the file is not allowed in the standard calendar like 2000-02-30 in the case of a 360-day calendar, or a date has a year outside the range 1678-2262) then cftime.datetime objects are allowed to remain. In other words, by default xarray indeed does use cftime to decode the dates; however, after decoding, it will try its hardest to convert those dates into a friendly type for pandas.
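
The coercion rule described above can be sketched roughly as follows. This is not xarray's actual implementation; fits_datetime64_ns is a hypothetical helper that assumes a date can be coerced when it is valid in the standard calendar and its offset from 1970-01-01, counted in nanoseconds, fits in a signed 64-bit integer.

```python
from datetime import datetime

# Range of a signed 64-bit nanosecond count; the minimum value is excluded
# because numpy reserves it for NaT ("not a time").
NS_MIN = -(2**63) + 1
NS_MAX = 2**63 - 1
EPOCH = datetime(1970, 1, 1)


def fits_datetime64_ns(year, month, day):
    """Return True if the date could be represented as a datetime64[ns] value."""
    try:
        # Raises ValueError for dates that do not exist in the standard
        # calendar, such as 2000-02-30 from a 360-day calendar.
        candidate = datetime(year, month, day)
    except ValueError:
        return False
    delta = candidate - EPOCH
    nanoseconds = (delta.days * 86_400 + delta.seconds) * 10**9
    return NS_MIN <= nanoseconds <= NS_MAX


ordinary = fits_datetime64_ns(2000, 2, 25)      # valid date, in range
not_standard = fits_datetime64_ns(2000, 2, 30)  # only exists in a 360-day calendar
too_early = fits_datetime64_ns(1600, 1, 1)      # before the representable range
```

Dates for which a check like this fails are the ones that would remain cftime.datetime objects under the default behavior.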

The advantage of the default approach is that, when possible, it allows you to take advantage of all the nice features that a time coordinate indexed by a pandas.DatetimeIndex provides (like resample). A disadvantage is that the dates in memory may not have the same calendar type as those encoded in the file (e.g. if the dates in the file are from a non-standard calendar, like no leap). For operations that rely on computing differences between dates (e.g. differentiate or interp involving a time coordinate), this can lead to subtle (and silent) errors. Therefore, when using dates coerced into a DatetimeIndex from a non-standard calendar, one should use caution to only do operations that are independent of the calendar type (one notable exception here is that xarray does make an effort to encode these dates accurately when writing out to a netCDF file).
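
As a small illustration of the kind of silent error meant here, the sketch below compares the gap between 2000-02-28 and 2000-03-01 in a no-leap calendar with the gap Python's standard-calendar date type reports. The helper noleap_days_between and the table CUMULATIVE_DAYS are hypothetical names for this example only; the data values 3 and 4 are the ones from the example dataset at those two dates.

```python
from datetime import date

# Days elapsed before the start of each month in a 365-day (no-leap) year.
CUMULATIVE_DAYS = [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]


def noleap_days_between(start, end):
    """Difference in days between two (year, month, day) tuples, no leap years."""

    def ordinal(ymd):
        year, month, day = ymd
        return year * 365 + CUMULATIVE_DAYS[month - 1] + day

    return ordinal(end) - ordinal(start)


start, end = (2000, 2, 28), (2000, 3, 1)

# In the file's no-leap calendar these are consecutive days.
noleap_gap = noleap_days_between(start, end)

# After coercion to the standard calendar, Feb 29 sits between them.
standard_gap = (date(*end) - date(*start)).days

# A rate of change computed across this step differs by a factor of two.
rate_noleap = (4 - 3) / noleap_gap
rate_standard = (4 - 3) / standard_gap
```

Here noleap_gap is 1 and standard_gap is 2, so a derivative taken across that step would come out half as large after coercion, with no warning.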

Connecting back to your example, we can see that if we don't open the dataset with enable_cftimeindex=True, the dates are coerced to np.datetime64 objects and a DatetimeIndex is used:

```
In [12]: default = xr.open_dataset('data-noleap.nc')

In [13]: default.indexes['time']
Out[13]:
DatetimeIndex(['2000-02-25', '2000-02-26', '2000-02-27', '2000-02-28',
               '2000-03-01', '2000-03-02', '2000-03-03'],
              dtype='datetime64[ns]', name=u'time', freq=None)

In [14]: default.time[0]
Out[14]:
<xarray.DataArray 'time' ()>
array(951436800000000000L, dtype='datetime64[ns]')
Coordinates:
    time     datetime64[ns] 2000-02-25
```

In this case, as noted above, `resample` works:

```
In [15]: default.resample(time='Y').mean('time')
Out[15]:
<xarray.Dataset>
Dimensions:  (time: 1)
Coordinates:
  * time     (time) datetime64[ns] 2000-12-31
Data variables:
    a        (time) float64 3.0
```

Future behavior

In xarray we are slowly working towards better support for operations involving cftime.datetime objects (see #789, #1084, #1252, #2008, #2142, #2301, #2434). Eventually we would like to switch to using enable_cftimeindex=True as the default: in that case the behavior would be to use np.datetime64 objects (associated with DatetimeIndexes) only for standard calendars, and cftime.datetime objects (associated with CFTimeIndexes) for any other calendar types.

The two major outstanding issues on this front are probably:

- Adding resample functionality to CFTimeIndex (#2191)
- Plotting data with cftime.datetime coordinate axes in matplotlib or holoviews (#2164).

Once those two remaining issues are addressed, one should be able to do most of the significant things one can do with np.datetime64 dates with cftime.datetime dates (and therefore changing the default behavior would be justified).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray potential inconstistencies with cftime 363326726


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);