issue_comments: 695436235

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/4427#issuecomment-695436235 https://api.github.com/repos/pydata/xarray/issues/4427 695436235 MDEyOklzc3VlQ29tbWVudDY5NTQzNjIzNQ== 6628425 2020-09-20T01:23:47Z 2020-09-20T01:23:47Z MEMBER

Thanks @andrewpauling -- I do think there's a bug here, but this issue happens to be more complicated than it might seem on the surface :).

Xarray standardizes around nanosecond precision for np.datetime64 dtypes, and casts any NumPy array of dtype datetime64 to nanosecond precision. This is mainly motivated by pandas, which requires nanosecond precision and which xarray relies on for time indexing and other time-related operations through things like pandas.DatetimeIndex or the pandas.Series.dt accessor. As you've noted, this is unfortunate since it limits the supported time range for np.datetime64 types (see, e.g., the discussion in https://github.com/pydata/xarray/issues/789).
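
To make the range limit concrete, here is a rough pure-Python sketch (not xarray or pandas code) that derives the approximate bounds of a signed 64-bit count of nanoseconds since 1970-01-01, which is how nanosecond-precision np.datetime64 values are stored. The names EPOCH, NS_MAX, and ns_bounds are made up for illustration, and the results are truncated to microseconds because datetime.datetime cannot hold nanoseconds.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
NS_MAX = 2**63 - 1  # largest int64; the smallest int64 is reserved for NaT


def ns_bounds():
    """Approximate (earliest, latest) datetimes for int64 nanoseconds since the epoch."""
    # datetime only resolves microseconds, so drop the sub-microsecond part.
    span = timedelta(microseconds=NS_MAX // 1000)
    return EPOCH - span, EPOCH + span


earliest, latest = ns_bounds()
print(earliest.year, latest.year)
```

The usable window is therefore only about 584 years wide, running from 1677 to 2262.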

Addressing this fully would be a challenge (we've discussed this at times in the past). It was concluded that cftime dates would be used for dates outside the representable range, and that over time we would build up infrastructure to enable some of the nice things you can do with np.datetime64 types with cftime objects. That functionality now largely exists, and a nice benefit of doing this through cftime is that we also gain compatibility with non-standard calendar types, e.g. DatetimeNoLeap. I encourage you to try to take advantage of that, and please let us know if there is something missing that you would like to see implemented or improved!

This is a long way of saying that, without a fair amount of work (i.e. addressing this issue upstream in pandas), xarray is unlikely to relax its approach to the precision of np.datetime64 dtypes, and will continue casting to nanosecond precision.

However, the fact that your example silently results in nonsensical times should be considered a bug; instead, following pandas, I would argue we should raise an error if the dates cannot be represented with nanosecond precision.
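
One way to picture the proposed behavior is a bounds check that fails loudly instead of wrapping around. The following is only a sketch in plain Python, not xarray's or pandas' implementation; check_ns_representable and the bounds constants are hypothetical names, and the bounds are rounded inward to whole microseconds.

```python
from datetime import datetime, timedelta

_EPOCH = datetime(1970, 1, 1)
_SPAN = timedelta(microseconds=(2**63 - 1) // 1000)
NS_MIN, NS_MAX = _EPOCH - _SPAN, _EPOCH + _SPAN


def check_ns_representable(dates):
    """Raise ValueError if any date falls outside the int64-nanosecond range."""
    dates = list(dates)
    for date in dates:
        if not NS_MIN <= date <= NS_MAX:
            raise ValueError(
                f"{date.isoformat()} cannot be represented with nanosecond precision"
            )
    return dates
```

With this check, a date such as datetime(1000, 1, 1) would raise a ValueError up front, while dates inside the window pass through unchanged.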

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  702373263