issue_comments

2 rows where author_association = "MEMBER" and issue = 702373263 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
696423669 https://github.com/pydata/xarray/issues/4427#issuecomment-696423669 https://api.github.com/repos/pydata/xarray/issues/4427 MDEyOklzc3VlQ29tbWVudDY5NjQyMzY2OQ== spencerkclark 6628425 2020-09-21T23:00:54Z 2020-09-21T23:04:56Z MEMBER

That would be great @andrewpauling! I think this is the relevant code in xarray: https://github.com/pydata/xarray/blob/1155f5646e07100e4acda18db074b148f1213b5d/xarray/core/variable.py#L244-L250

Arguably we could use the _possibly_convert_objects function on datetime64 and timedelta64 data as well; you'll see that it goes through a pandas.Series to do the casting, which has built-in logic to check that the values can be represented with nanosecond precision. But it's up to you how you ultimately want to go about things.

I agree this casting behavior is a bit surprising. If we wanted to be a little more transparent, we could also warn when attempting to cast non-nanosecond-precision datetime64 data to nanosecond precision. I'm not sure what others think; I know pandas doesn't do this, but it could be friendlier for users.
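
To make the warning idea concrete, here is a minimal sketch. The helper name `as_nanosecond_with_warning` and the warning text are hypothetical, chosen for illustration only; this is not xarray's actual code, just one way such a check might look.

```python
import warnings

import numpy as np

NS_DTYPE = np.dtype("datetime64[ns]")


def as_nanosecond_with_warning(values):
    """Cast datetime64 data to nanosecond precision, warning if a cast is needed.

    Hypothetical helper for illustration; not part of xarray.
    """
    values = np.asarray(values)
    # dtype.kind == "M" identifies datetime64 data of any precision.
    if values.dtype.kind == "M" and values.dtype != NS_DTYPE:
        warnings.warn(
            f"Converting {values.dtype} data to {NS_DTYPE}.",
            UserWarning,
            stacklevel=2,
        )
        return values.astype(NS_DTYPE)
    return values
```

Data that is already datetime64[ns] passes through without a warning. Note that this sketch only warns; it does not check whether the values fit in the nanosecond range.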

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  assign_coords with datetime64[us] changes dtype to datetime64[ns] 702373263
695436235 https://github.com/pydata/xarray/issues/4427#issuecomment-695436235 https://api.github.com/repos/pydata/xarray/issues/4427 MDEyOklzc3VlQ29tbWVudDY5NTQzNjIzNQ== spencerkclark 6628425 2020-09-20T01:23:47Z 2020-09-20T01:23:47Z MEMBER

Thanks @andrewpauling -- I do think there's a bug here, but this issue happens to be more complicated than it might seem on the surface :).

Xarray standardizes around nanosecond precision for np.datetime64 dtypes, and casts any NumPy array of dtype datetime64 to nanosecond precision. This is mainly motivated by pandas -- pandas requires nanosecond precision -- which xarray relies on for time indexing and other time-related operations through things like pandas.DatetimeIndex or the pandas.Series.dt accessor. As you've noted this is unfortunate since it limits the supported time range for np.datetime64 types (see, e.g., discussion in https://github.com/pydata/xarray/issues/789).
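
To illustrate the range limitation, the following sketch uses only NumPy to compute the earliest and latest timestamps that a 64-bit nanosecond count can represent, and shows that microsecond precision can hold a date far outside that window. The variable names are illustrative.

```python
import numpy as np

int64 = np.iinfo(np.int64)

# The smallest int64 value is reserved for NaT, so the earliest valid
# nanosecond timestamp is one tick after it.
earliest_ns = np.datetime64(int64.min + 1, "ns")
latest_ns = np.datetime64(int64.max, "ns")

print(earliest_ns)  # 1677-09-21T00:12:43.145224193
print(latest_ns)    # 2262-04-11T23:47:16.854775807

# Microsecond precision covers a much wider span, e.g. year 1.
year_one_us = np.datetime64("0001-01-01", "us")
print(year_one_us.dtype)  # datetime64[us]
```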

Addressing this fully would be a challenge (we've discussed this at times in the past). We concluded that cftime dates would be used for dates outside the representable range, and that over time we would build up infrastructure to enable some of the nice things you can do with np.datetime64 types with cftime objects. The functionality now largely exists, and a nice benefit of doing this through cftime is that we also gain compatibility with non-standard calendar types, e.g. DatetimeNoLeap. I encourage you to try to take advantage of that, and please let us know if there is something missing that you would like to see implemented or improved!

This is a long way of saying that, without a fair amount of work (i.e. addressing this issue upstream in pandas), xarray is unlikely to relax its approach to the precision of np.datetime64 dtypes, and will continue casting to nanosecond precision.

However, the fact that your example silently results in nonsensical times should be considered a bug; instead, following pandas, I would argue we should raise an error if the dates cannot be represented with nanosecond precision.
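
A minimal sketch of what such a check might look like, using only NumPy. The helper name `cast_to_ns_checked`, the unit table, and the error message are hypothetical and not taken from xarray or pandas; the sketch only handles fixed-length units (days and finer) and does the bounds arithmetic with Python integers so that the check itself cannot overflow.

```python
import numpy as np

# Nanoseconds per unit, for the fixed-length units this sketch supports.
_NS_PER_UNIT = {
    "ns": 1,
    "us": 10**3,
    "ms": 10**6,
    "s": 10**9,
    "m": 60 * 10**9,
    "h": 3600 * 10**9,
    "D": 86400 * 10**9,
}

_NS_MAX = np.iinfo(np.int64).max
_NS_MIN = np.iinfo(np.int64).min + 1  # the int64 minimum itself encodes NaT


def cast_to_ns_checked(values):
    """Cast datetime64 data to datetime64[ns], raising if any value overflows.

    Hypothetical helper for illustration; not part of xarray.
    """
    values = np.asarray(values)
    unit, count = np.datetime_data(values.dtype)
    factor = _NS_PER_UNIT[unit] * count
    # Ignore NaT entries and work on the underlying integer tick counts.
    ticks = values[~np.isnat(values)].astype("int64")
    if ticks.size:
        # Python ints have arbitrary precision, so these products are exact.
        lowest = int(ticks.min()) * factor
        highest = int(ticks.max()) * factor
        if lowest < _NS_MIN or highest > _NS_MAX:
            raise ValueError(
                f"Cannot represent {values.dtype} values with nanosecond precision."
            )
    return values.astype("datetime64[ns]")
```

With this sketch, a datetime64[us] array containing a date such as 0001-01-01 raises a ValueError instead of being cast, while dates inside the nanosecond range are converted as before.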

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  assign_coords with datetime64[us] changes dtype to datetime64[ns] 702373263

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);