

issue_comments: 299819380


html_url: https://github.com/pydata/xarray/issues/1399#issuecomment-299819380
issue_url: https://api.github.com/repos/pydata/xarray/issues/1399
id: 299819380
node_id: MDEyOklzc3VlQ29tbWVudDI5OTgxOTM4MA==
user: 102827
created_at: 2017-05-08T09:32:58Z
updated_at: 2017-05-08T09:32:58Z
author_association: CONTRIBUTOR

body:

Hmm... the "nanosecond" issue seems to need a fix at the very foundation. As long as pandas and xarray rely on datetime64[ns], you cannot avoid nanoseconds, right? pd.to_datetime() forces the conversion to nanoseconds even if you pass integers with a time unit other than ns. This does not make me as nervous as Fabien, since my data is always quite recent, but I see that this is far from ideal for a tool aimed at climate scientists.
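A quick way to see this forced conversion (a minimal demonstration, not part of the original comment):

```python
import pandas as pd

# Even when the input unit is days, the result has nanosecond resolution:
# pandas converts everything to datetime64[ns] under the hood.
idx = pd.to_datetime([0, 1, 2], unit="D")
print(idx.dtype)  # datetime64[ns]
```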

An intermediate fix (@shoyer, do you actually want one?) that I could think of for the performance issue right now would be to make the conversion to datetime64[ns] depend on the time unit, e.g.:

  • for units smaller than days (or hours?), multiply the raw values (most likely floats) by the number of nanoseconds in the time unit and pass these values as integers to pd.to_datetime() (see the sketch after this list)
  • otherwise, i.e. for months and years, fall back to netCDF4/netcdftime (as suggested by @shoyer), casting the raw values to floats
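A minimal sketch of the first branch, assuming a hypothetical helper decode_to_ns() and fixed-length units (the scaling table and error handling are illustrative, not a definitive implementation):

```python
import numpy as np
import pandas as pd

# Assumed nanoseconds per unit; treats days and hours as fixed-length,
# which matches datetime64's calendar but ignores leap seconds.
NS_PER_UNIT = {
    "seconds": 1_000_000_000,
    "minutes": 60 * 1_000_000_000,
    "hours": 3_600 * 1_000_000_000,
    "days": 86_400 * 1_000_000_000,
}

def decode_to_ns(raw_values, unit, reference="1970-01-01"):
    """Hypothetical helper: raw numeric offsets -> datetime64[ns]."""
    if unit not in NS_PER_UNIT:
        # Months/years would fall back to netCDF4/netcdftime instead.
        raise NotImplementedError("use netCDF4/netcdftime for %r" % unit)
    ns = np.asarray(raw_values, dtype="float64") * NS_PER_UNIT[unit]
    # Round before casting so fractional inputs like 0.5 days survive.
    offsets = pd.to_timedelta(np.round(ns).astype("int64"), unit="ns")
    return pd.Timestamp(reference) + offsets

# e.g. decode_to_ns([0.0, 0.5, 1.0], "days") ->
# DatetimeIndex(['1970-01-01', '1970-01-01 12:00', '1970-01-02'], ...)
```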

The only thing that bothers me is that I am not sure whether the "number of nanoseconds" is always the same for every day or hour from datetime64's point of view, due to leap seconds or other particularities.
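One way to probe what datetime64 itself assumes (an illustrative check of the fixed-length question, not a general statement about leap-second handling):

```python
import numpy as np

# numpy's timedelta64 arithmetic treats a day/hour as a constant number
# of nanoseconds, so the scaling factors can be derived directly from it.
print(np.timedelta64(1, "D") // np.timedelta64(1, "ns"))  # 86400000000000
print(np.timedelta64(1, "h") // np.timedelta64(1, "ns"))  # 3600000000000
```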

@shoyer: Does this sound reasonable or did I forget to take into account any side effects?

reactions:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

issue: 226549366