Comment on pydata/xarray#827 (https://github.com/pydata/xarray/issues/827#issuecomment-211138272), posted 2016-04-18 by a project MEMBER:

Ah, I finally figured out what's going on.

We use pandas to clean up time units in an attempt to always write ISO-8601 compatible reference times. Unfortunately, pandas interprets dates like '1-1-1' or '01-JAN-0001' as January 1, 2001:

```
In [21]: pd.Timestamp('1-1-1 00:00:0.0')
Out[21]: Timestamp('2001-01-01 00:00:00')

In [25]: pd.Timestamp('01-JAN-0001 00:00:00')
Out[25]: Timestamp('2001-01-01 00:00:00')
```

One might argue this is a bug in pandas, but nonetheless that's what it does.

xarray can currently handle datetimes outside the range of dates handled by pandas (roughly 1700-2300), but only if pandas raises an OutOfBoundsDatetime error.

Two fixes that we need for this:
- use netCDF4's reference time decoding (if available) before trying to use pandas in `decode_cf_datetime`. Note that it is important to decode only the single reference time with netCDF4, because it's a lot faster to parse the remaining dates with vectorized pandas/numpy operations.
- stop using `_cleanup_netcdf_time_units`, since apparently it can go wrong.
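The first point above can be sketched as follows (a hypothetical illustration, not xarray's implementation; the reference time is assumed to be already parsed, e.g. by netCDF4, and the numeric offsets are then applied in one vectorized step):

```python
import numpy as np
import pandas as pd

def decode_times_sketch(num_values, ref_time, unit="s"):
    """Sketch of the proposed split: parse the single reference time
    once (slow, string-based), then convert the array of numeric
    offsets with vectorized numpy/pandas operations (fast)."""
    # Parse the lone reference time string once.
    ref = pd.Timestamp(ref_time).to_datetime64()
    # Vectorized conversion of all offsets in a single operation.
    offsets = np.asarray(num_values).astype("timedelta64[{}]".format(unit))
    return ref + offsets
```

For instance, `decode_times_sketch([0, 3600], "2000-01-01", "s")` yields midnight and 1 AM on 2000-01-01 as a `datetime64` array.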

cc @jhamman who has some experience with these issues
