html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/7210#issuecomment-1292699390,https://api.github.com/repos/pydata/xarray/issues/7210,1292699390,IC_kwDOAMm_X85NDQb-,89428916,2022-10-26T21:56:38Z,2022-10-27T15:30:57Z,NONE,"I have two workarounds to share. Both monkey patch functions in `xarray.coding.times`, so I don't recommend actually using them, but they may spur some conversation about whether and how xarray might adopt them.

# Option 1:

This avoids the problem entirely by ""fixing"" the timestamp in the `units` attribute so that pandas can read it. It is probably a bit specific to this situation, but it works. It uses `cftime.num2date` to parse the `units` attribute and obtain the reference date. It then builds a new units string with an ISO-compliant date and calls the original `decode_cf_datetime`, which ends up using pandas.
```
import xarray as xr
import xarray.coding.times
import numpy as np
import cftime

orig_decode_cf_datetime = xarray.coding.times.decode_cf_datetime

def decode_cf_datetime(num_dates, units, calendar=None, use_cftime=None):
    if cftime is not None:
        reference_time = cftime.num2date(0, units, calendar)
        units = f""{units.split('since')[0]} since {reference_time}""
    return orig_decode_cf_datetime(num_dates, units, calendar, use_cftime)

xarray.coding.times.decode_cf_datetime = decode_cf_datetime

fill_val = -99999.0
time_vals = np.random.randint(0, 1000, 10)
time_vals[1] = fill_val
data_vars = {
    'foo': (['x'], np.random.rand(10)),
    'time': (
        ['x'],
        time_vals,
        {
            'units': 'seconds since 2000-1-1 0:0:0 0',
            '_FillValue': fill_val,
            'scale_factor': 1.0,
            'add_offset': 0.0,
            'standard_name': 'time',
            'calendar': 'standard',
            'Axis': 'T',
            'coverage_content_type': 'coordinate',
        }
    ),
}
ds = xr.Dataset(
    data_vars=data_vars,
    coords={'x': (['x'], np.arange(10))}
)
nc_out_location = '/tmp/example.nc'
ds.to_netcdf(nc_out_location)
ds = xr.open_dataset(nc_out_location)
print(ds['time'])
```
Console Output
```
array(['2000-01-01T00:03:18.000000000', 'NaT',
       '2000-01-01T00:06:58.000000000', '2000-01-01T00:07:32.000000000',
       '2000-01-01T00:09:07.000000000', '2000-01-01T00:04:28.000000000',
       '2000-01-01T00:11:04.000000000', '2000-01-01T00:12:42.000000000',
       '2000-01-01T00:05:03.000000000', '2000-01-01T00:11:07.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
Attributes:
    standard_name:          time
    Axis:                   T
    coverage_content_type:  coordinate
```
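
To make the idea behind Option 1 easier to see in isolation, here is a minimal standard-library sketch of rewriting a loosely formatted `units` string with an ISO-style reference date. It is not what the patch above does internally (that relies on `cftime.num2date`). The helper name `normalize_units` and the regular expression are hypothetical, and the sketch assumes the trailing integer in the units string is a whole-hour UTC offset.

```python
import re
from datetime import datetime, timedelta

# Hypothetical pattern for '<unit> since Y-M-D [H:M:S] [tz]'; not taken from
# xarray or cftime, and much less permissive than either of them.
_UNITS_RE = re.compile(
    r'^\s*(?P<unit>\w+)\s+since\s+'
    r'(?P<y>\d{1,4})-(?P<m>\d{1,2})-(?P<d>\d{1,2})'
    r'(?:[ T](?P<H>\d{1,2}):(?P<M>\d{1,2}):(?P<S>\d{1,2}))?'
    r'(?:\s+(?P<tz>[+-]?\d{1,2}))?\s*$'
)


def normalize_units(units):
    '''Return units with a zero-padded, ISO-style reference timestamp.'''
    match = _UNITS_RE.match(units)
    if match is None:
        # Leave anything this sketch does not understand untouched.
        return units
    # Missing time-of-day and offset parts default to zero.
    parts = match.groupdict(default='0')
    try:
        reference = datetime(
            int(parts['y']), int(parts['m']), int(parts['d']),
            int(parts['H']), int(parts['M']), int(parts['S']),
        )
    except ValueError:
        # Out-of-range fields (for example year 0): give up and pass through.
        return units
    # Assumption: the trailing integer is a whole-hour UTC offset, so the
    # reference is shifted back to UTC before it is written out again.
    reference -= timedelta(hours=int(parts['tz']))
    return '{} since {}'.format(parts['unit'], reference.isoformat(sep=' '))


print(normalize_units('seconds since 2000-1-1 0:0:0 0'))
```

Strings the pattern does not recognise are returned unchanged, so a wrapper built this way would fall back to the existing decoding path rather than fail.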
# Option 2:

This option replaces `NaN` with `0` before handing the values to `cftime.num2date`. Once the dates are converted, it puts the `NaN`s back where they were.

```
import xarray as xr
import xarray.coding.times
import numpy as np
import cftime

def _decode_datetime_with_cftime(num_dates, units, calendar):
    if cftime is None:
        raise ModuleNotFoundError(""No module named 'cftime'"")
    # Remember where the NaNs are, then fill them so cftime can convert.
    indices_of_nan = np.argwhere(np.isnan(num_dates))
    num_dates.put(indices_of_nan, 0)
    as_dates = np.asarray(cftime.num2date(num_dates, units, calendar))
    # Restore the missing values in the converted (object) array.
    as_dates.put(indices_of_nan, np.nan)
    return as_dates

xarray.coding.times._decode_datetime_with_cftime = _decode_datetime_with_cftime

fill_val = -99999.0
time_vals = np.random.randint(0, 1000, 10)
time_vals[1] = fill_val
data_vars = {
    'foo': (['x'], np.random.rand(10)),
    'time': (
        ['x'],
        time_vals,
        {
            'units': 'seconds since 2000-1-1 0:0:0 0',
            '_FillValue': fill_val,
            'scale_factor': 1.0,
            'add_offset': 0.0,
            'standard_name': 'time',
            'calendar': 'standard',
            'Axis': 'T',
            'coverage_content_type': 'coordinate',
        }
    ),
}
ds = xr.Dataset(
    data_vars=data_vars,
    coords={'x': (['x'], np.arange(10))}
)
nc_out_location = '/tmp/example.nc'
ds.to_netcdf(nc_out_location)
ds = xr.open_dataset(nc_out_location, use_cftime=True)
print(ds['time'])
```
Console Output
```
array([cftime.DatetimeGregorian(2000, 1, 1, 0, 12, 33, 0, has_year_zero=False),
       nan,
       cftime.DatetimeGregorian(2000, 1, 1, 0, 10, 50, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 1, 1, 0, 6, 27, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 1, 1, 0, 0, 44, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 1, 1, 0, 11, 18, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 1, 1, 0, 0, 35, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 1, 1, 0, 12, 37, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 1, 1, 0, 14, 44, 0, has_year_zero=False),
       cftime.DatetimeGregorian(2000, 1, 1, 0, 11, 17, 0, has_year_zero=False)],
      dtype=object)
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
Attributes:
    standard_name:          time
    Axis:                   T
    coverage_content_type:  coordinate
```
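
The fill-then-restore pattern in Option 2 can also be sketched without xarray or cftime. In the example below, `convert_all` is a hypothetical stand-in for a converter that rejects `NaN`, and `decode_with_missing` is an illustrative name rather than an xarray function. This sketch uses `None` as the missing marker and works on a copy of the input, whereas the patch above fills the `NaN`s in place and restores `np.nan`.

```python
import math
from datetime import datetime, timedelta


def convert_all(offsets, reference):
    # Stand-in converter: timedelta raises ValueError on NaN, so it cannot
    # be handed missing values directly.
    return [reference + timedelta(seconds=value) for value in offsets]


def decode_with_missing(offsets, reference):
    '''Convert second offsets to datetimes, keeping missing entries missing.'''
    # 1. Remember where the NaNs are.
    missing = [i for i, value in enumerate(offsets) if math.isnan(value)]
    # 2. Fill them with a harmless placeholder on a copy of the input.
    filled = list(offsets)
    for i in missing:
        filled[i] = 0.0
    # 3. Run the NaN-intolerant conversion.
    decoded = convert_all(filled, reference)
    # 4. Put a missing marker back at the remembered positions.
    for i in missing:
        decoded[i] = None
    return decoded


values = [198.0, float('nan'), 418.0]
print(decode_with_missing(values, datetime(2000, 1, 1)))
```

Working on a copy means the caller's values still contain their `NaN`s afterwards, which may matter if the same array is reused after decoding.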
Option 2 has one more problem: the validation code at https://github.com/pydata/xarray/blob/076bd8e15f04878d7b97100fb29177697018138f/xarray/coding/times.py#L281-L296 runs after `_decode_datetime_with_cftime` is called. That code calls `cftime_to_nptime`, which does not handle `NaN`s. However, the validation can be skipped by explicitly setting `use_cftime=True`, after which decoding works as expected.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1421718311
https://github.com/pydata/xarray/issues/7210#issuecomment-1290989864,https://api.github.com/repos/pydata/xarray/issues/7210,1290989864,IC_kwDOAMm_X85M8vEo,11022336,2022-10-25T18:41:19Z,2022-10-25T18:41:19Z,NONE,"After further investigation, this issue seems to arise if any of the values in the `time` variable are masked, not just the first and last values. It looks like `cftime` doesn't support NaNs, which is why this error is occurring.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1421718311