issues: 1696097756

id: 1696097756
node_id: I_kwDOAMm_X85lGGXc
number: 7817
title: nanosecond precision lost when reading time data
user: 5821660
state: closed
locked: 0
assignee:
milestone:
comments: 3
created_at: 2023-05-04T14:06:17Z
updated_at: 2023-09-17T08:15:27Z
closed_at: 2023-09-17T08:15:27Z
author_association: MEMBER
active_lock_reason:
draft:
pull_request:
body:

What happened?

When nanosecond-precision time data is read from a netCDF file, the precision is lost. This happens because CFMaskCoder converts the variable to floating point and inserts NaN for masked values. CFDatetimeCoder then casts the floating-point values back to int64 before transforming them into datetime64. This cast is sometimes undefined, hence #7098.
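The round trip described above can be reproduced with NumPy alone. The following is a minimal sketch that mimics the two casts, not xarray's actual coder code; the timestamp is an arbitrary example value.

```python
import numpy as np

# One nanosecond-resolution timestamp, stored as an int64 count since 1970-01-01.
# (Arbitrary example value, chosen only for illustration.)
stamp = np.array(["2023-05-04T14:06:17.123456789"], dtype="M8[ns]")
raw = stamp.astype(np.int64)

# Step 1: masking promotes the data to float64 so that NaN can mark missing values.
as_float = raw.astype(np.float64)

# Step 2: time decoding casts the float back to int64.
round_trip = as_float.astype(np.int64)

# Near 1.7e18, neighbouring float64 values are 256 apart,
# so an odd nanosecond count cannot survive the round trip.
print(np.spacing(as_float[0]))           # 256.0
print(np.array_equal(raw, round_trip))   # False
print(round_trip.astype("M8[ns]")[0])    # no longer ends in .123456789
```

With a spacing of 256 the round trip can shift a present-day timestamp by up to 128 ns, even though nothing in the data was actually masked.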

What did you expect to happen?

Precision should be preserved. The transformation to floating point should be omitted.
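One way to picture the expected behaviour is to mask in the integer domain and write NaT directly, so that no float conversion is needed. The sketch below assumes the units are exactly "nanoseconds since 1970-01-01" (no offset or scaling); `decode_ns_without_float` is a hypothetical helper for illustration, not part of xarray.

```python
import numpy as np


def decode_ns_without_float(raw, fill_value):
    """Hypothetical helper: decode int64 nanosecond counts without using float.

    Assumes the counts are nanoseconds since 1970-01-01, so they map
    one-to-one onto datetime64[ns].
    """
    raw = np.asarray(raw, dtype=np.int64)
    decoded = raw.astype("M8[ns]")                      # copy as datetime64[ns]
    decoded[raw == fill_value] = np.datetime64("NaT")   # mask via integer comparison
    return decoded


# _FillValue taken from the issue's log output (1900-01-01 in nanoseconds).
fill = -2208988800000000000
raw = np.array(
    [1683209177123456789, fill, 1683209177123456790], dtype=np.int64
)
decoded = decode_ns_without_float(raw, fill)
print(decoded)
```

Because the comparison and the cast both stay in int64, the two valid timestamps remain exactly one nanosecond apart and only the fill value becomes NaT.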

Minimal Complete Verifiable Example

```Python
import xarray as xr
import numpy as np
import netCDF4 as nc
import matplotlib.pyplot as plt

# create time array and fillvalue
min_ns = -9223372036854775808
max_ns = 9223372036854775807
cnt = 2000
time_arr = np.arange(min_ns, min_ns + cnt, dtype=np.int64).astype("M8[ns]")
fill_value = np.datetime64("1900-01-01", "ns")

# create ncfile with time with attached _FillValue
with nc.Dataset("test.nc", mode="w") as ds:
    ds.createDimension("x", cnt)
    time = ds.createVariable("time", "<i8", ("x",), fill_value=fill_value)
    time[:] = time_arr
    time.units = "nanoseconds since 1970-01-01"

# normal decoding
with xr.open_dataset("test.nc").load() as xr_ds:
    print("--- normal decoding ----------------------")
    print(xr_ds["time"])
    plt.plot(xr_ds["time"].values.astype(np.int64) + max_ns, color="g", label="normal")

# no decoding
with xr.open_dataset("test.nc", decode_cf=False).load() as xr_ds:
    print("--- no decoding ----------------------")
    print(xr_ds["time"])
    plt.plot(xr_ds["time"].values + max_ns, lw=5, color="b", label="raw")

# do not decode times, this shows how the CFMaskCoder converts
# the array to floating point before it would run CFDatetimeCoder
with xr.open_dataset("test.nc", decode_times=False).load() as xr_ds:
    print("--- no time decoding ----------------------")
    print(xr_ds["time"])

# do not run CFMaskCoder to show that times will be converted nicely
# with CFDatetimeCoder
with xr.open_dataset("test.nc", mask_and_scale=False).load() as xr_ds:
    print("--- no masking ------------------------------")
    print(xr_ds["time"])
    plt.plot(xr_ds["time"].values.astype(np.int64) + max_ns, lw=2, color="r", label="nomask")

plt.legend()
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

--- normal decoding ----------------------
<xarray.DataArray 'time' (x: 2000)>
array([                          'NaT',                           'NaT',
                                 'NaT', ..., '1677-09-21T00:12:43.145226240',
       '1677-09-21T00:12:43.145226240', '1677-09-21T00:12:43.145226240'],
      dtype='datetime64[ns]')
Dimensions without coordinates: x
--- no decoding ----------------------
<xarray.DataArray 'time' (x: 2000)>
array([-9223372036854775808, -9223372036854775807, -9223372036854775806,
       ..., -9223372036854773811, -9223372036854773810,
       -9223372036854773809])
Dimensions without coordinates: x
Attributes:
    _FillValue:  -2208988800000000000
    units:       nanoseconds since 1970-01-01
--- no time decoding ----------------------
<xarray.DataArray 'time' (x: 2000)>
array([-9.22337204e+18, -9.22337204e+18, -9.22337204e+18, ...,
       -9.22337204e+18, -9.22337204e+18, -9.22337204e+18])
Dimensions without coordinates: x
Attributes:
    units:    nanoseconds since 1970-01-01
--- no masking ------------------------------
<xarray.DataArray 'time' (x: 2000)>
array([                          'NaT', '1677-09-21T00:12:43.145224193',
       '1677-09-21T00:12:43.145224194', ..., '1677-09-21T00:12:43.145226189',
       '1677-09-21T00:12:43.145226190', '1677-09-21T00:12:43.145226191'],
      dtype='datetime64[ns]')
Dimensions without coordinates: x
Attributes:
    _FillValue:  -2208988800000000000
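The repeated value `...145226240` in the normally decoded output is consistent with float64 resolution near the lower int64 bound. The sketch below rebuilds the same 2000 raw counts as the example and casts them to float64 the way the masking step would; it is an illustration with NumPy only and does not call xarray.

```python
import numpy as np

min_ns = -9223372036854775808
cnt = 2000

# Same raw integer counts that the example writes to test.nc.
raw = np.arange(min_ns, min_ns + cnt, dtype=np.int64)
as_float = raw.astype(np.float64)

# Distance between neighbouring float64 values at this magnitude.
print(np.spacing(np.abs(as_float[-1])))   # 1024.0

# All 2000 distinct integers collapse onto three float values.
offsets = np.unique(as_float) - float(min_ns)
print(offsets)                            # [   0. 1024. 2048.]
```

The undecoded counts start at `...145224192` (one below the `...145224193` shown for the second element without masking), so the largest of the three float values lands 2048 ns later, at `...145226240`, which is the timestamp repeated at the end of the normally decoded array.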

Anything else we need to know?

Plot from above code:

Xref: #7098, https://github.com/pydata/xarray/issues/7790#issuecomment-1531050846

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.11.0 | packaged by conda-forge | (main, Jan 14 2023, 12:27:40) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 5.14.21-150400.24.60-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: ('de_DE', 'UTF-8')
libhdf5: 1.14.0
libnetcdf: 4.9.2

xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.24.2
scipy: 1.10.1
netCDF4: 1.6.3
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.14.2
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.3.1
distributed: 2023.3.1
matplotlib: 3.7.1
cartopy: 0.21.1
seaborn: None
numbagg: None
fsspec: 2023.3.0
cupy: 11.6.0
pint: 0.20.1
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.6.0
pip: 23.0.1
conda: None
pytest: 7.2.2
mypy: None
IPython: 8.11.0
sphinx: None

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7817/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app:
state_reason: completed
repo: 13221727
type: issue
