issues: 1696097756
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1696097756 | I_kwDOAMm_X85lGGXc | 7817 | nanosecond precision lost when reading time data | 5821660 | closed | 0 | 3 | 2023-05-04T14:06:17Z | 2023-09-17T08:15:27Z | 2023-09-17T08:15:27Z | MEMBER | What happened?

When reading nanosecond precision time data from netcdf the precision is lost. This happens because CFMaskCoder will convert the variable to floating point and insert "NaN". In CFDatetimeCoder the floating point is cast back to int64 to transform into datetime64. This casting is sometimes undefined, hence #7098.

What did you expect to happen?

Precision should be preserved. The transformation to floating point should be omitted.

Minimal Complete Verifiable Example

```Python
import xarray as xr
import numpy as np
import netCDF4 as nc
import matplotlib.pyplot as plt

# create time array and fillvalue
min_ns = -9223372036854775808
max_ns = 9223372036854775807
cnt = 2000
time_arr = np.arange(min_ns, min_ns + cnt, dtype=np.int64).astype("M8[ns]")
fill_value = np.datetime64("1900-01-01", "ns")

# create ncfile with time with attached _FillValue
with nc.Dataset("test.nc", mode="w") as ds:
    ds.createDimension("x", cnt)
    time = ds.createVariable("time", "<i8", ("x",), fill_value=fill_value)
    time[:] = time_arr
    time.units = "nanoseconds since 1970-01-01"

# normal decoding
with xr.open_dataset("test.nc").load() as xr_ds:
    print("--- normal decoding ----------------------")
    print(xr_ds["time"])
    plt.plot(xr_ds["time"].values.astype(np.int64) + max_ns, color="g", label="normal")

# no decoding
with xr.open_dataset("test.nc", decode_cf=False).load() as xr_ds:
    print("--- no decoding ----------------------")
    print(xr_ds["time"])
    plt.plot(xr_ds["time"].values + max_ns, lw=5, color="b", label="raw")

# do not decode times, this shows how the CFMaskCoder converts
# the array to floating point before it would run CFDatetimeCoder
with xr.open_dataset("test.nc", decode_times=False).load() as xr_ds:
    print("--- no time decoding ----------------------")
    print(xr_ds["time"])

# do not run CFMaskCoder to show that times will be converted nicely
# with CFDatetimeCoder
with xr.open_dataset("test.nc", mask_and_scale=False).load() as xr_ds:
    print("--- no masking ------------------------------")
    print(xr_ds["time"])
    plt.plot(xr_ds["time"].values.astype(np.int64) + max_ns, lw=2, color="r", label="nomask")
plt.legend()
```

MVCE confirmation
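
The precision loss described under "What happened?" can be illustrated without xarray or netCDF. The following is a minimal sketch, not a reproduction of what CFMaskCoder or CFDatetimeCoder actually do: it only assumes that an int64 nanosecond count passes through a 64-bit float and is cast back. Python's built-in `float` is an IEEE 754 double, the same format as NumPy's `float64`. The helper name `roundtrip_through_float` is hypothetical and exists only for this illustration.

```python
# Sketch only: shows the resolution limit of a 64-bit float for large
# integers. It does not call xarray and does not model NaN masking.

MIN_NS = -(2**63)  # smallest int64 value, matching min_ns in the example above


def roundtrip_through_float(value: int) -> int:
    """Convert an integer to a 64-bit float and back (hypothetical helper)."""
    return int(float(value))


# First few nanosecond counts of the kind used in the example's time array.
originals = [MIN_NS + i for i in range(5)]
recovered = [roundtrip_through_float(v) for v in originals]

for before, after in zip(originals, recovered):
    status = "exact" if before == after else "lost"
    print(f"{before} -> {after} ({status})")

# A double has a 53-bit significand, so integers are only guaranteed to
# survive the round trip while their magnitude stays at or below 2**53.
SAFE = 2**53
print(roundtrip_through_float(SAFE) == SAFE)          # True
print(roundtrip_through_float(SAFE + 1) == SAFE + 1)  # False
```

Near the int64 limits, neighbouring doubles are 1024 apart, so consecutive nanosecond values collapse onto the same float. This is consistent with the issue's observation that decoding with `mask_and_scale=False`, which skips the floating-point step, keeps the values intact.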
Relevant log output

Anything else we need to know?

Plot from above code:

Xref: #7098, https://github.com/pydata/xarray/issues/7790#issuecomment-1531050846

Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.0 | packaged by conda-forge | (main, Jan 14 2023, 12:27:40) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 5.14.21-150400.24.60-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: ('de_DE', 'UTF-8')
libhdf5: 1.14.0
libnetcdf: 4.9.2
xarray: 2023.4.2
pandas: 2.0.1
numpy: 1.24.2
scipy: 1.10.1
netCDF4: 1.6.3
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.14.2
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.3.1
distributed: 2023.3.1
matplotlib: 3.7.1
cartopy: 0.21.1
seaborn: None
numbagg: None
fsspec: 2023.3.0
cupy: 11.6.0
pint: 0.20.1
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.6.0
pip: 23.0.1
conda: None
pytest: 7.2.2
mypy: None
IPython: 8.11.0
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7817/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |