
Using netcdf3 with datetime64[ns] quickly overflows int32

Issue #8641 · pydata/xarray · state: closed · 5 comments
Opened 2024-01-22 by a CONTRIBUTOR · closed 2024-02-05

What happened?

While trying to store datetimes into netCDF, I ran into int32 overflow when the datetimes include nanosecond precision.

What did you expect to happen?

I was first surprised that my data did not store successfully, but after investigating I came to understand that the netCDF3 format is quite limited. It would probably make sense to emit a warning when storing datetime64 data to netCDF3.
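To see how quickly the overflow happens: netCDF3 has no 64-bit integer type, so integer time offsets must fit in int32. With the two timestamps from the example below, the finest unit that represents both exactly is microseconds, and the offset between them already exceeds the int32 range. A numpy sketch of the arithmetic (that the CF encoder picks microseconds here is an assumption inferred from the data, not verified against xarray internals):

```python
import numpy as np

# The two coordinate values from the example: midnight, and 1 h + 1 µs later.
t0 = np.datetime64("2000-01-01T00:00:00", "ns")
t1 = np.datetime64("2000-01-01T01:00:00.000001", "ns")

# Expressed in microseconds (the finest unit needed to be exact), the
# offset from t0 already exceeds what int32 can hold.
offset_us = (t1 - t0).astype("timedelta64[us]").astype("int64")
print(offset_us)               # 3600000001
print(np.iinfo(np.int32).max)  # 2147483647
```

So any pair of datetimes more than ~36 minutes apart with a sub-second component is enough to trigger the error.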

Minimal Complete Verifiable Example

```python
import datetime

import numpy as np
import xarray as xr

dataset = xr.combine_by_coords(
    [
        xr.Dataset(
            {"value": (["step"], [0.0])},
            coords={
                "step": np.array(
                    [datetime.datetime(2000, 1, 1, 0, 0)], dtype="datetime64[ns]"
                ),
            },
        ),
        xr.Dataset(
            {"value": (["step"], [0.0])},
            coords={
                "step": np.array(
                    [datetime.datetime(2000, 1, 1, 1, 0, 0, 1)],
                    dtype="datetime64[ns]",
                ),
            },
        ),
    ]
)
dataset.to_netcdf("./out.nc", engine="scipy")
```
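A possible workaround (a sketch, not verified against this xarray version) is to request an on-disk dtype that netCDF3 does support via xarray's standard `encoding` argument, e.g. `dataset.to_netcdf("./out.nc", engine="scipy", encoding={"step": {"dtype": "float64", "units": "microseconds since 2000-01-01"}})`, since netCDF3 doubles can hold these offsets exactly. The numeric claim behind that is checkable without xarray: float64 represents every integer up to 2^53 exactly, far above the offsets involved here:

```python
import numpy as np

offset_us = 3_600_000_001                  # 1 hour + 1 µs, in microseconds
assert offset_us > np.iinfo(np.int32).max  # overflows netCDF3's int32...
assert np.float64(offset_us) == offset_us  # ...but is exact as a float64
print(2**53)  # float64 represents all integers up to this bound exactly
```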

MVCE confirmation

  • [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [x] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

```
File ~/.local/share/virtualenvs/ert-0_7in3Ct/lib/python3.11/site-packages/xarray/core/dataset.py:2303, in Dataset.to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute, invalid_netcdf)
   2300     encoding = {}
   2301 from xarray.backends.api import to_netcdf
-> 2303 return to_netcdf(  # type: ignore  # mypy cannot resolve the overloads:(
   2304     self,
   2305     path,
   2306     mode=mode,
   2307     format=format,
   2308     group=group,
   2309     engine=engine,
   2310     encoding=encoding,
   2311     unlimited_dims=unlimited_dims,
   2312     compute=compute,
   2313     multifile=False,
   2314     invalid_netcdf=invalid_netcdf,
   2315 )

File ~/.local/share/virtualenvs/ert-0_7in3Ct/lib/python3.11/site-packages/xarray/backends/api.py:1315, in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile, invalid_netcdf)
   1310 # TODO: figure out how to refactor this logic (here and in save_mfdataset)
   1311 # to avoid this mess of conditionals
   1312 try:
   1313     # TODO: allow this work (setting up the file for writing array data)
   1314     # to be parallelized with dask
-> 1315     dump_to_store(
   1316         dataset, store, writer, encoding=encoding, unlimited_dims=unlimited_dims
   1317     )
   1318     if autoclose:
   1319         store.close()

File ~/.local/share/virtualenvs/ert-0_7in3Ct/lib/python3.11/site-packages/xarray/backends/api.py:1362, in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1359 if encoder:
   1360     variables, attrs = encoder(variables, attrs)
-> 1362 store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)

File ~/.local/share/virtualenvs/ert-0_7in3Ct/lib/python3.11/site-packages/xarray/backends/common.py:352, in AbstractWritableDataStore.store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    349 if writer is None:
    350     writer = ArrayWriter()
--> 352 variables, attributes = self.encode(variables, attributes)
    354 self.set_attributes(attributes)
    355 self.set_dimensions(variables, unlimited_dims=unlimited_dims)

File ~/.local/share/virtualenvs/ert-0_7in3Ct/lib/python3.11/site-packages/xarray/backends/common.py:442, in WritableCFDataStore.encode(self, variables, attributes)
    438 def encode(self, variables, attributes):
    439     # All NetCDF files get CF encoded by default, without this attempting
    440     # to write times, for example, would fail.
    441     variables, attributes = cf_encoder(variables, attributes)
--> 442     variables = {k: self.encode_variable(v) for k, v in variables.items()}
    443     attributes = {k: self.encode_attribute(v) for k, v in attributes.items()}
    444     return variables, attributes

File ~/.local/share/virtualenvs/ert-0_7in3Ct/lib/python3.11/site-packages/xarray/backends/common.py:442, in <dictcomp>(.0)
    438 def encode(self, variables, attributes):
    439     # All NetCDF files get CF encoded by default, without this attempting
    440     # to write times, for example, would fail.
    441     variables, attributes = cf_encoder(variables, attributes)
--> 442     variables = {k: self.encode_variable(v) for k, v in variables.items()}
    443     attributes = {k: self.encode_attribute(v) for k, v in attributes.items()}
    444     return variables, attributes

File ~/.local/share/virtualenvs/ert-0_7in3Ct/lib/python3.11/site-packages/xarray/backends/scipy_.py:213, in ScipyDataStore.encode_variable(self, variable)
    212 def encode_variable(self, variable):
--> 213     variable = encode_nc3_variable(variable)
    214     return variable

File ~/.local/share/virtualenvs/ert-0_7in3Ct/lib/python3.11/site-packages/xarray/backends/netcdf3.py:114, in encode_nc3_variable(var)
    112 var = coder.encode(var)
    113 data = _maybe_prepare_times(var)
--> 114 data = coerce_nc3_dtype(data)
    115 attrs = encode_nc3_attrs(var.attrs)
    116 return Variable(var.dims, data, attrs, var.encoding)

File ~/.local/share/virtualenvs/ert-0_7in3Ct/lib/python3.11/site-packages/xarray/backends/netcdf3.py:68, in coerce_nc3_dtype(arr)
    66 cast_arr = arr.astype(new_dtype)
    67 if not (cast_arr == arr).all():
---> 68     raise ValueError(
    69         f"could not safely cast array from dtype {dtype} to {new_dtype}"
    70     )
    71 arr = cast_arr
    72 return arr

ValueError: could not safely cast array from dtype int64 to int32
```
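The check that raises is visible in the last frame above: `netcdf3.py` casts the int64 data to int32 and rejects the write if the round trip changes any value. A self-contained sketch of that logic (the name `coerce_nc3_dtype` comes from the traceback; this is a simplified reimplementation, not the actual xarray source):

```python
import numpy as np

def coerce_nc3_dtype(arr):
    # Simplified version of the check in xarray/backends/netcdf3.py:
    # netCDF3 has no 64-bit integers, so int64 must be narrowed to int32,
    # but only if every value survives the cast unchanged.
    if arr.dtype == np.int64:
        cast_arr = arr.astype(np.int32)
        if not (cast_arr == arr).all():
            raise ValueError(
                f"could not safely cast array from dtype {arr.dtype} to {cast_arr.dtype}"
            )
        return cast_arr
    return arr

# Offsets in seconds fit comfortably and are narrowed to int32.
small = coerce_nc3_dtype(np.array([0, 3600], dtype="int64"))

# The microsecond offset from the example does not fit, so the write fails.
try:
    coerce_nc3_dtype(np.array([0, 3_600_000_001], dtype="int64"))
except ValueError as e:
    print(e)  # could not safely cast array from dtype int64 to int32
```

This is why the error only appears once the encoded offsets leave the int32 range: smaller datasets with the same dtypes write fine, which makes an up-front warning for datetime64[ns] + netCDF3 attractive.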

Anything else we need to know?

No response

Environment

```
/home/eivind/.local/share/virtualenvs/ert-0_7in3Ct/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS
------------------
commit: None
python: 3.11.4 (main, Dec 7 2023, 15:43:41) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 6.2.0-39-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.3-development

xarray: 2023.10.1
pandas: 2.1.1
numpy: 1.26.1
scipy: 1.11.3
netCDF4: 1.6.5
pydap: None
h5netcdf: None
h5py: 3.10.0
Nio: None
zarr: None
cftime: 1.6.3
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.8.0
cartopy: None
seaborn: 0.13.1
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 63.4.3
pip: 23.3.1
conda: None
pytest: 7.4.4
mypy: 1.6.1
IPython: 8.17.2
sphinx: 7.1.2
```
