id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
608923222,MDU6SXNzdWU2MDg5MjMyMjI=,4014,Add NetCDF3 dtype coercion for unsigned integer types,1700203,closed,0,,,2,2020-04-29T09:52:08Z,2020-05-20T17:08:24Z,2020-05-20T17:08:24Z,CONTRIBUTOR,,,,"`xr.Dataset.to_netcdf` does not seem to support writing data with unsigned integer dtypes (`uint32`, `uint64`, etc.). This seems to be the case for both scipy-based output formats, `NETCDF3_CLASSIC` and `NETCDF3_64BIT`.

It seems like dtype coercions for `int64` and `bool` are done automatically for NetCDF3 in the [`xarray.backends.netcdf3`](https://github.com/pydata/xarray/blob/33a66d6380c26a59923922ee11e8ffcf0b4f379f/xarray/backends/netcdf3.py#L31) module. Shouldn't data of unsigned dtypes then *also* be coerced, i.e. to their signed equivalent?

#### MCVE Code Sample

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.array([1,2,3], dtype='uint64'))

# The following all fail:
da.to_netcdf(""foo"")  # default format: scipy NETCDF3_CLASSIC
da.to_netcdf(""bar"", format='NETCDF3_64BIT')
da.astype('uint32').to_netcdf(""baz"")
da.astype('uint16').to_netcdf(""spam"")

# This works:
da.astype('int64').to_netcdf(""working64"")  # is coerced
da.astype('int32').to_netcdf(""working32"")  # works natively
```

*Importantly,* this is with the netCDF4 Python package **not** being installed; if it were installed, that package would be used for writing rather than scipy's netcdf.

#### Expected Output

NetCDF3 file is written with an appropriately coerced data format, e.g. as done with `int64`. Alternatively, writing data fails for _all_ dtypes that are not natively supported, including `int64` and `bool`.
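
For illustration, a minimal sketch of the kind of coercion requested here. It is not xarray's implementation: the helper name `coerce_unsigned_for_nc3` and the mapping `UNSIGNED_TO_SIGNED` are hypothetical, and mapping `uint64` to `int32` is an assumption based on NetCDF3 having no 64-bit integer type. The safety check mirrors the idea of comparing the cast array with the original.

```python
import numpy as np

# Hypothetical mapping for illustration only; not taken from xarray's source.
# Assumption: NetCDF3 has no 64-bit integer type, so uint64 goes to int32.
UNSIGNED_TO_SIGNED = {'uint16': 'int16', 'uint32': 'int32', 'uint64': 'int32'}


def coerce_unsigned_for_nc3(arr):
    '''Cast an unsigned integer array to a signed dtype, or raise if values change.'''
    arr = np.asarray(arr)
    new_dtype = UNSIGNED_TO_SIGNED.get(str(arr.dtype))
    if new_dtype is None:
        # Not one of the unsigned dtypes handled by this sketch: leave untouched.
        return arr
    # Integer-to-integer astype wraps silently on overflow, hence the check below.
    cast_arr = arr.astype(new_dtype)
    if not (cast_arr == arr).all():
        raise ValueError(
            'could not safely cast array from dtype %s to %s' % (arr.dtype, new_dtype)
        )
    return cast_arr
```

With this sketch, `np.array([1, 2, 3], dtype='uint64')` comes back as `int32` with the same values, while a `uint32` array holding `3_000_000_000` raises a `ValueError` because the value does not fit into `int32`.
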
#### Problem Description

Given that the infrastructure for coercion is [already in place](https://github.com/pydata/xarray/blob/33a66d6380c26a59923922ee11e8ffcf0b4f379f/xarray/backends/netcdf3.py#L37-L57), it seems more consistent to me to apply coercion in all cases where it would lead to `to_netcdf` calls _succeeding_ rather than failing, not only for `int64` and `bool`.

Ideally, coercion would happen towards another unsigned integer type. However, writing `uint32` does not seem to be possible either, so this is not a 64bit/32bit issue. While the [NetCDF Format Specification](https://www.unidata.ucar.edu/software/netcdf/docs/file_format_specifications.html#classic_format_spec) declares `NON_NEG` as its only unsigned integer type, which I presume to be equivalent to `uint32`, writing unsigned integers does not seem to be possible via scipy's NetCDF3 writer. Thus, the only viable coercion, as far as I can see, would be to signed equivalents.

I see no new cast safety implications here, because the existing `coerce_nc3_dtype` function already checks whether the original and cast arrays compare equal.

#### Versions
Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 (default, Dec 30 2019, 19:38:26) [Clang 11.0.0 (clang-1100.0.33.16)]
python-bits: 64
OS: Darwin
OS-release: 19.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: None

xarray: 0.15.1
pandas: 1.0.1
numpy: 1.18.3
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.10.1
distributed: 2.10.0
matplotlib: 3.1.3
cartopy: None
seaborn: None
numbagg: None
setuptools: 45.1.0
pip: 20.0.2
conda: None
pytest: 5.3.5
IPython: 7.12.0
sphinx: 2.4.1
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4014/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue