issues: 1164454058
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1164454058 | I_kwDOAMm_X85FaCiq | 6345 | `to_zarr` raises `ValueError: Invalid dtype` with `mode='a'` (but not with `mode='w'`) | 62192187 | closed | 0 | 6 | 2022-03-09T21:21:26Z | 2022-05-11T17:35:10Z | 2022-05-11T17:35:10Z | CONTRIBUTOR | What happened?
A dataset in which a data variable has
```python
import xarray as xr
import numpy as np

data = np.zeros((2, 3), dtype='|S35')
ds = xr.DataArray(data, name='foo').to_dataset()
ds.to_zarr('test.zarr', mode='w')
```
Full Traceback
```python-traceback
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 ds.to_zarr('test.zarr', mode='a')

File ~/miniconda3/envs/pangeo-forge-recipes/lib/python3.9/site-packages/xarray/core/dataset.py:2036, in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options)
   2033 if encoding is None:
   2034     encoding = {}
-> 2036 return to_zarr(
   2037     self,
   2038     store=store,
   2039     chunk_store=chunk_store,
   2040     storage_options=storage_options,
   2041     mode=mode,
   2042     synchronizer=synchronizer,
   2043     group=group,
   2044     encoding=encoding,
   2045     compute=compute,
   2046     consolidated=consolidated,
   2047     append_dim=append_dim,
   2048     region=region,
   2049     safe_chunks=safe_chunks,
   2050 )

File ~/miniconda3/envs/pangeo-forge-recipes/lib/python3.9/site-packages/xarray/backends/api.py:1406, in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options)
   1391 zstore = backends.ZarrStore.open_group(
   1392     store=mapper,
   1393     mode=mode,
   (...)
   1402     stacklevel=4,  # for Dataset.to_zarr()
   1403 )
   1405 if mode in ["a", "r+"]:
-> 1406     _validate_datatypes_for_zarr_append(dataset)
   1407 if append_dim is not None:
   1408     existing_dims = zstore.get_dimensions()

File ~/miniconda3/envs/pangeo-forge-recipes/lib/python3.9/site-packages/xarray/backends/api.py:1301, in _validate_datatypes_for_zarr_append(dataset)
   1292     raise ValueError(
   1293         "Invalid dtype for data variable: {} "
   1294         "dtype must be a subtype of number, "
   (...)
   1297         "object".format(var)
   1298     )
   1300 for k in dataset.data_vars.values():
-> 1301     check_dtype(k)

File ~/miniconda3/envs/pangeo-forge-recipes/lib/python3.9/site-packages/xarray/backends/api.py:1292, in _validate_datatypes_for_zarr_append.<locals>.check_dtype(var)
   1283 def check_dtype(var):
   1284     if (
   1285         not np.issubdtype(var.dtype, np.number)
   1286         and not np.issubdtype(var.dtype, np.datetime64)
   (...)
   1290     ):
   1291         # and not re.match('^bytes[1-9]+$', var.dtype.name)):
-> 1292         raise ValueError(
   1293             "Invalid dtype for data variable: {} "
   1294             "dtype must be a subtype of number, "
   1295             "datetime, bool, a fixed sized string, "
   1296             "a fixed size unicode string or an "
   1297             "object".format(var)
   1298         )

ValueError: Invalid dtype for data variable: <xarray.DataArray 'foo' (dim_0: 2, dim_1: 3)>
array([[b'', b'', b''],
       [b'', b'', b'']], dtype='|S35')
Dimensions without coordinates: dim_0, dim_1 dtype must be a subtype of number, datetime, bool, a fixed sized string, a fixed size unicode string or an object
```
What did you expect to happen?
I would expect the behavior of

Minimal Complete Verifiable Example
See What Happened? section above

Relevant log output
See What Happened? section above

Anything else we need to know?
No response

Environment
```
INSTALLED VERSIONS
commit: None
python: 3.9.10 | packaged by conda-forge | (main, Feb 1 2022, 21:28:27) [Clang 11.1.0 ]
python-bits: 64
OS: Darwin
OS-release: 21.0.1
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.3.0
pandas: 1.4.1
numpy: 1.22.2
scipy: 1.8.0
netCDF4: 1.5.8
pydap: installed
h5netcdf: 999
h5py: 3.6.0
Nio: None
zarr: 2.11.0
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: 0.9.8.5
iris: None
bottleneck: None
dask: 2022.02.1
distributed: 2022.2.1
matplotlib: 3.5.1
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.02.0
cupy: None
pint: None
sparse: None
setuptools: 59.8.0
pip: 22.0.4
conda: None
pytest: 6.2.5
IPython: 8.1.1
sphinx: None
```
cc @rabernat |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/6345/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
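
The error message in the traceback lists the dtype categories that an append is supposed to accept: number, datetime, bool, fixed-size string, fixed-size unicode string, or object. As a rough illustration of what a check covering all of those categories might look like, here is a minimal NumPy-only sketch. The helper name `is_appendable_dtype` is hypothetical and is not part of xarray; the conditions elided as `(...)` in the traceback are not reproduced here, so this should not be read as xarray's actual implementation or as the fix that closed the issue.

```python
import numpy as np


def is_appendable_dtype(dtype):
    """Hypothetical helper (not an xarray API).

    Returns True when `dtype` falls into one of the categories named in
    the error message: number, datetime, bool, fixed-size bytes string,
    fixed-size unicode string, or object.
    """
    dtype = np.dtype(dtype)
    return bool(
        np.issubdtype(dtype, np.number)
        or np.issubdtype(dtype, np.datetime64)
        or np.issubdtype(dtype, np.bool_)
        # NumPy kind codes: "S" = bytes string, "U" = unicode string,
        # "O" = Python object. The width (e.g. 35 in "|S35") is ignored.
        or dtype.kind in ("S", "U", "O")
    )


# The array from the report uses a 35-byte fixed-size bytes dtype.
data = np.zeros((2, 3), dtype="|S35")
report_dtype_ok = is_appendable_dtype(data.dtype)
```

Under this sketch, the `|S35` array from the report counts as a fixed-size string regardless of its width, which is the behavior the error message itself describes.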