issues: 686608969
id: 686608969
node_id: MDU6SXNzdWU2ODY2MDg5Njk=
number: 4380
title: Error when rechunking from Zarr store
user: 6130352
state: closed
locked: 0
comments: 5
created_at: 2020-08-26T20:53:05Z
updated_at: 2023-11-12T05:50:29Z
closed_at: 2023-11-12T05:50:29Z
author_association: NONE

body:

My assumption for this is that it should be possible to:
However I see this behavior instead:

```python
import xarray as xr
import dask.array as da

ds = xr.Dataset(dict(
    x=xr.DataArray(da.random.random(size=100, chunks=10), dims='d1')
))

# Write the store
ds.to_zarr('/tmp/ds1.zarr', mode='w')

# Read it out, rechunk it, and attempt to write it again
xr.open_zarr('/tmp/ds1.zarr').chunk(chunks=dict(d1=20)).to_zarr('/tmp/ds2.zarr', mode='w')
```

```
ValueError: Final chunk of Zarr array must be the same size or smaller than the first.
Specified Zarr chunk encoding['chunks']=(10,), for variable named 'x' but (20, 20, 20, 20, 20)
in the variable's Dask chunks ((20, 20, 20, 20, 20),) is incompatible with this encoding.
Consider either rechunking using `chunk()` or instead deleting or modifying `encoding['chunks']`.
```

Full trace:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-122-e185759d81c5> in <module>
----> 1 xr.open_zarr('/tmp/ds1.zarr').chunk(chunks=dict(d1=20)).to_zarr('/tmp/ds2.zarr', mode='w')

/opt/conda/lib/python3.7/site-packages/xarray/core/dataset.py in to_zarr(self, store, mode, synchronizer, group, encoding, compute, consolidated, append_dim)
   1656             compute=compute,
   1657             consolidated=consolidated,
-> 1658             append_dim=append_dim,
   1659         )
   1660

/opt/conda/lib/python3.7/site-packages/xarray/backends/api.py in to_zarr(dataset, store, mode, synchronizer, group, encoding, compute, consolidated, append_dim)
   1351     writer = ArrayWriter()
   1352     # TODO: figure out how to properly handle unlimited_dims
-> 1353     dump_to_store(dataset, zstore, writer, encoding=encoding)
   1354     writes = writer.sync(compute=compute)
   1355

/opt/conda/lib/python3.7/site-packages/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1126         variables, attrs = encoder(variables, attrs)
   1127
-> 1128     store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
   1129
   1130

/opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    411         self.set_dimensions(variables_encoded, unlimited_dims=unlimited_dims)
    412         self.set_variables(
--> 413             variables_encoded, check_encoding_set, writer, unlimited_dims=unlimited_dims
    414         )
    415

/opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in set_variables(self, variables, check_encoding_set, writer, unlimited_dims)
    466                 # new variable
    467                 encoding = extract_zarr_variable_encoding(
--> 468                     v, raise_on_invalid=check, name=vn
    469                 )
    470                 encoded_attrs = {}

/opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in extract_zarr_variable_encoding(variable, raise_on_invalid, name)
    214
    215     chunks = _determine_zarr_chunks(
--> 216         encoding.get("chunks"), variable.chunks, variable.ndim, name
    217     )
    218     encoding["chunks"] = chunks

/opt/conda/lib/python3.7/site-packages/xarray/backends/zarr.py in _determine_zarr_chunks(enc_chunks, var_chunks, ndim, name)
    154             if dchunks[-1] > zchunk:
    155                 raise ValueError(
--> 156                     "Final chunk of Zarr array must be the same size or "
    157                     "smaller than the first. "
    158                     f"Specified Zarr chunk encoding['chunks']={enc_chunks_tuple}, "

ValueError: Final chunk of Zarr array must be the same size or smaller than the first. Specified Zarr chunk encoding['chunks']=(10,), for variable named 'x' but (20, 20, 20, 20, 20) in the variable's Dask chunks ((20, 20, 20, 20, 20),) is incompatible with this encoding. Consider either rechunking using `chunk()` or instead deleting or modifying `encoding['chunks']`.
```

Overwriting chunks on
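For readers puzzling over why (10,) and (20, 20, 20, 20, 20) conflict, here is a simplified, hypothetical sketch of the validation quoted in the traceback above (the real `_determine_zarr_chunks` in `xarray/backends/zarr.py` handles more cases); the function name `check_zarr_chunks` is invented for illustration:

```python
# Simplified sketch of the chunk-compatibility rule suggested by the
# traceback: each interior dask chunk must be an exact multiple of the
# zarr chunk size, and the final dask chunk must be no larger than it.
def check_zarr_chunks(enc_chunks, var_chunks):
    for zchunk, dchunks in zip(enc_chunks, var_chunks):
        for dchunk in dchunks[:-1]:
            if dchunk % zchunk != 0:
                raise ValueError(
                    "Interior dask chunks must be multiples of the zarr chunk."
                )
        if dchunks[-1] > zchunk:
            raise ValueError(
                "Final chunk of Zarr array must be the same size or "
                "smaller than the first."
            )

# The reported case: encoding says 10, dask chunks are all 20,
# so the final chunk (20) exceeds the zarr chunk (10) and it raises.
try:
    check_zarr_chunks((10,), ((20, 20, 20, 20, 20),))
except ValueError as e:
    print(e)
```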
Does

Environment:

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-42-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: None

xarray: 0.16.0
pandas: 1.0.5
numpy: 1.19.0
scipy: 1.5.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.21.0
distributed: 2.21.0
matplotlib: 3.3.0
cartopy: None
seaborn: 0.10.1
numbagg: None
pint: None
setuptools: 47.3.1.post20200616
pip: 20.1.1
conda: 4.8.2
pytest: 5.4.3
IPython: 7.15.0
sphinx: 3.2.1
```
reactions: { "url": "https://api.github.com/repos/pydata/xarray/issues/4380/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
state_reason: completed
repo: 13221727
type: issue