issues: 1956383344
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1956383344 | I_kwDOAMm_X850nApw | 8358 | Writing to zarr archive fails on resampled dataset | 40218891 | closed | 0 | 1 | 2023-10-23T05:30:36Z | 2023-10-23T15:46:20Z | 2023-10-23T15:46:19Z | NONE | What happened? I am not sure where this belongs: xarray, dask or zarr. When a dataset is resampled to a semi-monthly frequency, the method What did you expect to happen? I think this should work without having to rechunk the result before writing to the archive. Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
```Python
ValueError                                Traceback (most recent call last)
Cell In[63], line 4
      2 ds = xr.Dataset({"foo": ("time", np.arange(1, 366)), "time": time}).chunk(time=5)
      3 dsr = ds.resample(time="SM").mean()
----> 4 dsr.to_zarr('/tmp/foo', mode='w')
      5 #dsr.isel(time=slice(0, -1)).to_zarr('/tmp/foo', mode='w')

File ~/mambaforge/envs/icec/lib/python3.11/site-packages/xarray/core/dataset.py:2490, in Dataset.to_zarr(self, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, write_empty_chunks, chunkmanager_store_kwargs)
   2358 """Write dataset contents to a zarr group.
   2359
   2360 Zarr chunks are determined in the following way:
   (...)
   2486     The I/O user guide, with more details and examples.
   2487 """
   2488 from xarray.backends.api import to_zarr
-> 2490 return to_zarr(  # type: ignore[call-overload,misc]
   2491     self,
   2492     store=store,
   2493     chunk_store=chunk_store,
   2494     storage_options=storage_options,
   2495     mode=mode,
   2496     synchronizer=synchronizer,
   2497     group=group,
   2498     encoding=encoding,
   2499     compute=compute,
   2500     consolidated=consolidated,
   2501     append_dim=append_dim,
   2502     region=region,
   2503     safe_chunks=safe_chunks,
   2504     zarr_version=zarr_version,
   2505     write_empty_chunks=write_empty_chunks,
   2506     chunkmanager_store_kwargs=chunkmanager_store_kwargs,
   2507 )

File ~/mambaforge/envs/icec/lib/python3.11/site-packages/xarray/backends/api.py:1708, in to_zarr(dataset, store, chunk_store, mode, synchronizer, group, encoding, compute, consolidated, append_dim, region, safe_chunks, storage_options, zarr_version, write_empty_chunks, chunkmanager_store_kwargs)
   1706 writer = ArrayWriter()
   1707 # TODO: figure out how to properly handle unlimited_dims
-> 1708 dump_to_store(dataset, zstore, writer, encoding=encoding)
   1709 writes = writer.sync(
   1710     compute=compute, chunkmanager_store_kwargs=chunkmanager_store_kwargs
   1711 )
   1713 if compute:

File ~/mambaforge/envs/icec/lib/python3.11/site-packages/xarray/backends/api.py:1308, in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
   1305 if encoder:
   1306     variables, attrs = encoder(variables, attrs)
-> 1308 store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)

File ~/mambaforge/envs/icec/lib/python3.11/site-packages/xarray/backends/zarr.py:631, in ZarrStore.store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    628 self.set_attributes(attributes)
    629 self.set_dimensions(variables_encoded, unlimited_dims=unlimited_dims)
--> 631 self.set_variables(
    632     variables_encoded, check_encoding_set, writer, unlimited_dims=unlimited_dims
    633 )
    634 if self._consolidate_on_close:
    635     zarr.consolidate_metadata(self.zarr_group.store)

File ~/mambaforge/envs/icec/lib/python3.11/site-packages/xarray/backends/zarr.py:687, in ZarrStore.set_variables(self, variables, check_encoding_set, writer, unlimited_dims)
    684     zarr_array = self.zarr_group[name]
    685 else:
    686     # new variable
--> 687     encoding = extract_zarr_variable_encoding(
    688         v, raise_on_invalid=check, name=vn, safe_chunks=self._safe_chunks
    689     )
    690     encoded_attrs = {}
    691     # the magic for storing the hidden dimension data

File ~/mambaforge/envs/icec/lib/python3.11/site-packages/xarray/backends/zarr.py:281, in extract_zarr_variable_encoding(variable, raise_on_invalid, name, safe_chunks)
    278     if k not in valid_encodings:
    279         del encoding[k]
--> 281 chunks = _determine_zarr_chunks(
    282     encoding.get("chunks"), variable.chunks, variable.ndim, name, safe_chunks
    283 )
    284 encoding["chunks"] = chunks
    285 return encoding

File ~/mambaforge/envs/icec/lib/python3.11/site-packages/xarray/backends/zarr.py:138, in _determine_zarr_chunks(enc_chunks, var_chunks, ndim, name, safe_chunks)
    132 raise ValueError(
    133     "Zarr requires uniform chunk sizes except for final chunk. "
    134     f"Variable named {name!r} has incompatible dask chunks: {var_chunks!r}. "
    135     "Consider rechunking using

ValueError: Final chunk of Zarr array must be the same size or smaller than the first. Variable named 'foo' has incompatible Dask chunks ((1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2),).Consider either rechunking using
```
Anything else we need to know? I can also achieve what I want without having to rechunk with
Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.11.6 | packaged by conda-forge | (main, Oct 3 2023, 10:40:35) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 6.5.5-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.2
libnetcdf: 4.9.2
xarray: 2023.10.1
pandas: 2.1.1
numpy: 1.24.4
scipy: 1.11.3
netCDF4: 1.6.4
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.16.1
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.10.0
distributed: 2023.10.0
matplotlib: 3.8.0
cartopy: 0.22.0
seaborn: None
numbagg: 0.5.1
fsspec: 2023.10.0
cupy: None
pint: None
sparse: 0.14.0
flox: 0.8.1
numpy_groupies: 0.10.2
setuptools: 68.2.2
pip: 23.3.1
conda: None
pytest: None
mypy: None
IPython: 8.16.1
sphinx: None
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8358/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
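
The two error messages quoted in the traceback describe the chunk layout that the write step accepts along a dimension: every chunk except the last must have the same size, and the last chunk must not be larger than the first. The sketch below restates those two rules in plain Python so the failing case is easy to see. The function name `check_dim_chunks` and its return strings are hypothetical and exist only for illustration; this is a simplified reading of the messages, not xarray's actual `_determine_zarr_chunks` implementation.

```python
def check_dim_chunks(dim_chunks):
    """Classify one dimension's dask chunk sizes against the two rules
    quoted in the traceback (hypothetical helper, not an xarray API)."""
    if not dim_chunks:
        return "ok"
    body, last = dim_chunks[:-1], dim_chunks[-1]
    # Rule 1: all chunks except the final one must be the same size.
    if len(set(body)) > 1:
        return "non-uniform"
    # Rule 2: the final chunk may be smaller than the first, never larger.
    if body and last > body[0]:
        return "final chunk too large"
    return "ok"


# Same pattern as the chunks in the error: size-1 chunks, then a final chunk of 2.
print(check_dim_chunks((1, 1, 1, 2)))
# A regular layout with a short final chunk passes both rules.
print(check_dim_chunks((5, 5, 5, 3)))
```

Under this reading, the resampled dataset fails the second rule because its last bin ends up in a chunk of size 2 after a run of size-1 chunks. Rechunking the result before the write, for example with `Dataset.chunk`, is the route the error message points to.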