issues: 331415995
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
331415995 | MDU6SXNzdWUzMzE0MTU5OTU= | 2225 | Zarr Backend: check for non-uniform chunks is too strict | 2443309 | closed | 0 | 3 | 2018-06-12T02:36:05Z | 2018-06-13T05:51:36Z | 2018-06-13T05:51:36Z | MEMBER | I think the following block of code is more strict than either dask or zarr requires:

It should be possible to have uneven chunks in the last position of multiple dimensions in a zarr dataset.

Code Sample, a copy-pastable example if possible

```python
In [1]: import xarray as xr

In [2]: import dask.array as dsa

In [3]: da = xr.DataArray(dsa.random.random((8, 7, 11), chunks=(3, 3, 3)), dims=('x', 'y', 't'))

In [4]: da
Out[4]:
<xarray.DataArray 'da.random.random_sample-1aed3ea2f9dd784ec947cb119459fa56' (x: 8, y: 7, t: 11)>
dask.array<shape=(8, 7, 11), dtype=float64, chunksize=(3, 3, 3)>
Dimensions without coordinates: x, y, t

In [5]: da.data.chunks
Out[5]: ((3, 3, 2), (3, 3, 1), (3, 3, 3, 2))

In [6]: da.to_dataset('varname').to_zarr('/Users/jhamman/workdir/test_chunks.zarr')
/Users/jhamman/anaconda/bin/ipython:1: FutureWarning: the order of the arguments on DataArray.to_dataset has changed; you now need to supply
ValueError                                Traceback (most recent call last)
<ipython-input-7-32fa9a7d0276> in <module>()
----> 1 da.to_dataset('varname').to_zarr('/Users/jhamman/workdir/test_chunks.zarr')

~/anaconda/lib/python3.6/site-packages/xarray/core/dataset.py in to_zarr(self, store, mode, synchronizer, group, encoding, compute)
   1185         from ..backends.api import to_zarr
   1186         return to_zarr(self, store=store, mode=mode, synchronizer=synchronizer,
-> 1187                        group=group, encoding=encoding, compute=compute)
   1188
   1189     def unicode(self):

~/anaconda/lib/python3.6/site-packages/xarray/backends/api.py in to_zarr(dataset, store, mode, synchronizer, group, encoding, compute)
    856     # I think zarr stores should always be sync'd immediately
    857     # TODO: figure out how to properly handle unlimited_dims
--> 858     dataset.dump_to_store(store, sync=True, encoding=encoding, compute=compute)
    859
    860     if not compute:

~/anaconda/lib/python3.6/site-packages/xarray/core/dataset.py in dump_to_store(self, store, encoder, sync, encoding, unlimited_dims, compute)
   1073
   1074         store.store(variables, attrs, check_encoding,
-> 1075                     unlimited_dims=unlimited_dims)
   1076         if sync:
   1077             store.sync(compute=compute)

~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in store(self, variables, attributes, args, kwargs)
    341     def store(self, variables, attributes, args, kwargs):
    342         AbstractWritableDataStore.store(self, variables, attributes,
--> 343                                         *args, kwargs)
    344
    345     def sync(self, compute=True):

~/anaconda/lib/python3.6/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, unlimited_dims)
    366         self.set_dimensions(variables, unlimited_dims=unlimited_dims)
    367         self.set_variables(variables, check_encoding_set,
--> 368                            unlimited_dims=unlimited_dims)
    369
    370     def set_attributes(self, attributes):

~/anaconda/lib/python3.6/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set, unlimited_dims)
    403             check = vn in check_encoding_set
    404             target, source = self.prepare_variable(
--> 405                 name, v, check, unlimited_dims=unlimited_dims)
    406
    407             self.writer.add(source, target)

~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in prepare_variable(self, name, variable, check_encoding, unlimited_dims)
    325
    326         encoding = _extract_zarr_variable_encoding(
--> 327             variable, raise_on_invalid=check_encoding)
    328
    329         encoded_attrs = OrderedDict()

~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in _extract_zarr_variable_encoding(variable, raise_on_invalid)
    181
    182     chunks = _determine_zarr_chunks(encoding.get('chunks'), variable.chunks,
--> 183                                     variable.ndim)
    184     encoding['chunks'] = chunks
    185     return encoding

~/anaconda/lib/python3.6/site-packages/xarray/backends/zarr.py in _determine_zarr_chunks(enc_chunks, var_chunks, ndim)
     87                     "Zarr requires uniform chunk sizes excpet for final chunk."
     88                     " Variable %r has incompatible chunks. Consider "
---> 89                     "rechunking using

ValueError: Zarr requires uniform chunk sizes excpet for final chunk. Variable ((3, 3, 2), (3, 3, 1), (3, 3, 3, 2)) has incompatible chunks. Consider rechunking using
```

Problem description

[this should explain why the current behavior is a problem and why the expected output is a better solution.]

Expected Output

IIUC, Zarr allows multiple dims to have uneven chunks, so long as they are all in the last position:

```Python
In [9]: import zarr

In [10]: z = zarr.zeros((8, 7, 11), chunks=(3, 3, 3), dtype='i4')

In [11]: z.chunks
Out[11]: (3, 3, 3)
```

Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2225/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
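
The rule the issue argues for can be sketched in plain Python: along each dimension independently, every chunk must have the same size, except that the final chunk may be smaller. The helper name `uniform_chunks_or_raise` is hypothetical and is not part of xarray, dask, or zarr; this is only an illustration of the expected behaviour under that assumption, not the check that xarray actually ships.

```python
def uniform_chunks_or_raise(var_chunks):
    """Return one chunk size per dimension, or raise ValueError.

    `var_chunks` is a dask-style tuple of tuples, e.g.
    ((3, 3, 2), (3, 3, 1), (3, 3, 3, 2)).
    Hypothetical helper for illustration only.
    """
    sizes = []
    for axis, dim_chunks in enumerate(var_chunks):
        first = dim_chunks[0]
        # All chunks before the final one must match the first chunk.
        if any(c != first for c in dim_chunks[:-1]):
            raise ValueError(
                "dimension %d has non-uniform interior chunks: %r"
                % (axis, dim_chunks)
            )
        # The final chunk may be smaller than the rest, but not larger.
        if dim_chunks[-1] > first:
            raise ValueError(
                "dimension %d has a final chunk larger than the others: %r"
                % (axis, dim_chunks)
            )
        sizes.append(first)
    return tuple(sizes)


# The chunks from the example in the issue: the last chunk is short in
# all three dimensions, which this check accepts.
print(uniform_chunks_or_raise(((3, 3, 2), (3, 3, 1), (3, 3, 3, 2))))  # (3, 3, 3)
```

The key design point is that the check runs per dimension, so a short final chunk in one dimension does not prevent a short final chunk in another.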