issues: 717410970
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
717410970 | MDU6SXNzdWU3MTc0MTA5NzA= | 4496 | Flexible backends - Harmonise zarr chunking with other backends chunking | 35919497 | closed | 0 | 35919497 | 7 | 2020-10-08T14:43:23Z | 2020-12-10T10:51:09Z | 2020-12-10T10:51:09Z | COLLABORATOR | Is your feature request related to a problem? Please describe. In #4309 we proposed to separate the xarray and backend tasks, more or less in this way: - The backend returns a dataset - xarray manages chunks and cache. With the changes in open_dataset to also support zarr (#4187), we introduced a slightly different behavior for zarr chunking with respect to the other backends. Behavior of all the backends except zarr: - if chunks == {} or 'auto': it uses dask and only one chunk per variable - if the user defines chunks for only some of the dimensions, it uses a single chunk along each of the remaining dimensions: ```python
Describe the solution you'd like We could easily extend the zarr behavior to all the backends (which, for now, don't use the field variable.encodings['chunks']): if no chunks are defined in the encoding, we use the dimension size as the default; otherwise, we use the encoded chunks. So for now we are not going to change any external behavior, but if needed the other backends can use this interface. I have some additional notes:
One last question:
- In the new interface of open_dataset there is a new key, imported from open_zarr: |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4496/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
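
The default-chunk rule proposed in the issue body (use the dimension size when the encoding defines no chunks, otherwise use the encoded chunks, with user-supplied chunks taking precedence per dimension) can be illustrated with a minimal sketch. This is not xarray code: the function `resolve_chunks` and its parameters `dim_sizes`, `user_chunks`, and `encoded_chunks` are hypothetical names chosen for illustration, and treating `'auto'` the same as `{}` is an assumption based on the description above.

```python
def resolve_chunks(dim_sizes, user_chunks, encoded_chunks=None):
    """Hypothetical helper (not part of xarray) sketching the proposed rule.

    dim_sizes:      dict mapping dimension name -> dimension length
    user_chunks:    dict mapping dimension name -> chunk size, or 'auto'
    encoded_chunks: optional tuple of chunk sizes, in the same order as
                    dim_sizes, standing in for the chunks stored in a
                    variable's encoding
    """
    # Assumption: 'auto' behaves like an empty mapping, i.e. the user
    # has not requested a chunk size for any dimension.
    if user_chunks == "auto":
        user_chunks = {}

    if encoded_chunks is None:
        # No chunks in the encoding: default to one chunk per dimension.
        defaults = dict(dim_sizes)
    else:
        # Chunks in the encoding: use them as the per-dimension default.
        defaults = dict(zip(dim_sizes, encoded_chunks))

    # User-provided chunks win; other dimensions fall back to the default.
    return {dim: user_chunks.get(dim, defaults[dim]) for dim in dim_sizes}


sizes = {"x": 100, "y": 50}
print(resolve_chunks(sizes, {}))                  # no encoding, no user chunks
print(resolve_chunks(sizes, {"x": 10}))           # user chunks for one dimension
print(resolve_chunks(sizes, "auto", (25, 5)))     # encoded chunks as defaults
print(resolve_chunks(sizes, {"y": 10}, (25, 5)))  # user chunks override encoding
```

With no encoded chunks, the fallback reproduces the single-chunk-per-dimension behavior described for the non-zarr backends, which is why the proposal would not change any external behavior for them.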