issues: 1935984485
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1935984485 | I_kwDOAMm_X85zZMdl | 8290 | Potential performance optimization for Zarr backend | 1197350 | closed | 0 |  |  | 0 | 2023-10-10T18:41:19Z | 2023-10-13T16:38:58Z | 2023-10-13T16:38:58Z | MEMBER |  |  |  | What is your issue? We have identified an inefficiency in the way the When accessing the array, the parent group of the array is read and used to open a new Zarr array. This is a relatively metadata-intensive operation for Zarr. It requires reading both the group metadata and the array metadata. Because of how this wrapper works, these operations currently happen every time data is read from the array. If we have a dask array wrapping the zarr array with thousands of chunks, these metadata operations will happen within every single task. For high latency stores, this is really bad. Instead, we should just reference the | { "url": "https://api.github.com/repos/pydata/xarray/issues/8290/reactions", "total_count": 6, "+1": 4, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 2, "eyes": 0 } |  | completed | 13221727 | issue |
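
The access pattern described in the issue body can be illustrated with a small, self-contained sketch. It does not use Zarr or xarray: `CountingStore`, `OpenedArray`, `open_array_via_group`, `ReopeningWrapper`, and `CachingWrapper` are all hypothetical names invented for this illustration. The caching variant is only one plausible reading of the remedy, since the issue text is cut off at that point.

```python
class CountingStore:
    """Hypothetical stand-in for a high-latency store; counts metadata reads."""

    def __init__(self, n_chunks=8):
        self.metadata_reads = 0
        self._chunks = {("group/array", i): [i] * 4 for i in range(n_chunks)}

    def read_group_metadata(self, group_path):
        self.metadata_reads += 1
        return {"path": group_path}

    def read_array_metadata(self, array_path):
        self.metadata_reads += 1
        return {"path": array_path}

    def read_chunk(self, array_path, index):
        return self._chunks[(array_path, index)]


class OpenedArray:
    """Hypothetical array handle; opening it costs one array-metadata read."""

    def __init__(self, store, path):
        store.read_array_metadata(path)
        self._store = store
        self._path = path

    def __getitem__(self, index):
        return self._store.read_chunk(self._path, index)


def open_array_via_group(store, group_path, name):
    # Opening through the parent group reads group metadata, then array metadata.
    store.read_group_metadata(group_path)
    return OpenedArray(store, f"{group_path}/{name}")


class ReopeningWrapper:
    """Re-opens the array through its group on every read (the costly pattern)."""

    def __init__(self, store, group_path, name):
        self._store, self._group_path, self._name = store, group_path, name

    def __getitem__(self, index):
        array = open_array_via_group(self._store, self._group_path, self._name)
        return array[index]


class CachingWrapper:
    """Opens the array once and keeps a reference to it (assumed remedy)."""

    def __init__(self, store, group_path, name):
        self._array = open_array_via_group(store, group_path, name)

    def __getitem__(self, index):
        return self._array[index]


def count_metadata_reads(wrapper_cls, n_chunks=8):
    """Read every chunk once, as one task per chunk would, and count metadata reads."""
    store = CountingStore(n_chunks)
    wrapper = wrapper_cls(store, "group", "array")
    for i in range(n_chunks):
        wrapper[i]
    return store.metadata_reads


if __name__ == "__main__":
    print("re-opening wrapper:", count_metadata_reads(ReopeningWrapper))
    print("caching wrapper:", count_metadata_reads(CachingWrapper))
```

In this toy model, reading 8 chunks costs 16 metadata reads with the re-opening wrapper (two per chunk) and 2 with the caching wrapper, so the gap grows linearly with the number of chunks.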