
issues: 863332419


id: 863332419
node_id: MDU6SXNzdWU4NjMzMzI0MTk=
number: 5197
title: allow you to raise error on missing zarr chunks with open_dataset/open_zarr
user: 4801430
state: closed
locked: 0
assignee:
milestone:
comments: 1
created_at: 2021-04-21T00:00:03Z
updated_at: 2023-11-24T22:14:18Z
closed_at: 2023-11-24T22:14:18Z
author_association: CONTRIBUTOR
active_lock_reason:
draft:
pull_request:
body:

Is your feature request related to a problem? Please describe.

Currently, if a zarr store has a missing chunk, that chunk is treated as all missing, i.e. it is read back as NaN. This is upstream behavior, but zarr may soon gain a kwarg that raises an error in these cases instead (https://github.com/zarr-developers/zarr-python/pull/489). That would be valuable when you need to distinguish intentional NaN data from I/O errors that prevented some chunks from being written. Here's an example of the problem (courtesy of @delgadom):

```python
import xarray as xr
import numpy as np

xr.Dataset({'myarr': (('x', 'y'), [[0., np.nan], [2., 3.]]), 'x': [0, 1], 'y': [0, 1]}).chunk({'x': 1, 'y': 1}).to_zarr('myzarr.zarr');

print('\n\ndata read into xarray\n' + '-'*30)
print(xr.open_zarr('myzarr.zarr').compute().myarr)

print('\n\nstructure of zarr store\n' + '-'*30)
! ls -R myzarr.zarr

print('\n\nremove a chunk\n' + '-'*30 + '\nrm myzarr.zarr/myarr/1.0')
! rm myzarr.zarr/myarr/1.0

print('\n\ndata read into xarray\n' + '-'*30)
print(xr.open_zarr('myzarr.zarr').compute().myarr)
```

This prints:

```
data read into xarray
------------------------------
<xarray.DataArray 'myarr' (x: 2, y: 2)>
array([[ 0., nan],
       [ 2.,  3.]])
Coordinates:
  * x        (x) int64 0 1
  * y        (y) int64 0 1


structure of zarr store
------------------------------
myzarr.zarr:
myarr  x  y

myzarr.zarr/myarr:
0.0  0.1  1.0  1.1

myzarr.zarr/x:
0

myzarr.zarr/y:
0


remove a chunk
------------------------------
rm myzarr.zarr/myarr/1.0


data read into xarray
------------------------------
<xarray.DataArray 'myarr' (x: 2, y: 2)>
array([[ 0., nan],
       [nan,  3.]])
Coordinates:
  * x        (x) int64 0 1
  * y        (y) int64 0 1
```
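
Until such a kwarg exists, one possible stopgap is to compare the chunk keys that should exist against the files actually present before opening the store. The sketch below is only an illustration, not part of xarray or zarr: the helper names (`expected_chunk_keys`, `find_missing_chunks`, `assert_no_missing_chunks`) are hypothetical, and it assumes a local directory store with the `.`-separated chunk keys shown in the listing above (`0.0`, `0.1`, ...).

```python
import itertools
import math
from pathlib import Path


def expected_chunk_keys(shape, chunks, separator="."):
    """Return the set of chunk keys a fully written array would contain.

    Assumes chunk keys are the chunk-grid indices joined by `separator`,
    as in the directory listing above.
    """
    if not shape:
        # Assumption: a zero-dimensional array is stored under the key "0".
        return {"0"}
    # Number of chunks along each dimension, rounding up for partial chunks.
    grid = [math.ceil(size / chunk) for size, chunk in zip(shape, chunks)]
    return {
        separator.join(str(i) for i in index)
        for index in itertools.product(*(range(n) for n in grid))
    }


def find_missing_chunks(array_dir, shape, chunks, separator="."):
    """List chunk keys that are expected but absent from the array directory."""
    present = {p.name for p in Path(array_dir).iterdir() if p.is_file()}
    return sorted(expected_chunk_keys(shape, chunks, separator) - present)


def assert_no_missing_chunks(array_dir, shape, chunks, separator="."):
    """Raise instead of letting absent chunks be read back as NaN."""
    missing = find_missing_chunks(array_dir, shape, chunks, separator)
    if missing:
        raise FileNotFoundError(f"{array_dir} is missing chunks: {missing}")
```

For the store above, `find_missing_chunks('myzarr.zarr/myarr', shape=(2, 2), chunks=(1, 1))` should report `['1.0']` after the `rm` step. One caveat: depending on the zarr version and settings, chunks made up entirely of the fill value may be left unwritten on purpose, so an absent key is not always an I/O error.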

Describe the solution you'd like

I'm not sure where a kwarg to the `__init__` method of a zarr `Array` object would come into play within `open_zarr` or `open_dataset` (once https://github.com/zarr-developers/zarr-python/pull/489 is merged), but I figured I'd ask to see whether anyone could point me in the right direction, and to get ready for when that zarr feature exists. I'm happy to file a PR once I know where to look; I couldn't figure it out from some initial browsing.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5197/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app:
state_reason: not_planned
repo: 13221727
type: issue
