issues: 651945063
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
651945063 | MDU6SXNzdWU2NTE5NDUwNjM= | 4205 | Chunking causes unrelated non-dimension coordinate to become a dask array | 1053153 | open | 0 | 2 | 2020-07-07T02:35:15Z | 2022-07-12T15:25:05Z | CONTRIBUTOR | What happened: Rechunking along an independent dimension causes unrelated non-dimension coordinates to become dask arrays. The dimension coordinates do not seem affected. I can stick in a synchronous compute on the coordinate to recover, but wanted to be sure this was the expected behavior.

What you expected to happen: Chunking along an unrelated dimension should not affect unrelated non-dimension coordinates.

Minimal Complete Verifiable Example:

```python
import xarray as xr
import dask.array as da

def print_coords(a, title):
    print()
    print(title)
    for dim in ['x', 'y', 'b']:
        if dim in a.dims or dim in a.coords:
            print('dim:', dim, 'type:', type(a.coords[dim].data))

arr = xr.DataArray(da.zeros((20, 20), chunks=10), dims=('x', 'y'), coords={'b': ('y', range(100,120)), 'x': range(20), 'y': range(20)})
print_coords(arr, 'Original')

# The following line rechunks independently of b or y.
# Removing this line allows the code to succeed.
arr = arr.chunk({'x': 5})
print_coords(arr, 'After chunking')

arr = arr.sel(y=2)
print_coords(arr, 'After selection')

print()
print('Scalar values:')
print('y=', arr.coords['y'].item())
print('b=', arr.coords['b'].item())  # Sad Panda
```

```
After chunking
dim: x type: <class 'numpy.ndarray'>
dim: y type: <class 'numpy.ndarray'>
dim: b type: <class 'dask.array.core.Array'>

After selection
dim: x type: <class 'numpy.ndarray'>
dim: y type: <class 'numpy.ndarray'>
dim: b type: <class 'dask.array.core.Array'>

Scalar values:
y= 2
<stack trace elided>
NotImplementedError: 'item' is not yet a valid method on dask arrays
```

Environment:

Output of `xr.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jun 1 2020, 18:57:50) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.19.112+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: None

xarray: 0.15.1
pandas: 1.0.5
numpy: 1.18.5
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.19.0
distributed: 2.19.0
matplotlib: 3.2.2
cartopy: None
seaborn: None
numbagg: None
setuptools: 49.1.0.post20200704
pip: 20.1.1
conda: 4.8.3
pytest: 5.4.3
IPython: 7.16.1
sphinx: None |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4205/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
13221727 | issue |
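
The report mentions that a synchronous compute on the coordinate recovers the scalar value. A minimal sketch of that workaround follows, assuming xarray and dask are installed and reusing the array layout from the example in the issue body. The names `b_in_memory` and `b_value` are illustrative and do not come from the report. This is one way to sidestep the error, not necessarily the fix the maintainers would recommend.

```python
import dask.array as da
import numpy as np
import xarray as xr

# Same layout as the array in the report: 'b' is a non-dimension
# coordinate along 'y'; 'x' and 'y' are dimension coordinates.
arr = xr.DataArray(
    da.zeros((20, 20), chunks=10),
    dims=("x", "y"),
    coords={
        "b": ("y", np.arange(100, 120)),
        "x": np.arange(20),
        "y": np.arange(20),
    },
)

# Rechunk along 'x' only, then select one position along 'y'.
arr = arr.chunk({"x": 5}).sel(y=2)

# compute() returns an in-memory copy of just this coordinate.
# If 'b' is already NumPy-backed, this is effectively a no-op copy.
b_in_memory = arr.coords["b"].compute()
b_value = b_in_memory.item()

# Optionally write the in-memory coordinate back so later code
# does not have to repeat the compute step.
arr = arr.assign_coords(b=b_in_memory)

print("b =", b_value)
print("type:", type(arr.coords["b"].data))
```

Calling `compute()` on the coordinate alone avoids materializing the main data array, which stays lazy and chunked.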