issues: 868352536
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
868352536 | MDU6SXNzdWU4NjgzNTI1MzY= | 5219 | Zarr encoding attributes persist after slicing data, raising error on `to_zarr` | 4801430 | open | 0 | 9 | 2021-04-27T01:34:52Z | 2022-12-06T16:16:20Z | CONTRIBUTOR | What happened:
Opened a dataset using What you expected to happen:
The file would save without needing to explicitly modify any

Minimal Complete Verifiable Example:

```python
import xarray as xr

# Write a small dataset to zarr with chunks of size 2 along dimA
ds = xr.Dataset({"data": (("dimA",), [10, 20, 30, 40])}, coords={"dimA": [1, 2, 3, 4]})
ds = ds.chunk({"dimA": 2})
ds.to_zarr("test.zarr", consolidated=True, mode="w")

# Re-open, select a subset, and try to write the result to a new store
ds2 = xr.open_zarr("test.zarr", consolidated=True).sel(dimA=[1, 3]).persist()
ds2.to_zarr("test2.zarr", consolidated=True, mode="w")
```

This raises:
Not sure if there is a good way around this (or perhaps this is even desired behavior?), but figured I would flag it as it seemed unexpected and took us a second to diagnose.

Once you've loaded the data from a zarr store, I feel like the default behavior should probably be to forget the encodings used to save that zarr, treating the in-memory dataset object just like any other in-memory dataset object that could have been loaded from any source. But maybe I'm in the minority or missing some nuance about why you'd want the encoding to hang around.

Environment:

```
INSTALLED VERSIONS
commit: None
python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.89+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C.UTF-8
LANG: C.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.17.0
pandas: 1.2.4
numpy: 1.20.2
scipy: 1.6.2
netCDF4: 1.5.6
pydap: installed
h5netcdf: 0.11.0
h5py: 3.2.1
Nio: None
zarr: 2.7.1
cftime: 1.2.1
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.2.2
cfgrib: 0.9.9.0
iris: 3.0.1
bottleneck: 1.3.2
dask: 2021.04.1
distributed: 2021.04.1
matplotlib: 3.4.1
cartopy: 0.19.0
seaborn: 0.11.1
numbagg: None
pint: 0.17
setuptools: 49.6.0.post20210108
pip: 21.0.1
conda: None
pytest: 6.2.3
IPython: 7.22.0
sphinx: 3.5.4
```
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/5219/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
13221727 | issue |
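
For anyone hitting the same thing, one possible workaround is to drop the stale `chunks` entry from each variable's `encoding` before calling `to_zarr`, so the write no longer tries to honor the chunking of the original store. The following is a minimal sketch under stated assumptions, not part of xarray's API: `drop_chunk_encoding` is a hypothetical helper name, and it assumes only that the dataset exposes `.variables` (a mapping of names to variables) and that each variable carries a dict-like `.encoding`, as xarray objects do. Note that it mutates the encodings in place.

```python
from typing import Any


def drop_chunk_encoding(ds: Any) -> Any:
    """Remove the 'chunks' entry from every variable's encoding, in place.

    Hypothetical helper (not an xarray function). It relies only on
    `ds.variables` being a mapping and each variable having a dict-like
    `.encoding`; other encoding keys (dtype, compressor, ...) are kept.
    """
    for var in ds.variables.values():
        # pop with a default so variables without a 'chunks' entry are skipped
        var.encoding.pop("chunks", None)
    return ds
```

With the example from the issue, this would be called as `drop_chunk_encoding(ds2).to_zarr("test2.zarr", consolidated=True, mode="w")`. Keeping the rest of the encoding is a deliberate choice here; clearing each variable's `encoding` entirely would be closer to the "forget the encodings" behavior suggested above.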