issues: 1400949778
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1400949778 | I_kwDOAMm_X85TgMwS | 7139 | xarray.open_dataset has issues if the dataset returned by the backend contains a multiindex | 21131639 | closed | 0 | 5 | 2022-10-07T10:19:36Z | 2022-10-12T20:17:28Z | 2022-10-12T20:17:28Z | CONTRIBUTOR |

### What happened?

As a follow-up to this comment: https://github.com/pydata/xarray/issues/6752#issuecomment-1236756285 I'm currently trying to implement a custom backend that takes care of the multiindex when loading. I'm using the following two functions to convert the dataset to a NetCDF-compatible version and back again: https://github.com/pydata/xarray/issues/1077#issuecomment-1101505074.

Here is a small code example:

**Creating the dataset**

```python
import xarray as xr
import pandas


def create_multiindex(**kwargs):
    return pandas.MultiIndex.from_arrays(list(kwargs.values()), names=kwargs.keys())


dataset = xr.Dataset()
dataset.coords["observation"] = ["A", "B"]
dataset.coords["wavelength"] = [0.4, 0.5, 0.6, 0.7]
dataset.coords["stokes"] = ["I", "Q"]
dataset["measurement"] = create_multiindex(
    observation=["A", "A", "B", "B"],
    wavelength=[0.4, 0.5, 0.6, 0.7],
    stokes=["I", "Q", "I", "I"],
)
```

**Saving as NetCDF**
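A minimal sketch of what the saving step could look like, assuming a `reset_index`-based helper in the spirit of the linked comment (the helper name `encode_multiindex` and the file name are placeholders, not necessarily the author's exact code):

```python
import pandas as pd
import xarray as xr


def encode_multiindex(ds, idx_name):
    # Split the MultiIndex coordinate into one plain variable per level,
    # leaving only types that netCDF can serialize (placeholder helper).
    return ds.reset_index(idx_name)


ds = xr.Dataset(
    coords={
        "measurement": pd.MultiIndex.from_arrays(
            [["A", "A", "B", "B"], [0.4, 0.5, 0.6, 0.7], ["I", "Q", "I", "I"]],
            names=["observation", "wavelength", "stokes"],
        )
    }
)
encode_multiindex(ds, "measurement").to_netcdf("multiindex.nc")
```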
**And loading again**
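And a matching sketch of the loading side, again with placeholder names, rebuilding the MultiIndex via `Dataset.set_index`:

```python
import xarray as xr


def decode_to_multiindex(ds, idx_name, levels):
    # Recombine the level variables written by the encode step into a MultiIndex
    # (placeholder helper, shown only to illustrate the round trip).
    return ds.set_index({idx_name: levels})


ds = xr.open_dataset("multiindex.nc")
ds = decode_to_multiindex(ds, "measurement", ["observation", "wavelength", "stokes"])
print(ds.indexes["measurement"])  # a pandas.MultiIndex again
```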
### Custom Backend

While the manual patching for saving is currently still required, I tried to at least work around the added function call when loading by using a custom backend:

```python
from xarray.backends import NetCDF4BackendEntrypoint


# registered as netcdf4-multiindex backend in setup.py
class MultiindexNetCDF4BackendEntrypoint(NetCDF4BackendEntrypoint):
    def open_dataset(self, *args, handle_multiindex=True, **kwargs):
        ds = super().open_dataset(*args, **kwargs)
        # ... (when handle_multiindex is True, the multiindex is restored here
        # before the dataset is returned)
        return ds
```
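For context, xarray discovers such backends through the `xarray.backends` entry-point group; a sketch of what the `setup.py` registration mentioned in the comment above could look like (package and module names are made up for illustration):

```python
# setup.py (illustrative; package and module names are placeholders)
from setuptools import find_packages, setup

setup(
    name="my-multiindex-backend",
    packages=find_packages(),
    entry_points={
        "xarray.backends": [
            # engine name users pass to xr.open_dataset(..., engine="netcdf4-multiindex")
            "netcdf4-multiindex = my_multiindex_backend.backend:MultiindexNetCDF4BackendEntrypoint",
        ],
    },
)
```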
### The error

```python
File ~/.local/share/virtualenvs/test-oePfdNug/lib/python3.8/site-packages/xarray/core/variable.py:2795, in IndexVariable.data(self, data)
   2793 @Variable.data.setter  # type: ignore[attr-defined]
   2794 def data(self, data):
-> 2795     raise ValueError(
   2796         f"Cannot assign to the .data attribute of dimension coordinate a.k.a IndexVariable {self.name!r}. "
   2797         f"Please use DataArray.assign_coords, Dataset.assign_coords or Dataset.assign as appropriate."
   2798     )

ValueError: Cannot assign to the .data attribute of dimension coordinate a.k.a IndexVariable 'measurement'. Please use DataArray.assign_coords, Dataset.assign_coords or Dataset.assign as appropriate.
```

but this works:
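As an illustration of the contrast being described (the engine name comes from the custom backend above; this is an assumed reconstruction, not the author's exact snippet), the two usage patterns presumably look roughly like this:

```python
import xarray as xr

# Fails with the ValueError shown above: the multiindex is restored inside the
# backend's open_dataset (handle_multiindex defaults to True).
# ds = xr.open_dataset("multiindex.nc", engine="netcdf4-multiindex")

# Works: skip the multiindex handling in the backend and rebuild it afterwards.
ds = xr.open_dataset("multiindex.nc", engine="netcdf4-multiindex", handle_multiindex=False)
ds = ds.set_index(measurement=["observation", "wavelength", "stokes"])
```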
So I'm guessing `xr.open_dataset` does something with the dataset returned by the backend that does not work once it already contains a multiindex.

### What did you expect to happen?

I expected that it doesn't matter whether the multiindex is restored inside the backend's `open_dataset` or only after `xr.open_dataset` has returned.

### Minimal Complete Verifiable Example
### MVCE confirmation
### Relevant log output

No response

### Anything else we need to know?

I'm also open to other suggestions for how I could simplify the usage of multiindices; maybe there is an approach that doesn't require a custom backend at all?
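One obvious sketch of such a backend-free approach (not taken from this thread; the helper name is made up) would be to rebuild the index right after a plain `open_dataset` call:

```python
import xarray as xr


def open_multiindex_dataset(path, idx_name, levels):
    # Plain netcdf4 engine plus an explicit set_index call; no custom backend involved.
    return xr.open_dataset(path, engine="netcdf4").set_index({idx_name: levels})


ds = open_multiindex_dataset(
    "multiindex.nc", "measurement", ["observation", "wavelength", "stokes"]
)
```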
### Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.10 (default, Jan 28 2022, 09:41:12)
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.10.102.1-microsoft-standard-WSL2
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.5
libnetcdf: 4.6.3
xarray: 2022.9.0
pandas: 1.5.0
numpy: 1.23.3
scipy: 1.9.1
netCDF4: 1.5.4
pydap: None
h5netcdf: None
h5py: 3.7.0
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.2
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.6.0
cartopy: 0.19.0.post1
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: 0.13.0
flox: None
numpy_groupies: None
setuptools: 65.3.0
pip: 22.2.2
conda: None
pytest: 7.1.3
IPython: 8.5.0
sphinx: 4.5.0
```
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7139/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |