id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1402168223,I_kwDOAMm_X85Tk2Of,7148,Concatenate using Multiindex cannot be unstacked anymore,14276158,open,0,,,3,2022-10-09T06:23:06Z,2022-10-10T08:16:38Z,,CONTRIBUTOR,,,,"### What happened?

When trying to concatenate data using a pandas MultiIndex and then unstack it to get two independent dimensions (e.g. for varying different parameters in a simulation), the `unstack` call errors. I have seen different errors with different data (the MVCE below errors with `ValueError: IndexVariable objects must be 1-dimensional`, but my data errors with `ValueError: cannot re-index or align objects with conflicting indexes found for the following dimensions: 'concat_dim' (2 conflicting indexes)`). One hint at the bug might be that `conc._indexes` shows more indexes than `display(conc)` does.

### What did you expect to happen?

Originally (I think in v2022.3.0), this used to unstack neatly into the two levels of the MultiIndex as separate dimensions.
### Minimal Complete Verifiable Example

```Python
import xarray as xr
import numpy as np
import pandas as pd

ds = xr.Dataset(
    data_vars={""a"": ((""dim1"", ""dim2""), np.arange(16).reshape(4, 4))},
    coords={""dim1"": list(range(4)), ""dim2"": list(range(2, 6))},
)
dslist = [ds for i in range(6)]

arrays = [
    [""bar"", ""bar"", ""baz"", ""baz"", ""foo"", ""foo""],
    [""one"", ""two"", ""one"", ""two"", ""one"", ""two""],
]
mindex = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=[""first"", ""second""])

conc = xr.concat(dslist, dim=mindex)
conc.unstack(""concat_dim"")  # this errors

conc = xr.concat(dslist, dim='concat_dim')
conc = conc.assign_coords(dict(concat_dim=mindex)).unstack(""concat_dim"")  # this does not
```

### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

_No response_

### Anything else we need to know?

_No response_

### Environment
Running the failing MVCE lines gives:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [24], line 15
     12 mindex = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=[""first"", ""second""])
     14 conc = xr.concat(dslist, dim=mindex)
---> 15 conc.unstack(""concat_dim"")

File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/dataset.py:4870, in Dataset.unstack(self, dim, fill_value, sparse)
   4866     result = result._unstack_full_reindex(
   4867         d, stacked_indexes[d], fill_value, sparse
   4868     )
   4869 else:
-> 4870     result = result._unstack_once(d, stacked_indexes[d], fill_value, sparse)
   4871 return result

File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/dataset.py:4706, in Dataset._unstack_once(self, dim, index_and_vars, fill_value, sparse)
   4703     else:
   4704         fill_value_ = fill_value
-> 4706     variables[name] = var._unstack_once(
   4707         index=clean_index,
   4708         dim=dim,
   4709         fill_value=fill_value_,
   4710         sparse=sparse,
   4711     )
   4712 else:
   4713     variables[name] = var

File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/variable.py:1764, in Variable._unstack_once(self, index, dim, fill_value, sparse)
   1759 # Indexer is a list of lists of locations. Each list is the locations
   1760 # on the new dimension. This is robust to the data being sparse; in that
   1761 # case the destinations will be NaN / zero.
   1762 data[(..., *indexer)] = reordered
-> 1764 return self._replace(dims=new_dims, data=data)

File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/variable.py:1017, in Variable._replace(self, dims, data, attrs, encoding)
   1015 if encoding is _default:
   1016     encoding = copy.copy(self._encoding)
-> 1017 return type(self)(dims, data, attrs, encoding, fastpath=True)

File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/variable.py:2776, in IndexVariable.__init__(self, dims, data, attrs, encoding, fastpath)
   2774 super().__init__(dims, data, attrs, encoding, fastpath)
   2775 if self.ndim != 1:
-> 2776     raise ValueError(f""{type(self).__name__} objects must be 1-dimensional"")
   2778 # Unlike in Variable, always eagerly load values into memory
   2779 if not isinstance(self._data, PandasIndexingAdapter):

ValueError: IndexVariable objects must be 1-dimensional

The workaround via `assign_coords` works as expected:

conc = xr.concat(dslist, dim='concat_dim')
conc = conc.assign_coords(dict(concat_dim=mindex)).unstack(""concat_dim"")
conc

xarray.Dataset
Dimensions:  (first: 3, second: 2, dim1: 4, dim2: 4)
Coordinates:
    first    (first) object 'bar' 'baz' 'foo'
    second   (second) object 'one' 'two'
    dim1     (dim1) int64 0 1 2 3
    dim2     (dim2) int64 2 3 4 5
Data variables:
    a        (dim1, dim2, first, second) int64 0 0 0 0 0 0 1 ...
15 15 15 15 15 15
Attributes: (0)

xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-305.25.1.el8_4.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1
xarray: 2022.9.0
pandas: 1.5.0
numpy: 1.23.3
scipy: 1.9.1
netCDF4: 1.6.1
pydap: None
h5netcdf: 1.0.2
h5py: 3.7.0
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.9.2
distributed: 2022.9.2
matplotlib: 3.6.0
cartopy: 0.21.0
seaborn: None
numbagg: None
fsspec: 2022.8.2
cupy: None
pint: 0.19.2
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.4.1
pip: 22.2.2
conda: None
pytest: None
IPython: 8.5.0
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7148/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1147894038,PR_kwDOAMm_X84zVXuC,6292,Amended docs on how to add a new backend,14276158,closed,0,,,1,2022-02-23T10:11:06Z,2022-02-23T17:54:53Z,2022-02-23T17:54:47Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6292,"When trying to install a new xarray backend with Poetry, I noticed that the entry-point group should be `xarray.backends` instead of `xarray_backends`.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6292/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1121761078,PR_kwDOAMm_X84x9VPA,6232,Amended docstring to reflect the actual behaviour of Dataset.map,14276158,closed,0,,,4,2022-02-02T10:39:30Z,2022-02-23T09:51:41Z,2022-02-23T09:51:27Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6232,"In [MetPy/pull#2312](https://github.com/MetPy/pull/2312), I noticed that the behavior of `Dataset.map`'s kwarg `keep_attrs` was not in line with its docstring, because this behavior was changed in xarray 0.16.2 (pydata/xarray#3595 & pydata/xarray#4195).

In short, `keep_attrs=True` now copies both the Dataset's and the variables' attributes and adds them to the new objects, preventing any change to them by the mapped function. In contrast, `keep_attrs=False` discards the Dataset's attributes and does not touch the variables' attributes, enabling the mapped function to modify them.

Here, I propose an update to the `keep_attrs` docstring of `Dataset.map` which more accurately reflects its current behavior.
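As an illustrative sketch of the distinction (not taken from the PR; the dataset and attribute names are made up, and this assumes xarray >= 0.16.2):

```python
import numpy as np
import xarray as xr

# Hypothetical example data; 'title' and 'units' are made-up attributes.
ds = xr.Dataset({'a': ('x', np.arange(3))}, attrs={'title': 'demo'})
ds['a'].attrs['units'] = 'm'

# keep_attrs=True copies both the Dataset's and the variables' attributes
# onto the result, overriding whatever the mapped function did to them.
kept = ds.map(lambda da: da * 2, keep_attrs=True)

# keep_attrs=False discards the Dataset's attributes; the variables'
# attributes are whatever the mapped function returned.
dropped = ds.map(lambda da: da * 2, keep_attrs=False)

print(kept.attrs)       # {'title': 'demo'}
print(kept['a'].attrs)  # {'units': 'm'}
print(dropped.attrs)    # {}
```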
Please feel free to provide alternative formulations if you feel like this one misses the mark.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6232/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull