id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 1384465119,I_kwDOAMm_X85ShULf,7076,Can't unstack concatenated DataArrays,22566757,open,0,,,9,2022-09-24T00:50:09Z,2023-02-21T17:11:35Z,,CONTRIBUTOR,,,,"### What happened? I had a collection of `DataArray`s with a stacked dimension (dimension whose corresponding index is a `MultiIndex`). I concatenated them into a single `DataArray`, then tried to unstack the stacked dimension, which failed. Performing the operations in the other order works (unstacking each `DataArray`, then concatenating the unstacked arrays). ### What did you expect to happen? I expected that concatenating the arrays then unstacking them would produce the same array as unstacking them then concatenating them, but with the possibility of saving the intermediate concatenated-but-still-stacked `DataArray` for later use as a template. ### Minimal Complete Verifiable Example ```Python import pandas as pd import xarray index = pd.MultiIndex.from_product([range(3), range(5)]) arr = xarray.DataArray.from_series(pd.Series(range(15), index=index)).stack(index0=[""level_0"", ""level_1""]) arr.unstack(""index0"") arr2 = xarray.concat([arr, arr], dim=""index2"") arr2.unstack(""index0"") ``` ### MVCE confirmation - [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [X] Complete example — the example is self-contained, including all data and the text of any traceback. - [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [X] New issue — a search of GitHub Issues suggests this is not a duplicate. ### Relevant log output ```Python array([[ 0, 1, 2, 3, 4], [ 5, 6, 7, 8, 9], [10, 11, 12, 13, 14]]) Coordinates: * level_0 (level_0) int64 0 1 2 * level_1 (level_1) int64 0 1 2 3 4 Traceback (most recent call last): File """", line 1, in File ""~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/dataarray.py"", line 2402, in unstack ds = self._to_temp_dataset().unstack(dim, fill_value, sparse) File ""~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/dataset.py"", line 4618, in unstack raise ValueError( ValueError: cannot unstack dimensions that do not have exactly one multi-index: ('index0',) ``` ### Anything else we need to know? The eventual problem to which I wish to apply the solution has two stacked dimensions rather than one, but that's likely irrelevant. ### Environment
INSTALLED VERSIONS ------------------ commit: None python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0] python-bits: 64 OS: Linux OS-release: 3.10.0-1160.76.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.1 libnetcdf: 4.8.1 xarray: 2022.6.0 pandas: 1.4.2 numpy: 1.22.3 scipy: 1.8.0 netCDF4: 1.6.0 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.5.1.1 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: 3.2.1.post0 bottleneck: 1.3.5 dask: 2022.7.1 distributed: 2022.7.1 matplotlib: 3.5.1 cartopy: 0.20.3 seaborn: 0.12.0 numbagg: None fsspec: 2022.5.0 cupy: None pint: None sparse: None flox: None numpy_groupies: None setuptools: 61.3.1 pip: 22.0.4 conda: 4.14.0 pytest: 7.1.3 IPython: None sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7076/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,reopened,13221727,issue 1192449540,I_kwDOAMm_X85HE1YE,6439,Unstacking the diagonals of a sequence of matrices raises ValueError: IndexVariable objects must be 1-dimensional,22566757,closed,0,,,5,2022-04-05T00:09:55Z,2022-05-02T19:05:19Z,2022-05-02T19:00:47Z,CONTRIBUTOR,,,,"### What happened? In my work, I produced a sequence of covariance matrices for a 2-D quantity. I wanted to extract the diagonal of the covariance matrices, then make that diagonal 2-D so I could plot it. I could unstack the 2-D dimensions in the sequence of covariance matrices without issue. I figured out how to extract the diagonal of the covariance matrices. Unstacking the diagonal using the same procedure raised a `ValueError`. ### What did you expect to happen? I expected the sequence of one-dimensional diagonals to unstack into a sequence of two-dimensional fields so I could plot them with pcolormesh. I can make this happen by unstacking the two dimensions (producing a 5-D DataArray) and extracting the diagonals from that, but I don't see a reason it shouldn't work in the other order. ### Minimal Complete Verifiable Example ```Python import numpy as np import xarray # Working: test = xarray.DataArray( np.eye(12), dims=(""dim0"", ""adj_dim0""), coords={ ""dim0_0"": ((""dim0"",), np.repeat(np.arange(3), 4)), ""dim0_1"": ((""dim0"",), np.tile(np.arange(4), 3)), }, ) unstacked = test.set_index(dim0=[""dim0_0"", ""dim0_1""]).unstack(""dim0"") diag_index = xarray.DataArray(np.arange(test.shape[0]), dims=(""diag"",)) unstacked_diag = ( test.isel(dim0=diag_index, adj_dim0=diag_index) .set_index(diag=[""dim0_0"", ""dim0_1""]) .unstack(""diag"") ) # Not working: test = xarray.DataArray( dims=(""dim1"", ""dim0"", ""adj_dim0""), data=np.tile(np.eye(12), (2, 1, 1)), coords={ ""dim0"": np.arange(12), ""dim0_0"": ((""dim0"",), np.repeat(np.arange(3), 4)), ""dim0_1"": ((""dim0"",), np.tile(np.arange(4), 3)), ""adj_dim0"": np.arange(12), ""adj_dim0_0"": ((""adj_dim0"",), np.repeat(np.arange(3), 4)), ""adj_dim0_1"": ((""adj_dim0"",), np.tile(np.arange(4), 3)), }, ) unstacked = test.set_index( dim0=[""dim0_0"", ""dim0_1""], adj_dim0=[""adj_dim0_0"", ""adj_dim0_1""] ).unstack([""dim0"", ""adj_dim0""]) diag_index0 = xarray.DataArray(np.arange(unstacked.shape[1]), dims=(""diag_0"",)) diag_index1 = xarray.DataArray(np.arange(unstacked.shape[2]), dims=(""diag_1"",)) unstacked_diag = unstacked.isel( dim0_0=diag_index0, dim0_1=diag_index1, adj_dim0_0=diag_index0, adj_dim0_1=diag_index1, ) diag_index = xarray.DataArray(np.arange(test.shape[1]), dims=(""diag"",)) test.isel(dim0=diag_index, adj_dim0=diag_index).set_index( diag=[""dim0_0"", ""dim0_1""] ).unstack(""diag"") ``` ### Relevant log output ```Python Traceback (most recent call last): File ""./test.py"", line 50, in ).unstack(""diag"") File ""~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/dataarray.py"", line 2201, in unstack ds = self._to_temp_dataset().unstack(dim, fill_value, sparse) File ""~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/dataset.py"", line 4214, in unstack result = result._unstack_once(dim, fill_value, sparse) File ""~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/dataset.py"", line 4070, in _unstack_once variables[name] = var._unstack_once( File ""~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/variable.py"", line 1690, in _unstack_once return self._replace(dims=new_dims, data=data) File ""~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/variable.py"", line 978, in _replace return type(self)(dims, data, attrs, encoding, fastpath=True) File ""~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/variable.py"", line 2668, in __init__ raise ValueError(f""{type(self).__name__} objects must be 1-dimensional"") ValueError: IndexVariable objects must be 1-dimensional ``` ### Anything else we need to know? _No response_ ### Environment INSTALLED VERSIONS ------------------ commit: None python: 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:39:04) [GCC 10.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-1160.59.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.1 libnetcdf: 4.8.1 xarray: 2022.3.0 pandas: 1.4.2 numpy: 1.22.3 scipy: 1.8.0 netCDF4: 1.5.8 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.6.0 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: None distributed: None matplotlib: 3.5.1 cartopy: 0.20.2 seaborn: None numbagg: None fsspec: None cupy: None pint: None sparse: None setuptools: 61.3.1 pip: 22.0.4 conda: None pytest: None IPython: None sphinx: None ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6439/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 424265093,MDExOlB1bGxSZXF1ZXN0MjYzNjU2Nzcz,2844,Read grid mapping and bounds as coords,22566757,closed,0,,,38,2019-03-22T15:25:37Z,2021-02-17T16:35:56Z,2021-02-17T16:35:56Z,CONTRIBUTOR,,0,pydata/xarray/pulls/2844," I prefer having these as coordinates rather than data variables. This does not cooperate with slicing/pulling out individual variables. `grid_mapping` should only be associated with variables that have horizontal dimensions or coordinates. `bounds` should stay associated despite having more dimensions. I have not implemented similar functionality for the iris conversions. An alternate approach to dealing with bounds (not used here) is to use a `pandas.IntervalIndex` http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.IntervalIndex.html#pandas.IntervalIndex and use where the coordinate is within its cell to determine on which side the intervals are closed (`x_dim == x_dim_bnds[:, 0]` corresponds to ""left"", `x_dim == x_dim_bnds[:, 1]` corresponds to ""right"", and anything else is ""neither""). This would stay through slicing and might already be used for `.groupby_bins()`, but would not generalize to boundaries of multidimensional coordinates unless someone implements a multidimensional generalization of `pd.IntervalIndex` - [ ] Closes #xxxx - [X] Tests added - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2844/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 424262546,MDExOlB1bGxSZXF1ZXN0MjYzNjU0NzQ0,2843,Allow passing _FillValue=False in encoding for vlen str variables.,22566757,closed,0,,,2,2019-03-22T15:20:27Z,2019-03-30T14:04:30Z,2019-03-30T14:04:13Z,CONTRIBUTOR,,0,pydata/xarray/pulls/2843," The documentation seems to imply that passing _FillValue=False in the encoding works to set no fill value for any value. These changes to `netCDF4_.py` and `h5netcdf_.py` seem to allow this for variables that are vlen strings (dtype `str` rather than `""S1""`): I have used the code-path in `netCDF4_.py` in real code and know that it at least allows the save to complete. Allowing _FillValue=False makes it easier to explicitly exclude `_FillValue` from being written for coordinate variables, which some CF-compliance checkers complain about. - [ ] Closes #xxxx - [X] Tests added - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2843/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull