id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1454832041,I_kwDOAMm_X85Wtvmp,7297,stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index,96822049,open,0,,,6,2022-11-18T10:12:52Z,2023-01-17T18:28:44Z,,NONE,,,,"### What is your issue?

(See MVCE example)

The combination `ds.stack().unstack()` does not fully restore the original `ds` when a data variable depends on only a subset of the coordinates that make up the multi-index used for stacking.

1. Is this on purpose? And if so, what's the rationale?
2. I would imagine that it could also be more memory efficient if the original indexes `x` and `y` that make up the multi-index `(midx=[x,y])` were kept after a `stack()` operation. Then the values of data arrays that depend on only a subset of those indexes would not have to be repeated.

### MVCE

```
# xarray==2022.11.0
import xarray as xr

ds = xr.Dataset(coords={'x':[1,2], 'y':[3,4]})
ds['a'] = ds.x + 5
# <xarray.Dataset>
# Dimensions: (x: 2, y: 2)
# Coordinates:
#   * x (x) int32 1 2
#   * y (y) int32 3 4
# Data variables:
#     a (x) int32 6 7

ds_stacked = ds.stack(midx=['x','y'])
# <xarray.Dataset>
# Dimensions: (midx: 4)
# Coordinates:
#   * midx (midx) object MultiIndex
#   * x (midx) int32 1 1 2 2
#   * y (midx) int32 3 4 3 4
# Data variables:
#     a (midx) int32 6 6 7 7

ds_unstacked = ds_stacked.unstack()
# <xarray.Dataset>
# Dimensions: (x: 2, y: 2)
# Coordinates:
#   * x (x) int32 1 2
#   * y (y) int32 3 4
# Data variables:
#     a (x, y) int32 6 6 7 7
```

### Expected

`ds_unstacked` to be the same as `ds`. Instead, the variable `a` has now also become a function of coordinate `y`, which is not entirely correct.

I.e., after `ds.stack()`, the variable 'a' should still depend only on the original coordinate 'x', which is just one part of the multi-index.
```
ds_stacked = ds.stack(midx=['x','y'])
# <xarray.Dataset>
# Dimensions: (midx: 4)
# Coordinates:
#   * midx (midx) object MultiIndex
#   * x (midx) int32 1 1 2 2
#   * y (midx) int32 3 4 3 4
# Data variables:
#     a (x) int32 6 6 7 7
```

**Maybe for clarity**

```
# <xarray.Dataset>
# Dimensions: (midx: 4)
# Coordinates:
#   * midx (midx) object MultiIndex
#   * x (midx) int32 1 1 2 2
#   * y (midx) int32 3 4 3 4
# Data variables:
#     a (midx.x) int32 6 6 7 7
```

**Or maybe to save memory**

Distinguish between `midx.x` (the values of `x` repeated by stacking) and `x` (the original unique values).

```
# <xarray.Dataset>
# Dimensions: (midx: 4)
# Coordinates:
#   * midx (midx) object MultiIndex
#   * x (midx) int32 1 1 2 2
#   * y (midx) int32 3 4 3 4
# Data variables:
#     a (x) int32 6 7
```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7297/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue