id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 2019645081,I_kwDOAMm_X854YVaZ,8498,Allow some notion of ordering in Dataset dims,5635139,closed,0,,,5,2023-11-30T22:57:23Z,2023-12-08T19:22:56Z,2023-12-08T19:22:55Z,MEMBER,,,,"### What is your issue? Currently a `DataArray`'s dims are ordered, while a `Dataset`'s are not. Do we gain anything from have unordered dims in a Dataset? Could we have an ordering without enforcing it on every variable? Here's one proposal, with fairly wide error-bars: - Datasets have a dim order, which is set at construction time or through `.transpose` - Currently `.transpose` changes the order of each variable's dims, but not the dataset's - If dims aren't supplied, we can just use the first variable's - Variables don't have to conform to that order — `.assign(foo=differently_ordered)` maintains the differently ordered dims. So this doesn't limit any current functionality. - When there are transformations which change dim ordering, Xarray is ""allowed"" to transpose variables to the dataset's ordering. Currently Xarray is ""allowed"" to change dim order arbitrarily — for example to put a core dim last. IIUC, we'd prefer to set a non-arbitrary order, but we don't have one to reference. - This would remove a bunch of boilerplate from methods that save the ordering, run `.apply_ufunc` and then reorder in the original order[^1] What do folks think? [^1]: though also we could do this in `.apply_ufunc`","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8498/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,not_planned,13221727,issue 1995489227,I_kwDOAMm_X8528L_L,8455,Errors when assigning using `.from_pandas_multiindex`,5635139,closed,0,,,3,2023-11-15T20:09:15Z,2023-12-04T19:10:12Z,2023-12-04T19:10:11Z,MEMBER,,,,"### What happened? Very possibly this is user-error, forgive me if so. I'm trying to transition some code from the previous assignment of MultiIndexes, to the new world. Here's an MCVE: ### What did you expect to happen? _No response_ ### Minimal Complete Verifiable Example ```Python da = xr.tutorial.open_dataset(""air_temperature"")['air'] # old code, works, but with a warning da.expand_dims('foo').assign_coords(foo=(pd.MultiIndex.from_tuples([(1,2)]))) :1: FutureWarning: the `pandas.MultiIndex` object(s) passed as 'foo' coordinate(s) or data variable(s) will no longer be implicitly promoted and wrapped into multiple indexed coordinates in the future (i.e., one coordinate for each multi-index level + one dimension coordinate). If you want to keep this behavior, you need to first wrap it explicitly using `mindex_coords = xarray.Coordinates.from_pandas_multiindex(mindex_obj, 'dim')` and pass it as coordinates, e.g., `xarray.Dataset(coords=mindex_coords)`, `dataset.assign_coords(mindex_coords)` or `dataarray.assign_coords(mindex_coords)`. da.expand_dims('foo').assign_coords(foo=(pd.MultiIndex.from_tuples([(1,2)]))) Out[25]: array([[[[241.2 , 242.5 , 243.5 , ..., 232.79999, 235.5 , 238.59999], ... [297.69 , 298.09 , 298.09 , ..., 296.49 , 296.19 , 295.69 ]]]], dtype=float32) Coordinates: * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 325.0 327.5 330.0 * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00 * foo (foo) object MultiIndex * foo_level_0 (foo) int64 1 * foo_level_1 (foo) int64 2 # new code — seems to get confused between the number of values in the index — 1 — and the number of levels — 3 including the parent: da.expand_dims('foo').assign_coords(foo=xr.Coordinates.from_pandas_multiindex(pd.MultiIndex.from_tuples([(1,2)]), dim='foo')) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) Cell In[26], line 1 ----> 1 da.expand_dims('foo').assign_coords(foo=xr.Coordinates.from_pandas_multiindex(pd.MultiIndex.from_tuples([(1,2)]), dim='foo')) File ~/workspace/xarray/xarray/core/common.py:621, in DataWithCoords.assign_coords(self, coords, **coords_kwargs) 618 else: 619 results = self._calc_assign_results(coords_combined) --> 621 data.coords.update(results) 622 return data File ~/workspace/xarray/xarray/core/coordinates.py:566, in Coordinates.update(self, other) 560 # special case for PandasMultiIndex: updating only its dimension coordinate 561 # is still allowed but depreciated. 562 # It is the only case where we need to actually drop coordinates here (multi-index levels) 563 # TODO: remove when removing PandasMultiIndex's dimension coordinate. 564 self._drop_coords(self._names - coords_to_align._names) --> 566 self._update_coords(coords, indexes) File ~/workspace/xarray/xarray/core/coordinates.py:834, in DataArrayCoordinates._update_coords(self, coords, indexes) 832 coords_plus_data = coords.copy() 833 coords_plus_data[_THIS_ARRAY] = self._data.variable --> 834 dims = calculate_dimensions(coords_plus_data) 835 if not set(dims) <= set(self.dims): 836 raise ValueError( 837 ""cannot add coordinates with new dimensions to a DataArray"" 838 ) File ~/workspace/xarray/xarray/core/variable.py:3014, in calculate_dimensions(variables) 3012 last_used[dim] = k 3013 elif dims[dim] != size: -> 3014 raise ValueError( 3015 f""conflicting sizes for dimension {dim!r}: "" 3016 f""length {size} on {k!r} and length {dims[dim]} on {last_used!r}"" 3017 ) 3018 return dims ValueError: conflicting sizes for dimension 'foo': length 1 on and length 3 on {'lat': 'lat', 'lon': 'lon', 'time': 'time', 'foo': 'foo'} ``` ### MVCE confirmation - [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [X] Complete example — the example is self-contained, including all data and the text of any traceback. - [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [X] New issue — a search of GitHub Issues suggests this is not a duplicate. - [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies. ### Relevant log output _No response_ ### Anything else we need to know? _No response_ ### Environment
INSTALLED VERSIONS ------------------ commit: None python: 3.9.18 (main, Nov 2 2023, 16:51:22) [Clang 14.0.3 (clang-1403.0.22.14.1)] python-bits: 64 OS: Darwin OS-release: 22.6.0 machine: arm64 processor: arm byteorder: little LC_ALL: en_US.UTF-8 LANG: None LOCALE: ('en_US', 'UTF-8') libhdf5: None libnetcdf: None xarray: 2023.10.2.dev10+gccc8f998 pandas: 2.1.1 numpy: 1.25.2 scipy: 1.11.1 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.16.0 cftime: None nc_time_axis: None PseudoNetCDF: None iris: None bottleneck: None dask: 2023.4.0 distributed: 2023.7.1 matplotlib: 3.5.1 cartopy: None seaborn: None numbagg: 0.2.3.dev30+gd26e29e fsspec: 2021.11.1 cupy: None pint: None sparse: None flox: None numpy_groupies: 0.9.19 setuptools: 68.2.2 pip: 23.3.1 conda: None pytest: 7.4.0 mypy: 1.6.0 IPython: 8.15.0 sphinx: 4.3.2
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8455/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,not_planned,13221727,issue