home / github / issues

Menu
  • GraphQL API
  • Search all tables

issues: 1210267320

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1210267320 I_kwDOAMm_X85IIza4 6505 Dropping a MultiIndex variable raises an error after explicit indexes refactor 1217238 closed 0     3 2022-04-20T22:07:26Z 2022-07-21T14:46:58Z 2022-07-21T14:46:58Z MEMBER      

What happened?

With the latest released version of Xarray, it is possible to delete all variables corresponding to a MultiIndex by simply deleting the name of the MultiIndex.

After the explicit indexes refactor (i.e,. using the "main" development branch) this now raises error about how this would "corrupt" index state. This comes up when using drop() and assign_coords() and possibly some other methods.

This is not hard to work around, but we may want to consider this bug a blocker for the next Xarray release. I found the issue surfaced in several projects when attempting to use the new version of Xarray inside Google's codebase.

CC @benbovy in case you have any thoughts to share.

What did you expect to happen?

For now, we should preserve the behavior of deleting the variables corresponding to MultiIndex levels, but should issue a deprecation warning encouraging users to explicitly delete everything.

Minimal Complete Verifiable Example

```Python import xarray

array = xarray.DataArray( [[1, 2], [3, 4]], dims=['x', 'y'], coords={'x': ['a', 'b']}, ) stacked = array.stack(z=['x', 'y']) print(stacked.drop('z')) print() print(stacked.assign_coords(z=[1, 2, 3, 4])) ```

Relevant log output

```Python ValueError Traceback (most recent call last) Input In [1], in <cell line: 9>() 3 array = xarray.DataArray( 4 [[1, 2], [3, 4]], 5 dims=['x', 'y'], 6 coords={'x': ['a', 'b']}, 7 ) 8 stacked = array.stack(z=['x', 'y']) ----> 9 print(stacked.drop('z')) 10 print() 11 print(stacked.assign_coords(z=[1, 2, 3, 4]))

File ~/dev/xarray/xarray/core/dataarray.py:2425, in DataArray.drop(self, labels, dim, errors, labels_kwargs) 2408 def drop( 2409 self, 2410 labels: Mapping = None, (...) 2414 labels_kwargs, 2415 ) -> DataArray: 2416 """Backward compatible method based on drop_vars and drop_sel 2417 2418 Using either drop_vars or drop_sel is encouraged (...) 2423 DataArray.drop_sel 2424 """ -> 2425 ds = self._to_temp_dataset().drop(labels, dim, errors=errors) 2426 return self._from_temp_dataset(ds)

File ~/dev/xarray/xarray/core/dataset.py:4590, in Dataset.drop(self, labels, dim, errors, **labels_kwargs) 4584 if dim is None and (is_scalar(labels) or isinstance(labels, Iterable)): 4585 warnings.warn( 4586 "dropping variables using drop will be deprecated; using drop_vars is encouraged.", 4587 PendingDeprecationWarning, 4588 stacklevel=2, 4589 ) -> 4590 return self.drop_vars(labels, errors=errors) 4591 if dim is not None: 4592 warnings.warn( 4593 "dropping labels using list-like labels is deprecated; using " 4594 "dict-like arguments with drop_sel, e.g. `ds.drop_sel(dim=[labels]).", 4595 DeprecationWarning, 4596 stacklevel=2, 4597 )

File ~/dev/xarray/xarray/core/dataset.py:4549, in Dataset.drop_vars(self, names, errors) 4546 if errors == "raise": 4547 self._assert_all_in_dataset(names) -> 4549 assert_no_index_corrupted(self.xindexes, names) 4551 variables = {k: v for k, v in self._variables.items() if k not in names} 4552 coord_names = {k for k in self._coord_names if k in variables}

File ~/dev/xarray/xarray/core/indexes.py:1394, in assert_no_index_corrupted(indexes, coord_names) 1392 common_names_str = ", ".join(f"{k!r}" for k in common_names) 1393 index_names_str = ", ".join(f"{k!r}" for k in index_coords) -> 1394 raise ValueError( 1395 f"cannot remove coordinate(s) {common_names_str}, which would corrupt " 1396 f"the following index built from coordinates {index_names_str}:\n" 1397 f"{index}" 1398 )

ValueError: cannot remove coordinate(s) 'z', which would corrupt the following index built from coordinates 'z', 'x', 'y': <xarray.core.indexes.PandasMultiIndex object at 0x148c95150> ```

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS ------------------ commit: 33cdabd261b5725ac357c2823bd0f33684d3a954 python: 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:42:03) [Clang 12.0.1 ] python-bits: 64 OS: Darwin OS-release: 21.4.0 machine: arm64 processor: arm byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.1 libnetcdf: 4.8.1 xarray: 0.18.3.dev137+g96c56836 pandas: 1.4.2 numpy: 1.22.3 scipy: 1.8.0 netCDF4: 1.5.8 pydap: None h5netcdf: None h5py: None Nio: None zarr: 2.11.3 cftime: 1.6.0 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2022.04.1 distributed: 2022.4.1 matplotlib: None cartopy: None seaborn: None numbagg: None fsspec: 2022.3.0 cupy: None pint: None sparse: None setuptools: 62.1.0 pip: 22.0.4 conda: None pytest: 7.1.1 IPython: 8.2.0 sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6505/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue

Links from other tables

  • 2 rows from issues_id in issues_labels
  • 3 rows from issue in issue_comments
Powered by Datasette · Queries took 0.632ms · About: xarray-datasette