issues


3 rows where repo = 13221727 and user = 14276158 sorted by updated_at descending

#7148 · Concatenate using Multiindex cannot be unstacked anymore

  • id: 1402168223 · node_id: I_kwDOAMm_X85Tk2Of · type: issue
  • repo: xarray (13221727) · state: open · locked: 0 · comments: 3
  • user: lpilz (14276158) · author_association: CONTRIBUTOR
  • created_at: 2022-10-09T06:23:06Z · updated_at: 2022-10-10T08:16:38Z

What happened?

When trying to concatenate data using a pandas MultiIndex and then unstack it to get two independent dimensions (e.g. for varying different parameters in a simulation), the unstack fails. I have seen different errors with different data (the MVCE errors with ValueError: IndexVariable objects must be 1-dimensional, but my data errors with ValueError: cannot re-index or align objects with conflicting indexes found for the following dimensions: 'concat_dim' (2 conflicting indexes)).

One hint at the bug might be that conc._indexes shows more indexes than display(conc) does.

What did you expect to happen?

Originally (I think it was v2022.3.0), it used to unstack neatly into the two levels of the MultiIndex as separate dimensions.

Minimal Complete Verifiable Example

```Python
import xarray as xr
import numpy as np
import pandas as pd

ds = xr.Dataset(
    data_vars={"a": (("dim1", "dim2"), np.arange(16).reshape(4, 4))},
    coords={"dim1": list(range(4)), "dim2": list(range(2, 6))},
)
dslist = [ds for i in range(6)]

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo"],
    ["one", "two", "one", "two", "one", "two"],
]
mindex = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=["first", "second"])

conc = xr.concat(dslist, dim=mindex)
conc.unstack("concat_dim")  # this errors

conc = xr.concat(dslist, dim='concat_dim')
conc = conc.assign_coords(dict(concat_dim=mindex)).unstack("concat_dim")  # this does not
```
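
As a quick check of the working variant, a short sketch that continues from the MVCE above (run it first); the expected sizes match the dataset repr shown under Environment below:

```Python
# Continuing from the MVCE above: the working variant replaces "concat_dim"
# with the two MultiIndex levels as independent dimensions.
unstacked = (
    xr.concat(dslist, dim="concat_dim")
    .assign_coords(concat_dim=mindex)
    .unstack("concat_dim")
)
assert set(unstacked.dims) == {"first", "second", "dim1", "dim2"}
assert unstacked.sizes["first"] == 3 and unstacked.sizes["second"] == 2
```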

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

The traceback from the failing unstack in the MVCE:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [24], line 15
     12 mindex = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=["first", "second"])
     14 conc = xr.concat(dslist, dim=mindex)
---> 15 conc.unstack("concat_dim")

File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/dataset.py:4870, in Dataset.unstack(self, dim, fill_value, sparse)
   4866         result = result._unstack_full_reindex(
   4867             d, stacked_indexes[d], fill_value, sparse
   4868         )
   4869     else:
-> 4870         result = result._unstack_once(d, stacked_indexes[d], fill_value, sparse)
   4871 return result

File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/dataset.py:4706, in Dataset._unstack_once(self, dim, index_and_vars, fill_value, sparse)
   4703     else:
   4704         fill_value_ = fill_value
-> 4706     variables[name] = var._unstack_once(
   4707         index=clean_index,
   4708         dim=dim,
   4709         fill_value=fill_value_,
   4710         sparse=sparse,
   4711     )
   4712 else:
   4713     variables[name] = var

File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/variable.py:1764, in Variable._unstack_once(self, index, dim, fill_value, sparse)
   1759 # Indexer is a list of lists of locations. Each list is the locations
   1760 # on the new dimension. This is robust to the data being sparse; in that
   1761 # case the destinations will be NaN / zero.
   1762 data[(..., *indexer)] = reordered
-> 1764 return self._replace(dims=new_dims, data=data)

File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/variable.py:1017, in Variable._replace(self, dims, data, attrs, encoding)
   1015 if encoding is _default:
   1016     encoding = copy.copy(self._encoding)
-> 1017 return type(self)(dims, data, attrs, encoding, fastpath=True)

File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/variable.py:2776, in IndexVariable.__init__(self, dims, data, attrs, encoding, fastpath)
   2774 super().__init__(dims, data, attrs, encoding, fastpath)
   2775 if self.ndim != 1:
-> 2776     raise ValueError(f"{type(self).__name__} objects must be 1-dimensional")
   2778 # Unlike in Variable, always eagerly load values into memory
   2779 if not isinstance(self._data, PandasIndexingAdapter):

ValueError: IndexVariable objects must be 1-dimensional
```

The workaround (concatenating along a plain "concat_dim" and assigning the MultiIndex afterwards) returns the expected result:

```
xarray.Dataset
Dimensions:   first: 3, second: 2, dim1: 4, dim2: 4
Coordinates:
    first    (first) object 'bar' 'baz' 'foo'
    second   (second) object 'one' 'two'
    dim1     (dim1) int64 0 1 2 3
    dim2     (dim2) int64 2 3 4 5
Data variables:
    a        (dim1, dim2, first, second) int64 0 0 0 0 0 0 1 ... 15 15 15 15 15 15
Attributes: (0)
```

Output of xr.show_versions():

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-305.25.1.el8_4.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1
xarray: 2022.9.0
pandas: 1.5.0
numpy: 1.23.3
scipy: 1.9.1
netCDF4: 1.6.1
pydap: None
h5netcdf: 1.0.2
h5py: 3.7.0
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.9.2
distributed: 2022.9.2
matplotlib: 3.6.0
cartopy: 0.21.0
seaborn: None
numbagg: None
fsspec: 2022.8.2
cupy: None
pint: 0.19.2
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.4.1
pip: 22.2.2
conda: None
pytest: None
IPython: 8.5.0
sphinx: None
```
reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7148/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
#6292 · Amended docs on how to add a new backend

  • id: 1147894038 · node_id: PR_kwDOAMm_X84zVXuC · type: pull
  • repo: xarray (13221727) · state: closed · locked: 0 · draft: 0 · comments: 1
  • user: lpilz (14276158) · author_association: CONTRIBUTOR
  • created_at: 2022-02-23T10:11:06Z · updated_at: 2022-02-23T17:54:53Z · closed_at: 2022-02-23T17:54:47Z
  • pull_request: pydata/xarray/pulls/6292

When trying to install a new xarray backend with poetry, I noticed that the entry point group should be xarray.backends instead of xarray_backends.
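
For context, a minimal sketch of how a backend package would register itself under the corrected entry point group; the package, module, and class names below are hypothetical, and only the group name "xarray.backends" is taken from this PR:

```Python
# Hypothetical setup.py: register a backend under the "xarray.backends"
# entry point group (not "xarray_backends") so xarray can discover it.
from setuptools import setup

setup(
    name="my-backend",  # hypothetical package name
    entry_points={
        "xarray.backends": [
            "my_engine = my_backend:MyBackendEntrypoint",  # hypothetical class
        ],
    },
)
```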

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6292/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
#6232 · Amended docstring to reflect the actual behaviour of Dataset.map

  • id: 1121761078 · node_id: PR_kwDOAMm_X84x9VPA · type: pull
  • repo: xarray (13221727) · state: closed · locked: 0 · draft: 0 · comments: 4
  • user: lpilz (14276158) · author_association: CONTRIBUTOR
  • created_at: 2022-02-02T10:39:30Z · updated_at: 2022-02-23T09:51:41Z · closed_at: 2022-02-23T09:51:27Z
  • pull_request: pydata/xarray/pulls/6232

In MetPy/pull#2312, I noticed that the behavior of Dataset.map's kwarg keep_attrs was not in line with its docstring. This is because the behavior was changed in xarray 0.16.2 (pydata/xarray#3595 & pydata/xarray#4195).

In short, keep_attrs=True now copies both the Dataset's and the variables' attributes and adds them to the new objects, preventing any change to them by the mapped function. In contrast, keep_attrs=False discards the Dataset's attributes and does not touch the variables' attributes, enabling the mapped function to modify them.
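
For illustration, a minimal sketch of the behavior described above; the dataset, variable, and function names here are made up, and the commented outputs are what the description implies, not output taken from this PR:

```Python
import numpy as np
import xarray as xr

ds = xr.Dataset({"a": ("x", np.arange(3))}, attrs={"title": "demo"})
ds["a"].attrs["units"] = "m"

def double(da):
    out = da * 2                   # arithmetic drops attrs under default options
    out.attrs["note"] = "doubled"  # the mapped function tries to set its own attrs
    return out

kept = ds.map(double, keep_attrs=True)
# keep_attrs=True: the original Dataset and variable attrs are copied onto
# the result, overriding whatever the mapped function set.
print(kept.attrs)          # {'title': 'demo'}
print(kept["a"].attrs)     # {'units': 'm'}

dropped = ds.map(double, keep_attrs=False)
# keep_attrs=False: the Dataset attrs are discarded, and the variables keep
# whatever attrs the mapped function left on them.
print(dropped.attrs)       # {}
print(dropped["a"].attrs)  # {'note': 'doubled'}
```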

Here, I propose an update to the keep_attrs docstring of Dataset.map which more accurately reflects its current behavior. Please feel free to suggest alternative formulations if you feel this one misses the mark.

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6232/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
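
Given this schema, the row selection at the top of the page ("3 rows where repo = 13221727 and user = 14276158 sorted by updated_at descending") can be reproduced against a local SQLite copy of the database; a minimal sketch, assuming a hypothetical local file github.db:

```Python
# Minimal sketch: reproduce this page's query against a local SQLite copy
# of the Datasette database (the filename "github.db" is an assumption).
import sqlite3

conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT number, title, state, type
    FROM issues
    WHERE repo = 13221727 AND user = 14276158
    ORDER BY updated_at DESC
    """
).fetchall()
for number, title, state, type_ in rows:
    print(f"#{number} [{state}/{type_}] {title}")
```

Datasette also exposes the same query over its JSON API, which is where exports like the one on this page come from.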