issues


4 rows where user = 22566757 sorted by updated_at descending

type 2

  • issue 2
  • pull 2

state 2

  • closed 3
  • open 1

repo 1

  • xarray 4
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1384465119 I_kwDOAMm_X85ShULf 7076 Can't unstack concatenated DataArrays DWesl 22566757 open 0     9 2022-09-24T00:50:09Z 2023-02-21T17:11:35Z   CONTRIBUTOR      

What happened?

I had a collection of DataArrays with a stacked dimension (dimension whose corresponding index is a MultiIndex). I concatenated them into a single DataArray, then tried to unstack the stacked dimension, which failed. Performing the operations in the other order works (unstacking each DataArray, then concatenating the unstacked arrays).

What did you expect to happen?

I expected that concatenating the arrays then unstacking them would produce the same array as unstacking them then concatenating them, but with the possibility of saving the intermediate concatenated-but-still-stacked DataArray for later use as a template.

Minimal Complete Verifiable Example

```Python
import pandas as pd
import xarray

index = pd.MultiIndex.from_product([range(3), range(5)])
arr = xarray.DataArray.from_series(pd.Series(range(15), index=index)).stack(
    index0=["level_0", "level_1"]
)
arr.unstack("index0")

arr2 = xarray.concat([arr, arr], dim="index2")
arr2.unstack("index0")
```
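For comparison, the order reported as working (unstack each array first, then concatenate) might look like the sketch below. It rebuilds `arr` the same way as the example above; the names `unstacked` and `combined` are illustrative and do not appear in the original report.

```python
import pandas as pd
import xarray

index = pd.MultiIndex.from_product([range(3), range(5)])
arr = xarray.DataArray.from_series(pd.Series(range(15), index=index)).stack(
    index0=["level_0", "level_1"]
)

# Unstack each stacked array on its own first ...
unstacked = [a.unstack("index0") for a in (arr, arr)]

# ... then concatenate the already-unstacked arrays along the new dimension.
combined = xarray.concat(unstacked, dim="index2")
```

This avoids the error but does not leave a concatenated-but-still-stacked array behind to reuse as a template.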

MCVE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
<xarray.DataArray (level_0: 3, level_1: 5)>
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
Coordinates:
  * level_0  (level_0) int64 0 1 2
  * level_1  (level_1) int64 0 1 2 3 4

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/dataarray.py", line 2402, in unstack
    ds = self._to_temp_dataset().unstack(dim, fill_value, sparse)
  File "~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/dataset.py", line 4618, in unstack
    raise ValueError(
ValueError: cannot unstack dimensions that do not have exactly one multi-index: ('index0',)
```

Anything else we need to know?

The eventual problem to which I wish to apply the solution has two stacked dimensions rather than one, but that's likely irrelevant.

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.76.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 2022.6.0
pandas: 1.4.2
numpy: 1.22.3
scipy: 1.8.0
netCDF4: 1.6.0
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: 3.2.1.post0
bottleneck: 1.3.5
dask: 2022.7.1
distributed: 2022.7.1
matplotlib: 3.5.1
cartopy: 0.20.3
seaborn: 0.12.0
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 61.3.1
pip: 22.0.4
conda: 4.14.0
pytest: 7.1.3
IPython: None
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7076/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  reopened xarray 13221727 issue
1192449540 I_kwDOAMm_X85HE1YE 6439 Unstacking the diagonals of a sequence of matrices raises ValueError: IndexVariable objects must be 1-dimensional DWesl 22566757 closed 0     5 2022-04-05T00:09:55Z 2022-05-02T19:05:19Z 2022-05-02T19:00:47Z CONTRIBUTOR      

What happened?

In my work, I produced a sequence of covariance matrices for a 2-D quantity. I wanted to extract the diagonal of the covariance matrices, then make that diagonal 2-D so I could plot it.

I could unstack the 2-D dimensions in the sequence of covariance matrices without issue. I figured out how to extract the diagonal of the covariance matrices. Unstacking the diagonal using the same procedure raised a ValueError.

What did you expect to happen?

I expected the sequence of one-dimensional diagonals to unstack into a sequence of two-dimensional fields so I could plot them with pcolormesh.

I can make this happen by unstacking the two dimensions (producing a 5-D DataArray) and extracting the diagonals from that, but I don't see a reason it shouldn't work in the other order.

Minimal Complete Verifiable Example

```Python
import numpy as np
import xarray

# Working:

test = xarray.DataArray(
    np.eye(12),
    dims=("dim0", "adj_dim0"),
    coords={
        "dim0_0": (("dim0",), np.repeat(np.arange(3), 4)),
        "dim0_1": (("dim0",), np.tile(np.arange(4), 3)),
    },
)
unstacked = test.set_index(dim0=["dim0_0", "dim0_1"]).unstack("dim0")
diag_index = xarray.DataArray(np.arange(test.shape[0]), dims=("diag",))
unstacked_diag = (
    test.isel(dim0=diag_index, adj_dim0=diag_index)
    .set_index(diag=["dim0_0", "dim0_1"])
    .unstack("diag")
)

# Not working:

test = xarray.DataArray(
    dims=("dim1", "dim0", "adj_dim0"),
    data=np.tile(np.eye(12), (2, 1, 1)),
    coords={
        "dim0": np.arange(12),
        "dim0_0": (("dim0",), np.repeat(np.arange(3), 4)),
        "dim0_1": (("dim0",), np.tile(np.arange(4), 3)),
        "adj_dim0": np.arange(12),
        "adj_dim0_0": (("adj_dim0",), np.repeat(np.arange(3), 4)),
        "adj_dim0_1": (("adj_dim0",), np.tile(np.arange(4), 3)),
    },
)
unstacked = test.set_index(
    dim0=["dim0_0", "dim0_1"], adj_dim0=["adj_dim0_0", "adj_dim0_1"]
).unstack(["dim0", "adj_dim0"])

diag_index0 = xarray.DataArray(np.arange(unstacked.shape[1]), dims=("diag_0",))
diag_index1 = xarray.DataArray(np.arange(unstacked.shape[2]), dims=("diag_1",))
unstacked_diag = unstacked.isel(
    dim0_0=diag_index0,
    dim0_1=diag_index1,
    adj_dim0_0=diag_index0,
    adj_dim0_1=diag_index1,
)

diag_index = xarray.DataArray(np.arange(test.shape[1]), dims=("diag",))
test.isel(dim0=diag_index, adj_dim0=diag_index).set_index(
    diag=["dim0_0", "dim0_1"]
).unstack("diag")
```

Relevant log output

Traceback (most recent call last):
  File "./test.py", line 50, in <module>
    ).unstack("diag")
  File "~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/dataarray.py", line 2201, in unstack
    ds = self._to_temp_dataset().unstack(dim, fill_value, sparse)
  File "~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/dataset.py", line 4214, in unstack
    result = result._unstack_once(dim, fill_value, sparse)
  File "~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/dataset.py", line 4070, in _unstack_once
    variables[name] = var._unstack_once(
  File "~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/variable.py", line 1690, in _unstack_once
    return self._replace(dims=new_dims, data=data)
  File "~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/variable.py", line 978, in _replace
    return type(self)(dims, data, attrs, encoding, fastpath=True)
  File "~/.conda/envs/plotting/lib/python3.10/site-packages/xarray/core/variable.py", line 2668, in __init__
    raise ValueError(f"{type(self).__name__} objects must be 1-dimensional")
ValueError: IndexVariable objects must be 1-dimensional

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:39:04) [GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.59.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 2022.3.0
pandas: 1.4.2
numpy: 1.22.3
scipy: 1.8.0
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.1
cartopy: 0.20.2
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
setuptools: 61.3.1
pip: 22.0.4
conda: None
pytest: None
IPython: None
sphinx: None

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6439/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue
424265093 MDExOlB1bGxSZXF1ZXN0MjYzNjU2Nzcz 2844 Read grid mapping and bounds as coords DWesl 22566757 closed 0     38 2019-03-22T15:25:37Z 2021-02-17T16:35:56Z 2021-02-17T16:35:56Z CONTRIBUTOR   0 pydata/xarray/pulls/2844

I prefer having these as coordinates rather than data variables.

This does not cooperate with slicing/pulling out individual variables. grid_mapping should only be associated with variables that have horizontal dimensions or coordinates. bounds should stay associated despite having more dimensions.

I have not implemented similar functionality for the iris conversions.

An alternate approach to dealing with bounds (not used here) is to use a pandas.IntervalIndex (http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.IntervalIndex.html#pandas.IntervalIndex) and use the position of the coordinate within its cell to determine on which side the intervals are closed: x_dim == x_dim_bnds[:, 0] corresponds to "left", x_dim == x_dim_bnds[:, 1] corresponds to "right", and anything else is "neither". This would survive slicing and might already be used for .groupby_bins(), but it would not generalize to boundaries of multidimensional coordinates unless someone implements a multidimensional generalization of pd.IntervalIndex.
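A minimal sketch of that alternative, assuming one-dimensional bounds stored as an `(n, 2)` array. The helper `closed_side` and the sample values are hypothetical and are not part of this pull request; only `pd.IntervalIndex.from_arrays` and its `closed` argument come from pandas.

```python
import numpy as np
import pandas as pd


def closed_side(coord, bounds):
    """Hypothetical helper: infer the closed side from where the coordinate sits."""
    if np.array_equal(coord, bounds[:, 0]):
        return "left"
    if np.array_equal(coord, bounds[:, 1]):
        return "right"
    return "neither"


# Illustrative data: coordinate values sit on the lower edge of each cell.
x_dim = np.array([0.0, 1.0, 2.0])
x_dim_bnds = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]])

closed = closed_side(x_dim, x_dim_bnds)
interval_index = pd.IntervalIndex.from_arrays(
    x_dim_bnds[:, 0], x_dim_bnds[:, 1], closed=closed
)

# Slicing the index keeps the intervals and the closed side together.
subset = interval_index[1:]
```

Because each entry of the index carries both edges, a slice such as `subset` keeps its bounds without a separate bounds variable having to be sliced in step.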

  • [ ] Closes #xxxx
  • [X] Tests added
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2844/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull
424262546 MDExOlB1bGxSZXF1ZXN0MjYzNjU0NzQ0 2843 Allow passing _FillValue=False in encoding for vlen str variables. DWesl 22566757 closed 0     2 2019-03-22T15:20:27Z 2019-03-30T14:04:30Z 2019-03-30T14:04:13Z CONTRIBUTOR   0 pydata/xarray/pulls/2843

The documentation seems to imply that passing _FillValue=False in the encoding works to set no fill value for any value. These changes to netCDF4_.py and h5netcdf_.py seem to allow this for variables that are vlen strings (dtype str rather than "S1"): I have used the code-path in netCDF4_.py in real code and know that it at least allows the save to complete.

Allowing _FillValue=False makes it easier to explicitly exclude _FillValue from being written for coordinate variables, which some CF-compliance checkers complain about.

  • [ ] Closes #xxxx
  • [X] Tests added
  • [ ] Fully documented, including whats-new.rst for all changes and api.rst for new API
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2843/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
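
The filter described at the top of this page (rows where user = 22566757, sorted by updated_at descending) can be reproduced against this schema. The sketch below uses Python's built-in `sqlite3` module with an in-memory database and only a subset of the columns; the exact SQL that Datasette generates is not shown on this page, so the query text here is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the [issues] table: only the columns needed for the query.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT,
       [type] TEXT
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

# The four rows listed on this page.
rows = [
    (1384465119, 7076, 22566757, "open", "2023-02-21T17:11:35Z", "issue"),
    (1192449540, 6439, 22566757, "closed", "2022-05-02T19:05:19Z", "issue"),
    (424265093, 2844, 22566757, "closed", "2021-02-17T16:35:56Z", "pull"),
    (424262546, 2843, 22566757, "closed", "2019-03-30T14:04:30Z", "pull"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
result = conn.execute(
    "SELECT [number], [state], [type] FROM [issues] "
    "WHERE [user] = ? ORDER BY [updated_at] DESC",
    (22566757,),
).fetchall()
numbers = [row[0] for row in result]
```

The `idx_issues_user` index is what lets a lookup on `[user]` avoid a full table scan once the table grows beyond a handful of rows.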