issues

4 rows where type = "issue" and user = 19226431 sorted by updated_at descending


Issue #6881: Alignment of dataset with MultiIndex fails after applying xr.concat
id: 1330149534 · node_id: I_kwDOAMm_X85PSHie · user: FabianHofmann (19226431) · state: closed · comments: 0 · created: 2022-08-05T16:42:05Z · updated: 2022-08-25T11:15:55Z · closed: 2022-08-25T11:15:55Z · author_association: CONTRIBUTOR

What happened?

After applying xr.concat to a dataset with a MultiIndex, many indexing-related operations break. For example, it is no longer possible to apply reindex_like to the result itself.

The error is raised in the alignment module. It seems that the function find_matching_indexes does not find indexes that belong to the same dimension.

What did you expect to happen?

I expected alignment to remain functional and these basic operations to keep working.

Minimal Complete Verifiable Example

```python
import xarray as xr
import pandas as pd

index = pd.MultiIndex.from_product([[1, 2], ['a', 'b']], names=('level1', 'level2'))
index.name = 'dim'

var = xr.DataArray(1, coords=[index])
ds = xr.Dataset({"var": var})

new = xr.concat([ds], dim='newdim')
xr.Dataset(new)        # breaks
new.reindex_like(new)  # breaks
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```python
Traceback (most recent call last):

  File "/tmp/ipykernel_407170/4030736219.py", line 11, in <cell line: 11>
    xr.Dataset(new)  # breaks

  File "/home/fabian/.miniconda3/lib/python3.10/site-packages/xarray/core/dataset.py", line 599, in __init__
    variables, coord_names, dims, indexes, _ = merge_data_and_coords(

  File "/home/fabian/.miniconda3/lib/python3.10/site-packages/xarray/core/merge.py", line 575, in merge_data_and_coords
    return merge_core(

  File "/home/fabian/.miniconda3/lib/python3.10/site-packages/xarray/core/merge.py", line 752, in merge_core
    aligned = deep_align(

  File "/home/fabian/.miniconda3/lib/python3.10/site-packages/xarray/core/alignment.py", line 827, in deep_align
    aligned = align(

  File "/home/fabian/.miniconda3/lib/python3.10/site-packages/xarray/core/alignment.py", line 764, in align
    aligner.align()

  File "/home/fabian/.miniconda3/lib/python3.10/site-packages/xarray/core/alignment.py", line 550, in align
    self.assert_no_index_conflict()

  File "/home/fabian/.miniconda3/lib/python3.10/site-packages/xarray/core/alignment.py", line 319, in assert_no_index_conflict
    raise ValueError(

ValueError: cannot re-index or align objects with conflicting indexes found for the following dimensions: 'dim' (2 conflicting indexes)
Conflicting indexes may occur when
- they relate to different sets of coordinate and/or dimension names
- they don't have the same type
- they may be used to reindex data along common dimensions
```
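The ValueError above reports two conflicting indexes for 'dim', matching the find_matching_indexes observation. As a way to see this directly, here is a small diagnostic sketch (an addition, not part of the original report; it assumes xarray's public `Dataset.xindexes` mapping, available in the 2022.6.0 release used here):

```python
import pandas as pd
import xarray as xr

# Rebuild the MVCE objects from above.
index = pd.MultiIndex.from_product([[1, 2], ['a', 'b']], names=('level1', 'level2'))
index.name = 'dim'
ds = xr.Dataset({"var": xr.DataArray(1, coords=[index])})
new = xr.concat([ds], dim='newdim')

# 'dim', 'level1' and 'level2' should all be backed by one shared
# PandasMultiIndex object; if concat leaves them backed by distinct objects,
# find_matching_indexes groups them separately and alignment raises.
for name, idx in new.xindexes.items():
    print(name, type(idx).__name__, id(idx))
```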

Anything else we need to know?

No response

Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-41-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1
xarray: 2022.6.0
pandas: 1.4.2
numpy: 1.21.6
scipy: 1.8.1
netCDF4: 1.6.0
pydap: None
h5netcdf: None
h5py: 3.6.0
Nio: None
zarr: None
cftime: 1.5.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: None
iris: None
bottleneck: 1.3.4
dask: 2022.6.1
distributed: 2022.6.1
matplotlib: 3.5.1
cartopy: 0.20.2
seaborn: 0.11.2
numbagg: None
fsspec: 2022.3.0
cupy: None
pint: None
sparse: 0.13.0
flox: None
numpy_groupies: None
setuptools: 61.2.0
pip: 22.1.2
conda: 4.13.0
pytest: 7.1.2
IPython: 7.33.0
sphinx: 5.0.2

/home/fabian/.miniconda3/lib/python3.10/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")
```
Reactions: none (all counts 0; https://api.github.com/repos/pydata/xarray/issues/6881/reactions)
state_reason: completed · repo: xarray (13221727) · type: issue
Issue #4796: Use apply_ufunc for unary funcs
id: 784042442 · node_id: MDU6SXNzdWU3ODQwNDI0NDI= · user: FabianHofmann (19226431) · state: open · comments: 3 · created: 2021-01-12T08:56:03Z · updated: 2022-04-18T16:31:02Z · author_association: CONTRIBUTOR

DataArray.clip() on a chunked array raises an AssertionError as soon as the argument is itself a chunked array. With non-chunked arrays, everything works as intended.

```python
import numpy as np
import xarray as xr

x = xr.DataArray(np.random.uniform(size=[100, 100])).chunk(10)
x.clip(max=x)  # raises AssertionError
```
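The issue title suggests the direction of a fix: route such unary operations through xr.apply_ufunc so chunked arguments are handled lazily. A minimal sketch of that idea (an illustration, not the reporter's code; `dask="parallelized"` and `output_dtypes` are standard apply_ufunc parameters):

```python
import numpy as np
import xarray as xr

x = xr.DataArray(np.random.uniform(size=[100, 100])).chunk(10)

# Express the max-clip element-wise so dask chunks are processed lazily
# instead of reaching the code path that trips the assertion.
clipped = xr.apply_ufunc(
    lambda a, hi: np.clip(a, None, hi),
    x, x,
    dask="parallelized",
    output_dtypes=[x.dtype],
)
print(clipped.compute().equals(x.compute()))  # clipping x by itself is a no-op: True
```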

Environment:

Output of xr.show_versions():

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.9 | packaged by conda-forge | (default, Dec 9 2020, 21:08:20) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-60-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.3
xarray: 0.16.2
pandas: 1.2.0
numpy: 1.19.5
scipy: 1.5.0
netCDF4: 1.5.3
pydap: None
h5netcdf: 0.7.4
h5py: 2.10.0
Nio: None
zarr: 2.3.2
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.0
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2020.12.0
distributed: 2020.12.0
matplotlib: 3.1.3
cartopy: 0.18.0
seaborn: 0.11.0
numbagg: None
pint: None
setuptools: 49.2.1.post20200807
pip: 20.2.1
conda: 4.8.3
pytest: 6.0.1
IPython: 7.11.1
sphinx: 3.1.2
```
Reactions: none (all counts 0; https://api.github.com/repos/pydata/xarray/issues/4796/reactions)
repo: xarray (13221727) · type: issue
Issue #5024: xr.DataArray.sum() converts string objects into unicode
id: 830040696 · node_id: MDU6SXNzdWU4MzAwNDA2OTY= · user: FabianHofmann (19226431) · state: open · comments: 0 · created: 2021-03-12T11:47:06Z · updated: 2022-04-09T01:40:09Z · author_association: CONTRIBUTOR

What happened:

When summing over all axes of a DataArray containing strings of dtype object, the result is a size-one DataArray of unicode dtype.

What you expected to happen:

I expected the summation to preserve the dtype, meaning the size-one DataArray would be of dtype object.

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.DataArray('a', [range(3), range(3)]).astype(object)
ds.sum()
```

Output:

```
<xarray.DataArray ()>
array('aaaaaaaaa', dtype='<U9')
```

On the other hand, when summing over one dimension only, the dtype is preserved:

```python
ds.sum('dim_0')
```

Output:

```
<xarray.DataArray (dim_1: 3)>
array(['aaa', 'aaa', 'aaa'], dtype=object)
Coordinates:
  * dim_1    (dim_1) int64 0 1 2
```

Anything else we need to know?:

The problem becomes relevant as soon as dask is used in the workflow: dask expects the aggregated DataArray to be of dtype object, which will likely lead to errors in the operations that follow.

The behavior probably comes from creating a new DataArray after the reduction with np.sum(), which itself results in a plain Python string.
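The suspected mechanism can be checked with NumPy alone. A minimal sketch (an addition, not part of the report): summing an object array yields a plain Python string, and wrapping that scalar in a new array makes NumPy infer a unicode dtype:

```python
import numpy as np

a = np.full((3, 3), 'a', dtype=object)
s = a.sum()               # object-dtype sum concatenates: 'aaaaaaaaa'
print(type(s))            # <class 'str'> -- a plain Python string
print(np.array(s).dtype)  # <U9 -- the re-wrapped scalar is unicode, not object
```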

Environment:

Output of xr.show_versions():

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 (default, Sep 4 2020, 07:30:14) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-66-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.2
pandas: 1.2.1
numpy: 1.19.5
scipy: 1.6.0
netCDF4: 1.5.5.1
pydap: None
h5netcdf: 0.7.4
h5py: 3.1.0
Nio: None
zarr: 2.3.2
cftime: 1.3.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.0
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.01.1
distributed: 2021.01.1
matplotlib: 3.3.3
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: 52.0.0.post20210125
pip: 21.0
conda: 4.9.2
pytest: 6.2.2
IPython: 7.19.0
sphinx: 3.4.3
```
Reactions: none (all counts 0; https://api.github.com/repos/pydata/xarray/issues/5024/reactions)
repo: xarray (13221727) · type: issue
Issue #5983: preserve chunked data when creating DataArray from itself
id: 1052736383 · node_id: I_kwDOAMm_X84-v3t_ · user: FabianHofmann (19226431) · state: closed · comments: 4 · created: 2021-11-13T18:00:24Z · updated: 2022-01-13T17:02:47Z · closed: 2022-01-13T17:02:47Z · author_association: CONTRIBUTOR

What happened:

When creating a new DataArray from a DataArray with chunked data, the underlying dask array is converted to a numpy array.

What you expected to happen:

I expected the underlying dask array to be preserved when creating a new DataArray instance.

Minimal Complete Verifiable Example:

```python
import xarray as xr
import numpy as np
from dask import array

d = np.ones((10, 10))
x = array.from_array(d, chunks=5)

da = xr.DataArray(x)  # this is chunked
xr.DataArray(da)      # this is not chunked anymore
```
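A quick way to confirm the behavior is to check the type of `.data` on both objects. This is a small check restating the example above, not part of the original report (`isinstance` against `dask.array.Array` is the standard test):

```python
import numpy as np
import xarray as xr
import dask.array

x = dask.array.from_array(np.ones((10, 10)), chunks=5)
da = xr.DataArray(x)

print(isinstance(da.data, dask.array.Array))                # True: still lazy
print(isinstance(xr.DataArray(da).data, dask.array.Array))  # False on xarray 0.19.0, per this report
```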

Anything else we need to know?:

Environment:

Output of xr.show_versions():

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.11.0-40-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.19.0
pandas: 1.3.3
numpy: 1.20.3
scipy: 1.7.1
netCDF4: 1.5.6
pydap: None
h5netcdf: 0.11.0
h5py: 3.2.1
Nio: None
zarr: 2.10.1
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.6
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.09.1
distributed: 2021.09.1
matplotlib: 3.4.3
cartopy: 0.19.0.post1
seaborn: 0.11.2
numbagg: None
pint: None
setuptools: 58.0.4
pip: 21.2.4
conda: 4.10.3
pytest: 6.2.5
IPython: 7.27.0
sphinx: 4.2.0
```
Reactions: none (all counts 0; https://api.github.com/repos/pydata/xarray/issues/5983/reactions)
state_reason: completed · repo: xarray (13221727) · type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
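For readers who want to rerun the query shown at the top of this page ("4 rows where type = "issue" and user = 19226431 sorted by updated_at descending") outside Datasette, here is a sketch using Python's sqlite3 module against the schema above (the database filename github.db is an assumption):

```python
import sqlite3

# Hypothetical filename; Datasette serves a SQLite database like this one.
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT number, title, state, updated_at
    FROM issues
    WHERE "type" = 'issue' AND "user" = ?
    ORDER BY updated_at DESC
    """,
    (19226431,),
).fetchall()
for number, title, state, updated_at in rows:
    print(number, state, updated_at, title)
```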