issues


2 rows where repo = 13221727, state = "open" and user = 19226431 sorted by updated_at descending

issue 4796: Use apply_ufunc for unary funcs
id: 784042442 · node_id: MDU6SXNzdWU3ODQwNDI0NDI= · user: FabianHofmann (19226431) · state: open · locked: 0 · comments: 3 · created_at: 2021-01-12T08:56:03Z · updated_at: 2022-04-18T16:31:02Z · author_association: CONTRIBUTOR · repo: xarray (13221727) · type: issue

DataArray.clip() on a chunked array raises an assertion error as soon as the argument is itself a chunked array. With non-chunked arrays everything works as intended.

```python
import numpy as np
import xarray as xr

x = xr.DataArray(np.random.uniform(size=[100, 100])).chunk(10)
x.clip(max=x)  # raises an assertion error because `max` is chunked
```
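As the issue title suggests, routing the operation through `xr.apply_ufunc` is one possible workaround sketch (an assumption on my part, not the fix adopted in the issue): with `dask="parallelized"` the clip is applied chunk-by-chunk, so a chunked `max` argument is accepted.

```python
import numpy as np
import xarray as xr

x = xr.DataArray(np.random.uniform(size=[100, 100])).chunk(10)

# Clip each chunk independently instead of going through DataArray.clip().
clipped = xr.apply_ufunc(
    lambda a, m: np.clip(a, None, m),  # upper-bound-only clip
    x, x,
    dask="parallelized",
    output_dtypes=[x.dtype],
)
result = clipped.compute()  # clip(x, max=x) leaves x unchanged
```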

Environment:

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.9 | packaged by conda-forge | (default, Dec 9 2020, 21:08:20) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-60-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.3
xarray: 0.16.2
pandas: 1.2.0
numpy: 1.19.5
scipy: 1.5.0
netCDF4: 1.5.3
pydap: None
h5netcdf: 0.7.4
h5py: 2.10.0
Nio: None
zarr: 2.3.2
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.0
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2020.12.0
distributed: 2020.12.0
matplotlib: 3.1.3
cartopy: 0.18.0
seaborn: 0.11.0
numbagg: None
pint: None
setuptools: 49.2.1.post20200807
pip: 20.2.1
conda: 4.8.3
pytest: 6.0.1
IPython: 7.11.1
sphinx: 3.1.2
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4796/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue 5024: xr.DataArray.sum() converts string objects into unicode
id: 830040696 · node_id: MDU6SXNzdWU4MzAwNDA2OTY= · user: FabianHofmann (19226431) · state: open · locked: 0 · comments: 0 · created_at: 2021-03-12T11:47:06Z · updated_at: 2022-04-09T01:40:09Z · author_association: CONTRIBUTOR · repo: xarray (13221727) · type: issue

What happened:

When summing over all axes of a DataArray holding strings of dtype object, the result is a size-one DataArray of unicode dtype.

What you expected to happen:

I expected the summation to preserve the dtype, i.e. the resulting size-one DataArray would be of dtype object.

Minimal Complete Verifiable Example:

```python
import xarray as xr

ds = xr.DataArray('a', [range(3), range(3)]).astype(object)
ds.sum()
```

Output:

```
<xarray.DataArray ()>
array('aaaaaaaaa', dtype='<U9')
```

On the other hand, when summing over one dimension only, the dtype is preserved:

```python
ds.sum('dim_0')
```

Output:

```
<xarray.DataArray (dim_1: 3)>
array(['aaa', 'aaa', 'aaa'], dtype=object)
Coordinates:
  * dim_1    (dim_1) int64 0 1 2
```

Anything else we need to know?:

The problem becomes relevant as soon as dask enters the workflow: dask expects the aggregated DataArray to be of dtype object, and the mismatch will likely lead to errors in subsequent operations.

The behavior probably comes from creating a new DataArray after the reduction with np.sum(), which itself results in a pure Python string.
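The suspected mechanism can be reproduced with plain NumPy (a minimal sketch of the cause, not the actual xarray code path): a full reduction over an object array returns a plain Python str, and re-wrapping that scalar without an explicit dtype infers a unicode dtype.

```python
import numpy as np

# A 3x3 object array of single-character strings, mirroring the example above.
a = np.full((3, 3), 'a', dtype=object)

s = np.sum(a)                      # reduces with `+`, i.e. string concatenation
uni = np.asarray(s)                # dtype inferred as unicode ('<U9')
obj = np.asarray(s, dtype=object)  # explicit dtype keeps object
```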

Environment:

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 (default, Sep 4 2020, 07:30:14) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-66-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4
xarray: 0.16.2
pandas: 1.2.1
numpy: 1.19.5
scipy: 1.6.0
netCDF4: 1.5.5.1
pydap: None
h5netcdf: 0.7.4
h5py: 3.1.0
Nio: None
zarr: 2.3.2
cftime: 1.3.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.0
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.01.1
distributed: 2021.01.1
matplotlib: 3.3.3
cartopy: 0.18.0
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: 52.0.0.post20210125
pip: 21.0
conda: 4.9.2
pytest: 6.2.2
IPython: 7.19.0
sphinx: 3.4.3
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5024/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
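The filtered view above ("2 rows where repo = 13221727, state = \"open\" and user = 19226431 sorted by updated_at descending") corresponds to a plain SQL query against this schema. A minimal sketch against an in-memory SQLite database, with the table trimmed to just the columns the filter touches:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Trimmed-down version of the [issues] schema above.
conn.execute(
    "CREATE TABLE issues (id INTEGER PRIMARY KEY, [user] INTEGER, "
    "state TEXT, repo INTEGER, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO issues VALUES (?, ?, ?, ?, ?)",
    [
        (784042442, 19226431, "open", 13221727, "2022-04-18T16:31:02Z"),
        (830040696, 19226431, "open", 13221727, "2022-04-09T01:40:09Z"),
    ],
)
# The page's filter: by repo, state and user, newest update first.
rows = conn.execute(
    "SELECT id FROM issues WHERE repo = ? AND state = 'open' AND [user] = ? "
    "ORDER BY updated_at DESC",
    (13221727, 19226431),
).fetchall()
```
ISO-8601 timestamps stored as TEXT sort correctly with a plain `ORDER BY`, which is why the schema can get away with TEXT date columns.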
Powered by Datasette · Queries took 42.891ms · About: xarray-datasette