home / github

Menu
  • Search all tables
  • GraphQL API

issues

Table actions
  • GraphQL API for issues

2 rows where comments = 1, "updated_at" is on date 2022-03-20 and user = 20629530 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date), closed_at (date)

type 1

  • issue 2

state 1

  • closed 2

repo 1

  • xarray 2
id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1173980959 I_kwDOAMm_X85F-Ycf 6379 Dataset groupby returning DataArray broken in some cases aulemahal 20629530 closed 0     1 2022-03-18T20:07:37Z 2022-03-20T18:55:26Z 2022-03-20T18:55:26Z CONTRIBUTOR      

What happened?

Got a TypeError when resampling a dataset along a dimension, mapping a function to each group. The function returns a DataArray.

Failed with : TypeError: _overwrite_indexes() got an unexpected keyword argument 'variables'

What did you expect to happen?

This worked before the merging of #5692. A DataArray was returned as expected.

Minimal Complete Verifiable Example

```Python import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")

ds.resample(time="YS").map(lambda grp: grp.air.mean("time")) ```

Relevant log output

```Python

TypeError Traceback (most recent call last) Input In [37], in <module> ----> 1 ds.resample(time="YS").map(lambda grp: grp.air.mean("time"))

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/xarray/core/resample.py:300, in DatasetResample.map(self, func, args, shortcut, kwargs) 298 # ignore shortcut if set (for now) 299 applied = (func(ds, *args, kwargs) for ds in self._iter_grouped()) --> 300 combined = self._combine(applied) 302 return combined.rename({self._resample_dim: self._dim})

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/xarray/core/groupby.py:999, in DatasetGroupByBase._combine(self, applied) 997 index, index_vars = create_default_index_implicit(coord) 998 indexes = {k: index for k in index_vars} --> 999 combined = combined._overwrite_indexes(indexes, variables=index_vars) 1000 combined = self._maybe_restore_empty_groups(combined) 1001 combined = self._maybe_unstack(combined)

TypeError: _overwrite_indexes() got an unexpected keyword argument 'variables' ```

Anything else we need to know?

In the docstring of DatasetGroupBy.map it is not made clear that the passed function should return a dataset, but the opposite is also not said. This worked before and I think the issues comes from #5692, which introduced different signatures for DataArray._overwrite_indexes (which is called in my case) and Dataset._overwrite_indexes (which is expected by the new _combine).

If the function passed to Dataset.resample(...).map should only return Datasets then I believe a more explicit error is needed, as well as some notice in the docs and a breaking change entry in the changelog. If DataArrays should be accepted, then we have a regression here.

I may have time to help on this.

Environment

INSTALLED VERSIONS ------------------ commit: None python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 5.16.13-arch1-1 machine: x86_64 processor: byteorder: little LC_ALL: None LANG: fr_CA.utf8 LOCALE: ('fr_CA', 'UTF-8') libhdf5: 1.12.0 libnetcdf: 4.7.4 xarray: 2022.3.1.dev16+g3ead17ea pandas: 1.4.0 numpy: 1.20.3 scipy: 1.7.1 netCDF4: 1.5.7 pydap: None h5netcdf: 0.11.0 h5py: 3.4.0 Nio: None zarr: 2.10.0 cftime: 1.5.0 nc_time_axis: 1.3.1 PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2021.08.0 distributed: 2021.08.0 matplotlib: 3.4.3 cartopy: None seaborn: None numbagg: None fsspec: 2021.07.0 cupy: None pint: 0.18 sparse: None setuptools: 57.4.0 pip: 21.2.4 conda: None pytest: 6.2.5 IPython: 8.0.1 sphinx: 4.1.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6379/reactions",
    "total_count": 2,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
  completed xarray 13221727 issue
1173997225 I_kwDOAMm_X85F-cap 6380 Attributes of concatenation coordinate are dropped aulemahal 20629530 closed 0     1 2022-03-18T20:31:17Z 2022-03-20T18:53:46Z 2022-03-20T18:53:46Z CONTRIBUTOR      

What happened?

When concatenating two objects with xr.concat along a new dimension given through a DataArray, the attributes of this given coordinate are lost in the concatenation.

What did you expect to happen?

I expected the concatenation coordinate to be identical to the 1D DataArray I gave to concat.

Minimal Complete Verifiable Example

```Python import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")

concat_dim = xr.DataArray([1, 2], dims=("condim",), attrs={"an_attr": "yep"}, name="condim")

out = xr.concat([ds, ds], concat_dim) out.condim.attrs ```

Before #5692, I get: {'an_attr': 'yep'} with the current master, I get: {}

Anything else we need to know?

I'm not 100% sure, but I think the change is due to xr.core.concat._calc_concat_dim_coord being replaced by xr.core.concat.__calc_concat_dim_index. The former didn't touch the concatenation coordinate, while the latter casts it as an index, thus dropping the attributes in the process.

If the solution is to add a check in xr.concat, I may have time to implement something simple.

Environment

INSTALLED VERSIONS ------------------ commit: None python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 5.16.13-arch1-1 machine: x86_64 processor: byteorder: little LC_ALL: None LANG: fr_CA.utf8 LOCALE: ('fr_CA', 'UTF-8') libhdf5: 1.12.0 libnetcdf: 4.7.4 xarray: 2022.3.1.dev16+g3ead17ea pandas: 1.4.0 numpy: 1.20.3 scipy: 1.7.1 netCDF4: 1.5.7 pydap: None h5netcdf: 0.11.0 h5py: 3.4.0 Nio: None zarr: 2.10.0 cftime: 1.5.0 nc_time_axis: 1.3.1 PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2021.08.0 distributed: 2021.08.0 matplotlib: 3.4.3 cartopy: None seaborn: None numbagg: None fsspec: 2021.07.0 cupy: None pint: 0.18 sparse: None setuptools: 57.4.0 pip: 21.2.4 conda: None pytest: 6.2.5 IPython: 8.0.1 sphinx: 4.1.2
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6380/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed xarray 13221727 issue

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
Powered by Datasette · Queries took 63.436ms · About: xarray-datasette