issues: 553930127

title: reduce on groupby auto-adds axis argument and complains when axis argument is specified
number: 3717
id: 553930127
node_id: MDU6SXNzdWU1NTM5MzAxMjc=
user: 6063709
state: open
locked: 0
comments: 3
created_at: 2020-01-23T04:29:58Z
updated_at: 2022-04-06T15:38:59Z
author_association: CONTRIBUTOR

The behaviour of reduce appears to have changed in recent versions of xarray such that previous code that worked now throws errors.

MCVE Code Sample

I have repurposed someone else's nice code sample for this, thanks!

```python
import pandas as pd
import xarray as xr
import numpy as np

s_date = '1990-01-01'
e_date = '2019-05-01'
days = pd.date_range(start=s_date, end=e_date, freq='B', name='time')
items = pd.Index([str(i) for i in range(300)], name='item')
dat = xr.DataArray(np.random.rand(len(days), len(items)), coords=[days, items])

print(dat)


def simplesum(array, axis):
    # Print the axis that reduce passes in, then sum along it.
    print(axis)
    return np.sum(array, axis)


# First call: no axis argument given.
dat.groupby('time.month').reduce(simplesum)
# Second call: explicit axis argument.
dat.groupby('time.month').reduce(simplesum, axis=0)
```

The reduce appears to insert an axis argument if none is specified. This is the output of the first groupby operation, with no axis argument:

    0
    0
    0
    0
    0
    0
    0
    0
    0
    0
    0
    0
    Out[41]:
    <xarray.DataArray (month: 12, item: 300)>
    array([[330.18949303, 336.97901528, 337.80472647, ..., 322.37053342,
            326.84789948, 342.22782336],
           [300.3301059 , 307.79967902, 322.53148357, ..., 310.20975273,
            291.04344738, 310.56010997],
           [325.71587689, 337.25153307, 331.35493521, ..., 332.43547569,
            328.23330226, 326.43909063],
           ...,
           [322.96255713, 321.44723754, 312.59983716, ..., 318.79682437,
            315.81592617, 314.27316547],
           [294.29894222, 291.77253983, 310.85452639, ..., 314.0461447 ,
            298.99012623, 326.08321702],
           [323.6778518 , 332.71638634, 324.47244831, ..., 326.82774826,
            322.09233181, 327.6385762 ]])
    Coordinates:
      * item     (item) object '0' '1' '2' '3' '4' ... '295' '296' '297' '298' '299'
      * month    (month) int64 1 2 3 4 5 6 7 8 9 10 11 12

The second groupby, with the axis=0 argument, throws an error:

```python
ValueError                                Traceback (most recent call last)
<ipython-input-42-381dec6862e6> in <module>
----> 1 dat.groupby('time.month').reduce(simplesum, axis=0)

/g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/groupby.py in reduce(self, func, dim, axis, keep_attrs, shortcut, **kwargs)
    836         check_reduce_dims(dim, self.dims)
    837
--> 838         return self.map(reduce_array, shortcut=shortcut)
    839
    840

/g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/groupby.py in map(self, func, shortcut, args, **kwargs)
    755             grouped = self._iter_grouped()
    756         applied = (maybe_wrap_array(arr, func(arr, *args, **kwargs)) for arr in grouped)
--> 757         return self._combine(applied, shortcut=shortcut)
    758
    759     def apply(self, func, shortcut=False, args=(), **kwargs):

/g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/groupby.py in _combine(self, applied, restore_coord_dims, shortcut)
    774     def _combine(self, applied, restore_coord_dims=False, shortcut=False):
    775         """Recombine the applied objects like the original."""
--> 776         applied_example, applied = peek_at(applied)
    777         coord, dim, positions = self._infer_concat_args(applied_example)
    778         if shortcut:

/g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/utils.py in peek_at(iterable)
    180     """
    181     gen = iter(iterable)
--> 182     peek = next(gen)
    183     return peek, itertools.chain([peek], gen)
    184

/g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/groupby.py in <genexpr>(.0)
    754         else:
    755             grouped = self._iter_grouped()
--> 756         applied = (maybe_wrap_array(arr, func(arr, *args, **kwargs)) for arr in grouped)
    757         return self._combine(applied, shortcut=shortcut)
    758

/g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/groupby.py in reduce_array(ar)
    832
    833         def reduce_array(ar):
--> 834             return ar.reduce(func, dim, axis, keep_attrs=keep_attrs, **kwargs)
    835
    836         check_reduce_dims(dim, self.dims)

/g/data3/hh5/public/apps/miniconda3/envs/analysis3-20.01/lib/python3.7/site-packages/xarray/core/variable.py in reduce(self, func, dim, axis, keep_attrs, keepdims, allow_lazy, **kwargs)
   1511             dim = None
   1512         if dim is not None and axis is not None:
-> 1513             raise ValueError("cannot supply both 'axis' and 'dim' arguments")
   1514
   1515         if dim is not None:

ValueError: cannot supply both 'axis' and 'dim' arguments
```
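
Reading the traceback, the failure looks like an interaction between two layers: the groupby-level reduce forwards both dim and axis to the per-group reduce, which rejects the combination. The following is a minimal pure-Python sketch of that interaction. It is not xarray's code: the function names, the `group_dim` parameter, and in particular the step that fills in dim when it is omitted are assumptions made for illustration.

```python
# Simplified stand-ins for the two reduce methods seen in the traceback.
# These are NOT xarray's implementations; names and defaults are assumptions.


def variable_reduce(func, dim=None, axis=None):
    """Mimic the check that raises at variable.py line 1513 in the traceback."""
    if dim is not None and axis is not None:
        raise ValueError("cannot supply both 'axis' and 'dim' arguments")
    return "reduced"


def groupby_reduce(func, dim=None, axis=None, group_dim="time"):
    """Mimic the groupby-level reduce forwarding dim and axis to each group."""
    if dim is None:
        # Assumption: the grouped dimension is filled in when dim is omitted.
        dim = group_dim
    return variable_reduce(func, dim, axis)


# dim is filled in and axis stays None, so there is no conflict.
print(groupby_reduce(sum))

# dim is filled in AND axis is given, so the inner check raises.
try:
    groupby_reduce(sum, axis=0)
except ValueError as err:
    print(err)
```

If that assumption holds, any explicit axis reaches the inner check alongside a non-None dim, which would explain why even axis=0 fails.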

Expected Output

I would expect the output of both groupby operations to be the same. In addition, the reduce documentation says the input should be flattened if neither a dim nor an axis argument is supplied, but that does not appear to happen.

The second groupby, with the axis=0 argument, works with older versions of xarray (0.13.0).
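
For reference, the difference between flattening the input and reducing along a single axis can be seen with plain NumPy. This small sketch uses a made-up 2x3 array and is independent of xarray:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# axis=None (the default): reduce over every element, as if flattened.
total = np.sum(a)

# axis=0: reduce along the first axis only, keeping the second.
per_column = np.sum(a, axis=0)

print(total)       # 15
print(per_column)  # [3 5 7]
```

The first groupby call in the report prints 0 for every group, which matches the axis=0 form rather than the flattened one.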

Problem Description

It is impossible to specify an axis argument to reduce on a groupby object. The reduction defaults to axis=0, and passing axis explicitly (even axis=0, as in the example above) throws an error.
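
One possible way to sidestep the conflict is to name the dimension to reduce instead of passing axis, so that only dim is supplied. The sketch below uses a smaller made-up dataset than the report (two years, three items). That this works on the affected version (0.14.1) is an assumption and is not confirmed anywhere in this report.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Smaller stand-in for the data in the report (sizes are arbitrary).
days = pd.date_range("2000-01-01", "2001-12-31", freq="B", name="time")
items = pd.Index([str(i) for i in range(3)], name="item")
dat = xr.DataArray(np.random.rand(len(days), len(items)), coords=[days, items])


def simplesum(array, axis):
    # reduce translates `dim` into the matching axis before calling this.
    return np.sum(array, axis=axis)


# Pass the dimension name rather than an axis number.
result = dat.groupby("time.month").reduce(simplesum, dim="time")
print(result.sizes)
```

The result should have one row per calendar month and one column per item, the same layout as the output of the first call in the report.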

Output of xr.show_versions()

Version used and produces error:

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 22:33:48) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-80.11.2.el8_0.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_AU.utf8
LANG: en_AU.ISO8859-1
LOCALE: en_AU.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.1

xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: installed
h5netcdf: 0.7.4
h5py: 2.10.0
Nio: 1.5.5
zarr: 2.4.0
cftime: 1.0.3.4
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: 1.1.1
cfgrib: 0.9.7.6
iris: 2.3.0
bottleneck: 1.3.1
dask: 2.9.2
distributed: 2.9.3
matplotlib: 2.2.4
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 45.0.0.post20200113
pip: 19.3.1
conda: None
pytest: 5.3.4
IPython: 7.11.1
sphinx: None
None

Version that does not throw an error when the axis argument is supplied:

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.7 | packaged by conda-forge | (default, Jul 2 2019, 02:18:42) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-80.11.2.el8_0.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_AU.utf8
LANG: en_AU.ISO8859-1
LOCALE: en_AU.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.13.0
pandas: 0.25.1
numpy: 1.17.2
scipy: 1.2.1
netCDF4: 1.5.1.2
pydap: installed
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: 1.5.5
zarr: 2.3.2
cftime: 1.0.3.4
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.7.2
iris: 2.2.1dev0
bottleneck: 1.2.1
dask: 2.4.0
distributed: 2.4.0
matplotlib: 2.2.4
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 41.2.0
pip: 19.2.3
conda: None
pytest: 5.1.2
IPython: 7.8.0
sphinx: None
None

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3717/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727
type: issue
