issues: 2064544219

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
2064544219 I_kwDOAMm_X857DnHb 8583 Unexpected Dataset aggregation behavior when weighting 3169620 open 0     1 2024-01-03T19:32:11Z 2024-01-04T19:12:58Z   CONTRIBUTOR      

What happened?

When a weighting is applied to a Dataset aggregation, variables that do not have the aggregation dimensions are nevertheless aggregated, presumably because the weights get broadcast across those variables. Perhaps this is the intended behavior, but it seems surprising to me and I think it should at least be documented.

What did you expect to happen?

When aggregating a dataset over specified dimensions, I don't expect variables that lack those dimensions to be aggregated.
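The broadcasting explanation can be checked by hand. The sketch below assumes that a weighted sum behaves like "multiply by the weights, then sum", which is a simplification of what xarray does internally (it ignores NaN handling, for example). The variable names are illustrative and are not taken from the xarray source.

```python
import numpy as np
import xarray as xr

# A variable with only a time dimension, and weights defined on (x, y, time).
precipitation = xr.DataArray(np.ones(3), dims=["time"])
weights = xr.DataArray(np.ones((2, 2, 3)), dims=["x", "y", "time"])

# Multiplying broadcasts precipitation against the weights, so the product
# gains the x and y dimensions that precipitation itself never had.
product = precipitation * weights
print(sorted(product.dims))

# Summing the broadcast product over x and y adds up 2 * 2 = 4 copies
# of each time step.
manual = product.sum(["x", "y"])
print(manual.values)  # [4. 4. 4.]
```

If this assumption holds, it would account for the `[4. 4. 4.]` reported for `precipitation` in the weighted case.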

Minimal Complete Verifiable Example

```Python
import xarray as xr
import numpy as np

var1 = np.ones((2, 2, 3))
var2 = np.ones((3))

lon = np.arange(4).reshape(2, 2)
lat = np.arange(4).reshape(2, 2)

ds = xr.Dataset(
    {
        "temperature": (["x", "y", "time"], var1),
        "precipitation": (["time"], var2),
    },
    coords={
        "lon": (["x", "y"], lon),
        "lat": (["x", "y"], lat),
        "time": np.arange(3),
    },
)

print(ds.sum(['x', 'y']))
# Precipitation (with no x or y dimension) is not summed over, leading to values [1. 1. 1.]

print(ds.weighted(xr.ones_like(ds['temperature'])).sum(['x', 'y']))
# Precipitation is now summed over, leading to values [4. 4. 4.]
```
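One possible workaround, sketched here under stated assumptions rather than offered as a fix in xarray itself, is to apply the weights only to variables that contain every aggregation dimension and to pass the remaining variables through untouched. The helper name `weighted_sum_matching_dims` is hypothetical. Note that a variable holding only some of the dimensions is also passed through, which differs from what `ds.sum` would do with it.

```python
import numpy as np
import xarray as xr


def weighted_sum_matching_dims(ds, weights, dims):
    """Hypothetical helper: weight only variables that contain all of dims."""
    dims = list(dims)
    has_dims = [
        name for name, var in ds.data_vars.items() if set(dims) <= set(var.dims)
    ]
    lacks_dims = [name for name in ds.data_vars if name not in has_dims]

    # Weighted reduction for the variables that actually span the dimensions.
    reduced = ds[has_dims].weighted(weights).sum(dims)

    # Everything else is returned unchanged (an assumption of this sketch).
    return xr.merge([reduced, ds[lacks_dims]])


# Same dataset as in the example above.
ds = xr.Dataset(
    {
        "temperature": (["x", "y", "time"], np.ones((2, 2, 3))),
        "precipitation": (["time"], np.ones(3)),
    },
    coords={
        "lon": (["x", "y"], np.arange(4).reshape(2, 2)),
        "lat": (["x", "y"], np.arange(4).reshape(2, 2)),
        "time": np.arange(3),
    },
)

result = weighted_sum_matching_dims(
    ds, xr.ones_like(ds["temperature"]), ["x", "y"]
)
print(result["temperature"].values)
print(result["precipitation"].values)
```

With unit weights, `temperature` is summed over `x` and `y` while `precipitation` keeps its original values, which is the behavior the unweighted `ds.sum(['x', 'y'])` shows for this dataset.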

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.16 | packaged by conda-forge | (main, Feb 1 2023, 21:38:11) [Clang 14.0.6 ]
python-bits: 64
OS: Darwin
OS-release: 23.1.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: None
LANG: None
LOCALE: (None, 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.1

xarray: 2023.3.0
pandas: 1.5.3
numpy: 1.23.5
scipy: 1.10.1
netCDF4: 1.6.3
pydap: None
h5netcdf: None
h5py: 3.8.0
Nio: None
zarr: 2.14.2
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.6
cfgrib: None
iris: 3.4.1
bottleneck: None
dask: 2023.3.2
distributed: 2023.3.2.1
matplotlib: 3.7.1
cartopy: 0.21.1
seaborn: 0.12.2
numbagg: None
fsspec: 2023.10.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 67.6.1
pip: 23.0.1
conda: None
pytest: None
mypy: None
IPython: 8.12.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8583/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    13221727 issue