
issues: 1642299599

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1642299599 I_kwDOAMm_X85h44DP 7683 automatically chunk in groupby binary ops 2448579 closed 0     0 2023-03-27T15:14:09Z 2023-07-27T16:41:35Z 2023-07-27T16:41:34Z MEMBER      

What happened?

From https://discourse.pangeo.io/t/xarray-unable-to-allocate-memory-how-to-size-up-problem/3233/4

Consider

```python
# ds is a dataset with big dask arrays
mean = ds.groupby("time.day").mean()
mean.to_netcdf()
mean = xr.open_dataset(...)

ds.groupby("time.day") - mean
```

In `GroupBy._binary_op` (https://github.com/pydata/xarray/blob/39caafae4452f5327a7cd671b18d4bb3eb3785ba/xarray/core/groupby.py#L616), we eagerly construct an expanded `other` that is the same size as `ds`.
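
To illustrate why this expansion is costly, here is a minimal NumPy-only sketch. It is not xarray's actual implementation; the array sizes and the names `day_of_month`, `group_mean`, and `expanded` are hypothetical stand-ins. It only shows that indexing the small per-group result with the group labels materializes an array as large as the original data.

```python
import numpy as np

# Hypothetical small stand-in for the big dataset: 365 time steps x 4 points.
rng = np.random.default_rng(0)
data = rng.normal(size=(365, 4))

# Group label for each time step (31 groups, labelled 0..30).
day_of_month = np.arange(365) % 31

# Per-group mean, shape (31, 4): small compared with `data`.
group_mean = np.stack(
    [data[day_of_month == d].mean(axis=0) for d in range(31)]
)

# Expanding the per-group values back along time uses vectorized indexing.
# With NumPy-backed data this allocates a full-size array immediately.
expanded = group_mean[day_of_month]

# The binary op itself then runs against the full-size array.
anomaly = data - expanded

print(group_mean.shape, expanded.shape, expanded.nbytes == data.nbytes)
```

In this toy case the expanded array is tiny, but for a dataset that does not fit in memory the same step would need an allocation the size of the whole dataset.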

What did you expect to happen?

I think the only solution is to automatically chunk `other` when `ds` has dask arrays and `other` (or `mean`) isn't backed by dask arrays. A chunk size of 1 seems sensible.
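
Until something like this is automatic, the same effect can probably be had by chunking manually before the binary op. The following is a rough sketch, assuming `dask` is installed; the variable `a`, the dimension `x`, the array sizes, and the name `mean_chunked` are made up for illustration. The call to `.compute()` stands in for the in-memory `mean` that results from the netCDF round trip above.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the large dask-backed dataset.
time = pd.date_range("2001-01-01", periods=365, freq="D")
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {"a": (("time", "x"), rng.normal(size=(365, 4)))},
    coords={"time": time},
).chunk({"time": 50})

# In-memory (not dask-backed) group means, as after reloading from disk.
mean = ds.groupby("time.day").mean().compute()

# Manual version of the proposal: chunk `mean` along the group dimension
# with a chunk size of 1 so the expansion along time can stay lazy.
mean_chunked = mean.chunk({"day": 1})

anomaly = ds.groupby("time.day") - mean_chunked

# Non-None chunks means the result is still backed by dask.
print(anomaly["a"].chunks is not None)
```

Whether a chunk size of 1 is the best choice in practice likely depends on the number of groups and the size of the other dimensions; it is used here only because it is the value suggested above.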

Minimal Complete Verifiable Example

No response

MVCE confirmation

  • [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7683/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue
