issues

1 row where state = "open", type = "issue" and user = 3383837 sorted by updated_at descending

  • id: 2133533727
  • node_id: I_kwDOAMm_X85_KyQf
  • number: 8745
  • title: map_blocks raises AssertionError given chunks along a multiindex
  • user: itcarroll (3383837)
  • state: open
  • locked: 0
  • comments: 0
  • created_at: 2024-02-14T04:48:41Z
  • updated_at: 2024-02-26T05:59:34Z
  • author_association: CONTRIBUTOR
  • assignee, milestone, closed_at, active_lock_reason, draft, pull_request, performed_via_github_app, state_reason: (empty)

What happened?

To parallelize a computation over a sparse array, I want to create chunks after stacking the array and dropping fill values. Such an array has a multiindex, which breaks the map_blocks method (with an unhelpful error). The AssertionError raised suggests to me that the method does not account for an array having (coordinate) variables that are neither dimensions nor scalars.

What did you expect to happen?

I expect the MCVE below to return its input unmodified.

Minimal Complete Verifiable Example

```Python
import xarray as xr

a = xr.DataArray([[0, 1], [2, 3]], {"x": [0, 1], "y": [0, 1]})
a = a.stack({"n": ("x", "y")})
a = a.chunk({"n": 2})
a.map_blocks(lambda x: x)
```

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
  • [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

```Python
File ~/tmp/bug/venv/lib/python3.11/site-packages/xarray/core/dataarray.py:5526, in DataArray.map_blocks(self, func, args, kwargs, template)
   5428 """
   5429 Apply a function to each block of this DataArray.
   5430
   (...)
   5522     month    (time) int64 dask.array<chunksize=(24,), meta=np.ndarray>
   5523 """
   5524 from xarray.core.parallel import map_blocks
-> 5526 return map_blocks(func, self, args, kwargs, template)

File ~/tmp/bug/venv/lib/python3.11/site-packages/xarray/core/parallel.py:539, in map_blocks(func, obj, args, kwargs, template)
    535 for chunk_tuple in itertools.product(*ichunk.values()):
    536     # mapping from dimension name to chunk index
    537     chunk_index = dict(zip(ichunk.keys(), chunk_tuple))
--> 539     blocked_args = [
    540         subset_dataset_to_block(graph, gname, arg, input_chunk_bounds, chunk_index)
    541         if isxr
    542         else arg
    543         for isxr, arg in zip(is_xarray, npargs)
    544     ]
    546     # raise nice error messages in _wrapper
    547     expected: ExpectedDict = {
    548         # input chunk 0 along a dimension maps to output chunk 0 along the same dimension
    549         # even if length of dimension is changed by the applied function
   (...)
    563         },
    564     }

File ~/tmp/bug/venv/lib/python3.11/site-packages/xarray/core/parallel.py:540, in <listcomp>(.0)
    535 for chunk_tuple in itertools.product(*ichunk.values()):
    536     # mapping from dimension name to chunk index
    537     chunk_index = dict(zip(ichunk.keys(), chunk_tuple))
    539     blocked_args = [
--> 540         subset_dataset_to_block(graph, gname, arg, input_chunk_bounds, chunk_index)
    541         if isxr
    542         else arg
    543         for isxr, arg in zip(is_xarray, npargs)
    544     ]
    546     # raise nice error messages in _wrapper
    547     expected: ExpectedDict = {
    548         # input chunk 0 along a dimension maps to output chunk 0 along the same dimension
    549         # even if length of dimension is changed by the applied function
   (...)
    563         },
    564     }

File ~/tmp/bug/venv/lib/python3.11/site-packages/xarray/core/parallel.py:195, in subset_dataset_to_block(graph, gname, dataset, input_chunk_bounds, chunk_index)
    190     graph[chunk_variable_task] = (
    191         tuple,
    192         [variable.dims, chunk, variable.attrs],
    193     )
    194 else:
--> 195     assert name in dataset.dims or variable.ndim == 0
    197     # non-dask array possibly with dimensions chunked on other variables
    198     # index into variable appropriately
    199     subsetter = {
    200         dim: _get_chunk_slicer(dim, chunk_index, input_chunk_bounds)
    201         for dim in variable.dims
    202     }

AssertionError:

```
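
The assertion at parallel.py:195 in the traceback above requires each non-dask variable either to be named after a dimension or to be a scalar. As a rough illustration (not part of the original report), the sketch below evaluates that same condition on the coordinates of the stacked array from the MCVE, without involving dask. It assumes a recent xarray (2022.06 or later) in which multiindex levels are exposed as coordinates; the name `passes_check` is made up for this example.

```python
import xarray as xr

# Same data as the MCVE, with dims spelled out explicitly.
a = xr.DataArray(
    [[0, 1], [2, 3]],
    coords={"x": [0, 1], "y": [0, 1]},
    dims=("x", "y"),
)
stacked = a.stack(n=("x", "y"))

# Evaluate the condition from the failing assert for every coordinate:
#     name in dataset.dims or variable.ndim == 0
# `passes_check` is a hypothetical name used only in this sketch.
passes_check = {
    name: (name in stacked.dims or coord.ndim == 0)
    for name, coord in stacked.coords.items()
}

for name, ok in passes_check.items():
    print(f"{name}: dims={stacked.coords[name].dims}, passes={ok}")
```

With the versions listed in the report, I would expect `n` to pass while the level coordinates `x` and `y` (one-dimensional along `n`, but not named after a dimension) do not, which is consistent with the AssertionError.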

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.11.6 (main, Nov 2 2023, 04:52:24) [Clang 14.0.3 (clang-1403.0.22.14.1)]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.9.3-development

xarray: 2024.1.1
pandas: 2.2.0
numpy: 1.26.4
scipy: 1.11.4
netCDF4: 1.6.5
pydap: None
h5netcdf: 1.3.0
h5py: 3.10.0
Nio: None
zarr: 2.16.1
cftime: 1.6.3
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.2.0
distributed: None
matplotlib: 3.8.2
cartopy: 0.22.0
seaborn: None
numbagg: None
fsspec: 2024.2.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 68.2.2
pip: 24.0
conda: None
pytest: None
mypy: None
IPython: 8.18.1
sphinx: None

  • reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8745/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  • repo: xarray (13221727)
  • type: issue

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
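
The row filter stated at the top of this page (state = "open", type = "issue" and user = 3383837, sorted by updated_at descending) can be tried against this schema with Python's sqlite3 module. This is only a minimal sketch under assumptions: the table is a trimmed copy of the schema above, the SQL text is an assumed equivalent of the page's query (the page does not show the SQL it runs), and the second inserted row is hypothetical filler so the filter has something to exclude.

```python
import sqlite3

# Trimmed copy of the [issues] schema: only the columns this query touches.
SCHEMA = """
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER,
   [state] TEXT,
   [updated_at] TEXT,
   [repo] INTEGER,
   [type] TEXT
);
CREATE INDEX [idx_issues_user] ON [issues] ([user]);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

conn.executemany(
    "INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        # The row shown on this page.
        (2133533727, 8745,
         "map_blocks raises AssertionError given chunks along a multiindex",
         3383837, "open", "2024-02-26T05:59:34Z", 13221727, "issue"),
        # Hypothetical row (made up) that the filter should exclude.
        (1, 1, "hypothetical closed issue",
         3383837, "closed", "2020-01-01T00:00:00Z", 13221727, "issue"),
    ],
)

# Assumed equivalent of the page's filter; ISO 8601 timestamps stored as
# TEXT sort correctly as plain strings.
QUERY = """
SELECT [number], [title], [updated_at]
FROM [issues]
WHERE [state] = :state AND [type] = :type AND [user] = :user
ORDER BY [updated_at] DESC
"""
rows = conn.execute(
    QUERY, {"state": "open", "type": "issue", "user": 3383837}
).fetchall()

for number, title, updated_at in rows:
    print(number, updated_at, title)
```

Named parameters keep the filter values out of the SQL string, and the index on [user] mirrors idx_issues_user from the schema above.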
Powered by Datasette · About: xarray-datasette