home / github / issues

Menu
  • Search all tables
  • GraphQL API

issues: 895842334

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
895842334 MDU6SXNzdWU4OTU4NDIzMzQ= 5346 Fast-track unstack doesn't work with dask 20629530 closed 0     6 2021-05-19T20:14:26Z 2021-05-26T07:07:17Z 2021-05-26T07:07:17Z CONTRIBUTOR      

What happened: Using unstack on data with the dask backend fails with a dask error.

What you expected to happen: No failure, as with xarray 0.18.0 and earlier.

Minimal Complete Verifiable Example:

```python import pandas as pd import xarray as xr

da = xr.DataArray([1] * 4, dims=('x',), coords={'x': [1, 2, 3, 4]}) dac = da.chunk()

ind = pd.MultiIndex.from_arrays(([0, 0, 1, 1], [0, 1, 0, 1]), names=("y", "z")) dac.assign_coords(x=ind).unstack("x")") Fails with:python


NotImplementedError Traceback (most recent call last) <ipython-input-4-3c317738ec05> in <module> 3 4 ind = pd.MultiIndex.from_arrays(([0, 0, 1, 1], [0, 1, 0, 1]), names=("y", "z")) ----> 5 dac.assign_coords(x=ind).unstack("x")

~/Python/myxarray/xarray/core/dataarray.py in unstack(self, dim, fill_value, sparse) 2133 DataArray.stack 2134 """ -> 2135 ds = self._to_temp_dataset().unstack(dim, fill_value, sparse) 2136 return self._from_temp_dataset(ds) 2137

~/Python/myxarray/xarray/core/dataset.py in unstack(self, dim, fill_value, sparse) 4038 ): 4039 # Fast unstacking path: -> 4040 result = result._unstack_once(dim, fill_value) 4041 else: 4042 # Slower unstacking path, examples of array types that

~/Python/myxarray/xarray/core/dataset.py in unstack_once(self, dim, fill_value) 3914 fill_value = fill_value 3915 -> 3916 variables[name] = var.unstack_once( 3917 index=index, dim=dim, fill_value=fill_value 3918 )

~/Python/myxarray/xarray/core/variable.py in _unstack_once(self, index, dim, fill_value) 1605 # sparse doesn't support item assigment, 1606 # https://github.com/pydata/sparse/issues/114 -> 1607 data[(..., *indexer)] = reordered 1608 1609 return self._replace(dims=new_dims, data=data)

~/.conda/envs/xxx/lib/python3.8/site-packages/dask/array/core.py in setitem(self, key, value) 1693 1694 out = "setitem-" + tokenize(self, key, value) -> 1695 dsk = setitem_array(out, self, key, value) 1696 1697 graph = HighLevelGraph.from_collections(out, dsk, dependencies=[self])

~/.conda/envs/xxx/lib/python3.8/site-packages/dask/array/slicing.py in setitem_array(out_name, array, indices, value) 1787 1788 # Reformat input indices -> 1789 indices, indices_shape, reverse = parse_assignment_indices(indices, array_shape) 1790 1791 # Empty slices can only be assigned size 1 values

~/.conda/envs/xxx/lib/python3.8/site-packages/dask/array/slicing.py in parse_assignment_indices(indices, shape) 1476 n_lists += 1 1477 if n_lists > 1: -> 1478 raise NotImplementedError( 1479 "dask is currently limited to at most one " 1480 "dimension's assignment index being a "

NotImplementedError: dask is currently limited to at most one dimension's assignment index being a 1-d array of integers or booleans. Got: (Ellipsis, array([0, 0, 1, 1], dtype=int8), array([0, 1, 0, 1], dtype=int8)) ``` The example works when I go back to xarray 0.18.0.

Anything else we need to know?: I saw no tests in "test_daraarray.py" and "test_dataset.py" for unstack+dask, but they might be elsewhere? If #5315 was successful, maybe there is something specific in my example and config that is causing the error? @max-sixty @Illviljan

Proposed test, for "test_dataset.py", adapted copy of test_unstack: python @requires_dask def test_unstack_dask(self): index = pd.MultiIndex.from_product([[0, 1], ["a", "b"]], names=["x", "y"]) ds = Dataset({"b": ("z", [0, 1, 2, 3]), "z": index}).chunk() expected = Dataset( {"b": (("x", "y"), [[0, 1], [2, 3]]), "x": [0, 1], "y": ["a", "b"]} ) for dim in ["z", ["z"], None]: actual = ds.unstack(dim).load() assert_identical(actual, expected)

Environment:

Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27) [GCC 9.3.0] python-bits: 64 OS: Linux OS-release: 5.11.16-arch1-1 machine: x86_64 processor: byteorder: little LC_ALL: None LANG: fr_CA.utf8 LOCALE: ('fr_CA', 'UTF-8') libhdf5: 1.10.6 libnetcdf: 4.7.4 xarray: 0.18.2.dev2+g6d2a7301 pandas: 1.2.4 numpy: 1.20.2 scipy: 1.6.3 netCDF4: 1.5.6 pydap: installed h5netcdf: 0.11.0 h5py: 3.2.1 Nio: None zarr: 2.8.1 cftime: 1.4.1 nc_time_axis: 1.2.0 PseudoNetCDF: installed rasterio: 1.2.2 cfgrib: 0.9.9.0 iris: 2.4.0 bottleneck: 1.3.2 dask: 2021.05.0 distributed: 2021.05.0 matplotlib: 3.4.1 cartopy: 0.19.0 seaborn: 0.11.1 numbagg: installed pint: 0.17 setuptools: 49.6.0.post20210108 pip: 21.1 conda: None pytest: 6.2.3 IPython: 7.22.0 sphinx: 3.5.4
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5346/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue

Links from other tables

  • 0 rows from issues_id in issues_labels
  • 6 rows from issue in issue_comments
Powered by Datasette · Queries took 1.022ms · About: xarray-datasette