issues: 733077617

id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at	closed_at	author_association	active_lock_reason	draft	pull_request	body	reactions	performed_via_github_app	state_reason	repo	type
733077617	MDU6SXNzdWU3MzMwNzc2MTc=	4555	Vectorized indexing (isel) of chunked data with 1D indices gives weird chunks	4160723	open	0			1	2020-10-30T10:55:33Z	2021-03-02T17:36:48Z		MEMBER				What happened: Applying `.isel()` on a DataArray or Dataset with chunked data using 1-d indices (either stored in a `xarray.Variable` or a `numpy.ndarray`) gives weird chunks (i.e., a lot of chunks with small sizes). What you expected to happen: More consistent chunk sizes. Minimal Complete Verifiable Example: Let's create a chunked DataArray ```python In [1]: import numpy as np In [2]: import xarray as xr In [3]: da = xr.DataArray(np.random.rand(100), dims='points').chunk(50) In [4]: da Out[4]: <xarray.DataArray (points: 100)> dask.array<xarray-<this-array>, shape=(100,), dtype=float64, chunksize=(50,), chunktype=numpy.ndarray> Dimensions without coordinates: points ``` Selecting random indices results in a lot of small chunks ```python In [5]: indices = xr.Variable('nodes', np.random.choice(np.arange(100, dtype='int'), size=10)) In [6]: da_sel = da.isel(points=indices) In [7]: da_sel.chunks Out[7]: ((1, 1, 3, 1, 1, 3),) ``` What I would expect: ```python In [8]: da.data.vindex[indices.data].chunks Out[8]: ((10,),) ``` This works fine with 2+ dimensional indexers, e.g., ```python In [9]: indices_2d = xr.Variable(('x', 'y'), np.random.choice(np.arange(100), size=(10, 10))) In [10]: da_sel_2d = da.isel(points=indices_2d) In [11]: da_sel_2d.chunks Out[11]: ((10,), (10,)) ``` Anything else we need to know?: I suspect the issue is here: https://github.com/pydata/xarray/blob/063606b90946d869e90a6273e2e18ed24bffb052/xarray/core/variable.py#L616-L617 In the example above I think we still want vectorized indexing (i.e., call `dask.array.Array.vindex[]` instead of `dask.array.Array[]`). 
Environment: Output of <tt>xr.show_versions()</tt> INSTALLED VERSIONS ------------------ commit: None python: 3.8.3 | packaged by conda-forge | (default, Jun 1 2020, 17:21:09) [Clang 9.0.1 ] python-bits: 64 OS: Darwin OS-release: 18.7.0 machine: x86_64 processor: i386 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.UTF-8 libhdf5: None libnetcdf: None xarray: 0.16.1 pandas: 1.1.3 numpy: 1.19.1 scipy: 1.5.2 netCDF4: None pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: None nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.19.0 distributed: 2.25.0 matplotlib: 3.3.1 cartopy: None seaborn: None numbagg: None pint: None setuptools: 47.3.1.post20200616 pip: 20.1.1 conda: None pytest: 5.4.3 IPython: 7.16.1 sphinx: 3.2.1	{ "url": "https://api.github.com/repos/pydata/xarray/issues/4555/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }			13221727	issue
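
Until `.isel()` itself is changed, two interim workarounds seem possible. The sketch below is not from the report and not a proposed fix for `variable.py`; it reuses the report's setup, and the names `da_rechunked` and `da_vindexed` are made up for illustration. The first option merges the fragments after the fact, the second calls `vindex` directly as in `In [8]` and re-wraps the result by hand.

```python
import numpy as np
import xarray as xr

# Same setup as the report: 100 points split into two dask chunks of 50.
da = xr.DataArray(np.random.rand(100), dims="points").chunk(50)
indices = xr.Variable("nodes", np.random.choice(np.arange(100), size=10))

# Option 1: index as in the report, then merge the fragments into a
# single chunk along the new "nodes" dimension.
da_rechunked = da.isel(points=indices).chunk({"nodes": -1})

# Option 2: call dask's pointwise indexer directly and re-wrap the result.
# This bypasses xarray, so coordinates and attributes of `da` are dropped.
da_vindexed = xr.DataArray(da.data.vindex[indices.data], dims="nodes")
```

Option 1 keeps xarray's metadata handling but still builds the fragmented graph before rechunking; option 2 avoids that graph at the cost of losing coordinates, so it probably only suits arrays like the one above that carry none.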
