issues: 759709924

  • id: 759709924
  • node_id: MDU6SXNzdWU3NTk3MDk5MjQ=
  • number: 4663
  • title: Fancy indexing a Dataset with dask DataArray triggers multiple computes
  • user: 6130352
  • state: closed
  • locked: 0
  • comments: 8
  • created_at: 2020-12-08T19:17:08Z
  • updated_at: 2023-03-15T02:48:01Z
  • closed_at: 2023-03-15T02:48:01Z
  • author_association: NONE

It appears that boolean arrays (and presumably any indexing array) are evaluated more times than necessary when applied to multiple variables in a Dataset. Is this intentional? Here is an example that demonstrates this:

```python
import numpy as np
import dask.array as da
import xarray as xr

# Use a custom array type to know when data is being evaluated
class Array:

    def __init__(self, x):
        self.shape = (x.shape[0],)
        self.ndim = x.ndim
        self.dtype = 'bool'
        self.x = x

    def __getitem__(self, idx):
        # Ignore the zero-length slices dask uses for meta inference
        if idx[0].stop > 0:
            print('Evaluating')
        return (self.x > .5).__getitem__(idx)

# Control case -- this shows that the print statement is only reached once
da.from_array(Array(np.random.rand(100))).compute();
# Evaluating

# This usage somehow results in two evaluations of this one array?
ds = xr.Dataset(dict(
    a=('x', da.from_array(Array(np.random.rand(100))))
))
ds.sel(x=ds.a)
# Evaluating
# Evaluating
# <xarray.Dataset>
# Dimensions:  (x: 51)
# Dimensions without coordinates: x
# Data variables:
#     a        (x) bool dask.array<chunksize=(51,), meta=np.ndarray>

# The array is evaluated an extra time for each new variable
ds = xr.Dataset(dict(
    a=('x', da.from_array(Array(np.random.rand(100)))),
    b=(('x', 'y'), da.random.random((100, 10))),
    c=(('x', 'y'), da.random.random((100, 10))),
    d=(('x', 'y'), da.random.random((100, 10))),
))
ds.sel(x=ds.a)
# Evaluating
# Evaluating
# Evaluating
# Evaluating
# Evaluating
# <xarray.Dataset>
# Dimensions:  (x: 48, y: 10)
# Dimensions without coordinates: x, y
# Data variables:
#     a        (x) bool dask.array<chunksize=(48,), meta=np.ndarray>
#     b        (x, y) float64 dask.array<chunksize=(48, 10), meta=np.ndarray>
#     c        (x, y) float64 dask.array<chunksize=(48, 10), meta=np.ndarray>
#     d        (x, y) float64 dask.array<chunksize=(48, 10), meta=np.ndarray>
```
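For contrast, dask itself only evaluates shared graph keys once when dependent results are computed together in a single call. A minimal sketch of that behaviour in plain dask (the names here are hypothetical, not from the issue):

```python
import numpy as np
import dask
import dask.array as da

# Two outputs that both depend on the same predicate array.
pred = da.from_array(np.random.rand(100), chunks=50) > .5
n_true = pred.sum()
n_false = (~pred).sum()

# Computing both in ONE dask.compute call merges the graphs, so the
# shared chunks of `pred` are evaluated only once for both outputs.
n_true, n_false = dask.compute(n_true, n_false)
assert int(n_true) + int(n_false) == 100
```

This is why evaluating the same predicate once per variable looks wasteful: the deduplication machinery already exists at the dask level.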

Given that slicing is already not lazy, why does the same predicate array need to be computed more than once?
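One workaround sketch (not from the original report, and using positional `isel` rather than `sel`): materialize the predicate once with `.compute()`, then index with the resulting in-memory mask so the dask graph behind the predicate runs only a single time:

```python
import numpy as np
import dask.array as da
import xarray as xr

# Assumed setup mirroring the example above (sizes are arbitrary).
ds = xr.Dataset(dict(
    a=('x', da.from_array(np.random.rand(100), chunks=50) > .5),
    b=(('x', 'y'), da.random.random((100, 10))),
))

# Evaluate the boolean predicate exactly once...
mask = ds.a.compute()

# ...then index positionally with the NumPy-backed mask; the other
# variables are sliced lazily without re-running the predicate graph.
subset = ds.isel(x=mask.values)
assert subset.sizes['x'] == int(mask.values.sum())
```

This sidesteps the repeated computes but loses laziness for the predicate itself, which may matter for very large indexers.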

@tomwhite originally pointed this out in https://github.com/pystatgen/sgkit/issues/299.

  • state_reason: completed
  • repo: 13221727
  • type: issue
