html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/3922#issuecomment-626185322,https://api.github.com/repos/pydata/xarray/issues/3922,626185322,MDEyOklzc3VlQ29tbWVudDYyNjE4NTMyMg==,5821660,2020-05-09T14:34:37Z,2020-05-09T14:34:37Z,MEMBER,"Thanks @dcherian for getting back to this. My bad, this adventure went a bit beyond my capabilities. Nevertheless, I hope to catch up by learning xarray's internals. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-614453431,https://api.github.com/repos/pydata/xarray/issues/3922,614453431,MDEyOklzc3VlQ29tbWVudDYxNDQ1MzQzMQ==,5821660,2020-04-16T07:00:21Z,2020-04-16T07:00:21Z,MEMBER,"Error log of the compute error:
https://dev.azure.com/xarray/xarray/_build/results?buildId=2629&view=logs&j=78b48a04-306f-5a15-9ac3-dd2fdb28db5e&t=5160aa4e-6217-5012-6424-4f17180b374b&l=412
```
xarray/tests/test_dask.py:52: RuntimeError
_______ TestReduce2D.test_idxmax[True-x2-minindex2-maxindex2-nanindex2] ________
self =
xarray/core/variable.py:1579: in reduce
data = func(input_data, axis=axis, **kwargs)
xarray/core/duck_array_ops.py:304: in f
return func(values, axis=axis, **kwargs)
xarray/core/nanops.py:104: in nanargmax
return _nan_argminmax_object(""argmax"", fill_value, a, axis=axis)
xarray/core/nanops.py:57: in _nan_argminmax_object
if (valid_count == 0).any():
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:1375: in __bool__
return bool(self.compute())
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/base.py:175: in compute
(result,) = compute(self, traverse=False, **kwargs)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/base.py:446: in compute
results = schedule(dsk, keys, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self =
dsk = {('all-all-aggregate-any-getitem-invert-sum-sum-aggregate-any-aggregate-a1eb5c32ca3225c37fddde3d1aa0d2f0',): (Compose(... -4.0, 2.0],
[-4.0, nan, 2.0, nan, -2.0, -4.0, 2.0],
[nan, nan, nan, nan, nan, nan, nan]], dtype=object)}
keys = [[('any-aggregate-a1eb5c32ca3225c37fddde3d1aa0d2f0',)]], kwargs = {}
def __call__(self, dsk, keys, **kwargs):
self.total_computes += 1
if self.total_computes > self.max_computes:
raise RuntimeError(
""Too many computes. Total: %d > max: %d.""
> % (self.total_computes, self.max_computes)
)
E RuntimeError: Too many computes. Total: 1 > max: 0.
```
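For context, the `RuntimeError` above does not come from dask itself but from the counting scheduler the test installs; calling `bool()` on a lazy dask result (here the `(valid_count == 0).any()` check in nanops) forces a compute, which trips the counter. A minimal, self-contained sketch of that mechanism (not xarray's exact helper, but the same idea):

```python
import numpy as np
import dask
import dask.array as da

class CountingScheduler:
    # counts scheduler invocations and raises once a limit is exceeded,
    # mirroring the helper used in xarray's test suite
    def __init__(self, max_computes=0):
        self.total_computes = 0
        self.max_computes = max_computes

    def __call__(self, dsk, keys, **kwargs):
        self.total_computes += 1
        if self.total_computes > self.max_computes:
            raise RuntimeError(
                'Too many computes. Total: %d > max: %d.'
                % (self.total_computes, self.max_computes)
            )
        return dask.get(dsk, keys, **kwargs)

arr = da.from_array(np.array([1.0, np.nan, 2.0]), chunks=2)
with dask.config.set(scheduler=CountingScheduler(max_computes=0)):
    try:
        # bool() on a lazy result triggers compute(), just like
        # the `if (valid_count == 0).any():` branch in the traceback
        bool(da.isnan(arr).any())
    except RuntimeError as err:
        print(err)
```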
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-614445489,https://api.github.com/repos/pydata/xarray/issues/3922,614445489,MDEyOklzc3VlQ29tbWVudDYxNDQ0NTQ4OQ==,5821660,2020-04-16T06:39:13Z,2020-04-16T06:39:13Z,MEMBER,"@shoyer Thanks for the hint.
I'm currently experimenting with the different possibilities.
non-dask:
```
array([[ 0, 1, 2, 0, -2, -4, 2],
[ 1, 1, 1, 1, 1, 1, 1],
[ 0, 0, -10, 5, 20, 0, 0]])
Coordinates:
* x (x) int64 0 4 8 12 16 20 24
* y (y) int64 1 0 -1
Attributes:
attr1: value1
attr2: 2929
```
dask:
```
dask.array, shape=(3, 7), dtype=int64, chunksize=(3, 7), chunktype=numpy.ndarray>
Coordinates:
* x (x) int64 0 4 8 12 16 20 24
* y (y) int64 1 0 -1
Attributes:
attr1: value1
attr2: 2929
```
The relevant code inside idxmin/idxmax:
```python
# This will run argmin or argmax.
# indx will be dask if array is dask
indx = func(array, dim=dim, axis=None, keep_attrs=keep_attrs, skipna=skipna)
# separated out for debugging;
# with the current test layout, coords will not be dask, since the array's coords are not dask
coords = array[dim]
# try to make coords a dask array as per @shoyer's suggestion;
# the call below fails silently, and it cannot be forced either: trying
# something like assigning dask.array.asarray() errors out with
# ""Cannot assign to the .data attribute of dimension coordinate a.k.a IndexVariable 'x'""
if isinstance(indx.data, dask_array_type):
    coords = coords.chunk({})
res = coords[(indx,)]
```
It seems that the map-blocks approach is the only one which works throughout the tests, except for one. That test fails when the array is set as array.astype(""object"") in the test fixture. Reason: dask gets computed within argmin/argmax.
I'll revert to the map-blocks approach now and add the `with raise_if_dask_computes()` context where necessary. Any hints appreciated for the dask compute error.
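For reference, the silently-failing chunk call from the snippet above can be reproduced in isolation: chunking a dimension coordinate is a no-op, because an IndexVariable is always kept in memory, so a dask-backed copy has to be rebuilt from its values instead. A small sketch (variable names are illustrative):

```python
import numpy as np
import dask.array as da
import xarray as xr

arr = xr.DataArray(np.arange(6).reshape(2, 3), dims=['x', 'y'],
                   coords={'x': [10, 20], 'y': [1, 2, 3]})
coords = arr['y']

# chunking a dimension coordinate silently keeps numpy data,
# since IndexVariable.chunk is a dummy no-op
chunked = coords.chunk({})
print(type(chunked.data))

# rebuilding from .values is one way to get an actual dask-backed coordinate
dask_coord = xr.DataArray(da.from_array(coords.values, chunks=-1), dims=coords.dims)
print(type(dask_coord.data))
```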
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-614038283,https://api.github.com/repos/pydata/xarray/issues/3922,614038283,MDEyOklzc3VlQ29tbWVudDYxNDAzODI4Mw==,5821660,2020-04-15T13:24:04Z,2020-04-15T13:24:04Z,MEMBER,"@dcherian Thanks for explaining the decorator a bit more. So it's indeed simpler than I thought. I'll revert to the `map_blocks` solution. I'll not have time today, so this will have to wait a bit.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-613840540,https://api.github.com/repos/pydata/xarray/issues/3922,613840540,MDEyOklzc3VlQ29tbWVudDYxMzg0MDU0MA==,5821660,2020-04-15T06:21:22Z,2020-04-15T06:21:22Z,MEMBER,"@dcherian I've tried to apply the `with raise_if_dask_computes()` context, but as you already mentioned, it is quite hard to incorporate into the tests. Every once in a while we would need to introduce an if/else clause to check for `use_dask`, and the computation count inside idxmin/idxmax also varies based on the source data layout (because argmin/argmax is used internally).
I'm now totally unsure how to proceed from here. Any guidance very much appreciated.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-613550678,https://api.github.com/repos/pydata/xarray/issues/3922,613550678,MDEyOklzc3VlQ29tbWVudDYxMzU1MDY3OA==,5821660,2020-04-14T16:39:42Z,2020-04-14T16:39:42Z,MEMBER,"@dcherian Thanks for the suggestion with the dask compute context, I'll have a look the next day.
Nevertheless, I've debugged locally, and the `res` output of idxmin/idxmax holds dask data.
Anyway, I'll revert to the former working state and leave a comment referencing this PR in the code. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-613531835,https://api.github.com/repos/pydata/xarray/issues/3922,613531835,MDEyOklzc3VlQ29tbWVudDYxMzUzMTgzNQ==,5821660,2020-04-14T16:05:00Z,2020-04-14T16:05:00Z,MEMBER,"@max-sixty No worries!
@shoyer 's comment earlier today made me think about it some more. It seems that special dask handling is only needed for 1D arrays. But this would need the insight of someone more experienced than myself to find a solution to the current error.
I hope that some other dev has an idea on how to proceed. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-613282017,https://api.github.com/repos/pydata/xarray/issues/3922,613282017,MDEyOklzc3VlQ29tbWVudDYxMzI4MjAxNw==,5821660,2020-04-14T07:48:13Z,2020-04-14T07:48:13Z,MEMBER,"After removing the special handling of dask arrays, it works fine for the 2D cases. Only the 1D cases fail, because the reduced dask index arrays (via argmin/argmax) fail when accessing `.item()`.
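To illustrate the 1D failure mode: argmin/argmax over the only dimension of a 1D dask array returns a zero-dimensional dask array, which has no `.item()` until it is materialized. A minimal sketch of the `.load()`-style workaround (whether this is the right layer to do it at is exactly the open question):

```python
import numpy as np
import dask.array as da

# reducing a 1-D dask array over its only axis gives a zero-dim *lazy* result
idx = da.from_array(np.array([3.0, 1.0, 2.0]), chunks=2).argmin()
print(idx.ndim, type(idx))

# computing first yields a numpy scalar, which does support .item()
print(idx.compute().item())  # -> 1
```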
Any suggestions on how and where to resolve this? We could check for zero-dimensional dask arrays and `.load()` them, but I'm unsure if this is the correct way.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-613231709,https://api.github.com/repos/pydata/xarray/issues/3922,613231709,MDEyOklzc3VlQ29tbWVudDYxMzIzMTcwOQ==,5821660,2020-04-14T05:20:50Z,2020-04-14T05:20:50Z,MEMBER,"Is there anything I can do to finalize this PR? Anyway, I'm trying to keep this in a *no-conflict* state. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-609613126,https://api.github.com/repos/pydata/xarray/issues/3922,609613126,MDEyOklzc3VlQ29tbWVudDYwOTYxMzEyNg==,5821660,2020-04-06T07:27:09Z,2020-04-06T07:27:09Z,MEMBER,"This is ready from my end for final review.
Should be merged before #3936, IMHO.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608437728,https://api.github.com/repos/pydata/xarray/issues/3922,608437728,MDEyOklzc3VlQ29tbWVudDYwODQzNzcyOA==,5821660,2020-04-03T13:36:57Z,2020-04-03T13:36:57Z,MEMBER,"Seems that everything goes well, besides the `upstream-dev` run.
If this is ready for merge, should I extend the idxmin/idxmax section of whats-new.rst? And how should I distribute the credit to all contributors @dcherian, @keewis, @max-sixty?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608404057,https://api.github.com/repos/pydata/xarray/issues/3922,608404057,MDEyOklzc3VlQ29tbWVudDYwODQwNDA1Nw==,5821660,2020-04-03T12:23:43Z,2020-04-03T12:23:43Z,MEMBER,"@keewis OK, how should I handle this? Shall we XFAIL these tests then?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608371762,https://api.github.com/repos/pydata/xarray/issues/3922,608371762,MDEyOklzc3VlQ29tbWVudDYwODM3MTc2Mg==,5821660,2020-04-03T11:00:52Z,2020-04-03T11:00:52Z,MEMBER,"@keewis I've started by adding the dask tests to the existing `idxmin`/`idxmax` tests (test_dataarray). This works locally, but one test (where the data is datetimes) fails in the dask variant. Any immediate thoughts on this?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608340435,https://api.github.com/repos/pydata/xarray/issues/3922,608340435,MDEyOklzc3VlQ29tbWVudDYwODM0MDQzNQ==,5821660,2020-04-03T09:44:27Z,2020-04-03T09:44:27Z,MEMBER,"@keewis I checked with my datasets, works like a charm. I'll try to add dask tests for this as @shoyer suggested. Where should these tests go? Currently the idxmax/idxmin tests are in test_dataarray and test_dataset:
https://github.com/pydata/xarray/blob/b3bafeefbd6e6d70bce505ae1f0d9d5a2b015089/xarray/tests/test_dataarray.py#L4512
https://github.com/pydata/xarray/blob/b3bafeefbd6e6d70bce505ae1f0d9d5a2b015089/xarray/tests/test_dataarray.py#L4608
https://github.com/pydata/xarray/blob/1416d5ae475c0875e7a5d76fa4a8278838958162/xarray/tests/test_dataset.py#L4603-L4610
Any pointers?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608329686,https://api.github.com/repos/pydata/xarray/issues/3922,608329686,MDEyOklzc3VlQ29tbWVudDYwODMyOTY4Ng==,5821660,2020-04-03T09:21:42Z,2020-04-03T09:21:42Z,MEMBER,"@keewis Thanks a bunch for the explanation. Would we be on the safe side if we used your proposed N-D example? It also works in the 2D case.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608244677,https://api.github.com/repos/pydata/xarray/issues/3922,608244677,MDEyOklzc3VlQ29tbWVudDYwODI0NDY3Nw==,5821660,2020-04-03T05:55:48Z,2020-04-03T05:56:46Z,MEMBER,"@max-sixty Thanks! I really appreciate your help. I've tracked the possible source down to a dimension problem. I've tried to create a minimal example as follows, using the current `idxmax` implementation from above. I copied only the dask-related lines of code:
```python
import numpy as np
import dask.array as da
import xarray as xr

# create a dask-backed 3d array
darray = da.from_array(
    np.random.RandomState(0).randn(10 * 20 * 30).reshape(10, 20, 30),
    chunks=(10, 20, 30),
    name='data_arr',
)
array = xr.DataArray(darray, dims=['x', 'y', 'z'])
array = array.assign_coords({'x': (['x'], np.arange(10)),
                             'y': (['y'], np.arange(20)),
                             'z': (['z'], np.arange(30)),
                             })

func = lambda x, *args, **kwargs: x.argmax(*args, **kwargs)
indx = func(array, dim='z', axis=None, keep_attrs=True, skipna=False)
coordarray = array['z']
res = indx.copy(
    data=indx.data.map_blocks(
        lambda ind, coord: coord[(ind,)], coordarray, dtype=coordarray.dtype
    )
)
print(res)
# the following line breaks for the 3-dim data
print(res.compute())

# using only a 2-dim array, everything works as intended
array2d = array.sel(y=0, drop=True)
indx = func(array2d, dim='z', axis=None, keep_attrs=True, skipna=False)
coordarray = array['z']
res = indx.copy(
    data=indx.data.map_blocks(
        lambda ind, coord: coord[(ind,)], coordarray, dtype=coordarray.dtype
    )
)
print(res)
# this works for the 2-dim data
print(res.compute())
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-607066537,https://api.github.com/repos/pydata/xarray/issues/3922,607066537,MDEyOklzc3VlQ29tbWVudDYwNzA2NjUzNw==,5821660,2020-04-01T06:45:35Z,2020-04-01T06:45:35Z,MEMBER,"@dcherian This seemed to work until computation:
```
IndexError: Unlabeled multi-dimensional array cannot be used for indexing: array_bin
```
where `array_bin` is the dimension over which `idxmax` is calculated. I tried to wrap my head around this to no avail. I'll keep trying, but would appreciate any hints...
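The error itself can be reproduced without idxmax: indexing a coordinate with a bare multi-dimensional integer array hits `_validate_indexers`, while wrapping the same indices in a DataArray with named dims is valid vectorized indexing. A small sketch (coordinate and dim names are illustrative):

```python
import numpy as np
import xarray as xr

coord = xr.DataArray(np.arange(4), dims=['z'], coords={'z': np.arange(4)})
idx = np.zeros((2, 3), dtype=int)

try:
    # an unlabeled 2-D index raises IndexError, as in the traceback
    coord[(idx,)]
except IndexError as err:
    print(err)

# labeling the index array with its own dims makes it valid vectorized indexing
labeled = xr.DataArray(idx, dims=['a', 'b'])
print(coord[(labeled,)].shape)  # -> (2, 3)
```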
Full Traceback:
```python
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in compute(self, **kwargs)
839 """"""
840 new = self.copy(deep=False)
--> 841 return new.load(**kwargs)
842
843 def persist(self, **kwargs) -> ""DataArray"":
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in load(self, **kwargs)
813 dask.array.compute
814 """"""
--> 815 ds = self._to_temp_dataset().load(**kwargs)
816 new = self._from_temp_dataset(ds)
817 self._variable = new._variable
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataset.py in load(self, **kwargs)
654
655 # evaluate all the dask arrays simultaneously
--> 656 evaluated_data = da.compute(*lazy_data.values(), **kwargs)
657
658 for k, data in zip(lazy_data, evaluated_data):
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/base.py in compute(*args, **kwargs)
435 keys = [x.__dask_keys__() for x in collections]
436 postcomputes = [x.__dask_postcompute__() for x in collections]
--> 437 results = schedule(dsk, keys, **kwargs)
438 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
439
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
74 pools[thread][num_workers] = pool
75
---> 76 results = get_async(
77 pool.apply_async,
78 len(pool._pool),
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs)
484 _execute_task(task, data) # Re-execute locally
485 else:
--> 486 raise_exception(exc, tb)
487 res, worker_id = loads(res_info)
488 state[""cache""][key] = res
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/local.py in reraise(exc, tb)
314 if exc.__traceback__ is not tb:
315 raise exc.with_traceback(tb)
--> 316 raise exc
317
318
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
220 try:
221 task, data = loads(task_info)
--> 222 result = _execute_task(task, data)
223 id = get_id()
224 result = dumps((result, id))
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
119 # temporaries by their reference count and can execute certain
120 # operations in-place.
--> 121 return func(*(_execute_task(a, cache) for a in args))
122 elif not ishashable(arg):
123 return arg
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/optimization.py in __call__(self, *args)
980 if not len(args) == len(self.inkeys):
981 raise ValueError(""Expected %d args, got %d"" % (len(self.inkeys), len(args)))
--> 982 return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
983
984 def __reduce__(self):
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/core.py in get(dsk, out, cache)
149 for key in toposort(dsk):
150 task = dsk[key]
--> 151 result = _execute_task(task, cache)
152 cache[key] = result
153 result = _execute_task(out, cache)
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
119 # temporaries by their reference count and can execute certain
120 # operations in-place.
--> 121 return func(*(_execute_task(a, cache) for a in args))
122 elif not ishashable(arg):
123 return arg
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/computation.py in <lambda>(ind, coord)
1387 res = indx.copy(
1388 data=indx.data.map_blocks(
-> 1389 lambda ind, coord: coord[(ind,)], coordarray, dtype=coordarray.dtype
1390 )
1391 )
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in __getitem__(self, key)
642 else:
643 # xarray-style array indexing
--> 644 return self.isel(indexers=self._item_key_to_dict(key))
645
646 def __setitem__(self, key: Any, value: Any) -> None:
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in isel(self, indexers, drop, **indexers_kwargs)
1020 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, ""isel"")
1021 if any(is_fancy_indexer(idx) for idx in indexers.values()):
-> 1022 ds = self._to_temp_dataset()._isel_fancy(indexers, drop=drop)
1023 return self._from_temp_dataset(ds)
1024
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataset.py in _isel_fancy(self, indexers, drop)
1962 # Note: we need to preserve the original indexers variable in order to merge the
1963 # coords below
-> 1964 indexers_list = list(self._validate_indexers(indexers))
1965
1966 variables: Dict[Hashable, Variable] = {}
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataset.py in _validate_indexers(self, indexers)
1805
1806 if v.ndim > 1:
-> 1807 raise IndexError(
1808 ""Unlabeled multi-dimensional array cannot be ""
1809 ""used for indexing: {}"".format(k)
IndexError: Unlabeled multi-dimensional array cannot be used for indexing: array_bin
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988