html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/3922#issuecomment-626185322,https://api.github.com/repos/pydata/xarray/issues/3922,626185322,MDEyOklzc3VlQ29tbWVudDYyNjE4NTMyMg==,5821660,2020-05-09T14:34:37Z,2020-05-09T14:34:37Z,MEMBER,"Thanks @dcherian for getting back to this. Too bad this adventure went too far for my capabilities. Nevertheless I hope to catch up on learning xarray internals. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-614453431,https://api.github.com/repos/pydata/xarray/issues/3922,614453431,MDEyOklzc3VlQ29tbWVudDYxNDQ1MzQzMQ==,5821660,2020-04-16T07:00:21Z,2020-04-16T07:00:21Z,MEMBER,"Error log of the compute error: https://dev.azure.com/xarray/xarray/_build/results?buildId=2629&view=logs&j=78b48a04-306f-5a15-9ac3-dd2fdb28db5e&t=5160aa4e-6217-5012-6424-4f17180b374b&l=412
```
xarray/tests/test_dask.py:52: RuntimeError
_______ TestReduce2D.test_idxmax[True-x2-minindex2-maxindex2-nanindex2] ________
self = 
xarray/core/variable.py:1579: in reduce
    data = func(input_data, axis=axis, **kwargs)
xarray/core/duck_array_ops.py:304: in f
    return func(values, axis=axis, **kwargs)
xarray/core/nanops.py:104: in nanargmax
    return _nan_argminmax_object(""argmax"", fill_value, a, axis=axis)
xarray/core/nanops.py:57: in _nan_argminmax_object
    if (valid_count == 0).any():
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/array/core.py:1375: in __bool__
    return bool(self.compute())
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/base.py:175: in compute
    (result,) = compute(self, traverse=False, **kwargs)
/usr/share/miniconda/envs/xarray-tests/lib/python3.6/site-packages/dask/base.py:446: in compute
    results = schedule(dsk, keys, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = 
dsk = {('all-all-aggregate-any-getitem-invert-sum-sum-aggregate-any-aggregate-a1eb5c32ca3225c37fddde3d1aa0d2f0',): (Compose(... -4.0, 2.0], [-4.0, nan, 2.0, nan, -2.0, -4.0, 2.0], [nan, nan, nan, nan, nan, nan, nan]], dtype=object)}
keys = [[('any-aggregate-a1eb5c32ca3225c37fddde3d1aa0d2f0',)]], kwargs = {}

    def __call__(self, dsk, keys, **kwargs):
        self.total_computes += 1
        if self.total_computes > self.max_computes:
            raise RuntimeError(
                ""Too many computes. Total: %d > max: %d.""
>               % (self.total_computes, self.max_computes)
            )
E   RuntimeError: Too many computes. Total: 1 > max: 0.
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-614445489,https://api.github.com/repos/pydata/xarray/issues/3922,614445489,MDEyOklzc3VlQ29tbWVudDYxNDQ0NTQ4OQ==,5821660,2020-04-16T06:39:13Z,2020-04-16T06:39:13Z,MEMBER,"@shoyer Thanks for the hint. I'm currently experimenting with the different possibilities.

non-dask:
```
array([[ 0, 1, 2, 0, -2, -4, 2],
       [ 1, 1, 1, 1, 1, 1, 1],
       [ 0, 0, -10, 5, 20, 0, 0]])
Coordinates:
  * x        (x) int64 0 4 8 12 16 20 24
  * y        (y) int64 1 0 -1
Attributes:
    attr1:   value1
    attr2:   2929
```

dask:
```
dask.array<..., shape=(3, 7), dtype=int64, chunksize=(3, 7), chunktype=numpy.ndarray>
Coordinates:
  * x        (x) int64 0 4 8 12 16 20 24
  * y        (y) int64 1 0 -1
Attributes:
    attr1:   value1
    attr2:   2929
```

The relevant code inside idxmin/idxmax:
```python
# This will run argmin or argmax.
# indx will be dask if array is dask
indx = func(array, dim=dim, axis=None, keep_attrs=keep_attrs, skipna=skipna)

# separated out for debugging
# with the current test layout coords will not be dask since the array's coords are not dask
coords = array[dim]

# try to make dask of coords as per @shoyer's suggestion
# the below fails silently, but cannot be forced even with trying to
# do something like dask.array.asarray(), this errors out with
# ""Cannot assign to the .data attribute of dimension coordinate a.k.a IndexVariable 'x'""
if isinstance(indx.data, dask_array_type):
    coords = coords.chunk({})

res = coords[(indx,)]
```

It seems that the map-blocks approach is the only one which seems to work throughout the tests, except one: it fails with the array set as array.astype(""object"") in the test fixture. Reason: dask gets computed within argmin/argmax. I'll revert to the map-blocks approach now and add the `with raise_if_dask_computes()` context where necessary. Any hints appreciated for the dask compute error. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-614038283,https://api.github.com/repos/pydata/xarray/issues/3922,614038283,MDEyOklzc3VlQ29tbWVudDYxNDAzODI4Mw==,5821660,2020-04-15T13:24:04Z,2020-04-15T13:24:04Z,MEMBER,"@dcherian Thanks for explaining the decorator a bit more. So it's indeed simpler than I thought. I'll revert to the `map_blocks` solution. I won't have time today, so this will have to wait a bit.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-613840540,https://api.github.com/repos/pydata/xarray/issues/3922,613840540,MDEyOklzc3VlQ29tbWVudDYxMzg0MDU0MA==,5821660,2020-04-15T06:21:22Z,2020-04-15T06:21:22Z,MEMBER,"@dcherian I've tried to apply the `with raise_if_dask_computes()` context, but as you already mentioned, it is quite hard to incorporate into the tests. Every once in a while we would need to introduce some if/else clause to check for `use_dask`, and the computation count inside idxmin/idxmax also varies based on the source data layout (because of the usage of argmin/argmax inside). I'm now totally unsure how to proceed from here. Any guidance very much appreciated. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-613550678,https://api.github.com/repos/pydata/xarray/issues/3922,613550678,MDEyOklzc3VlQ29tbWVudDYxMzU1MDY3OA==,5821660,2020-04-14T16:39:42Z,2020-04-14T16:39:42Z,MEMBER,"@dcherian Thanks for the suggestion with the dask compute context, I'll have a look the next day. Nevertheless, I've debugged locally and the res output of idxmin/idxmax holds dask data. 
Anyway, I'll revert to the former working status and leave a comment pointing to this PR in the code. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-613531835,https://api.github.com/repos/pydata/xarray/issues/3922,613531835,MDEyOklzc3VlQ29tbWVudDYxMzUzMTgzNQ==,5821660,2020-04-14T16:05:00Z,2020-04-14T16:05:00Z,MEMBER,"@max-sixty No worries! @shoyer 's comment earlier today made me think about it some more. It seems that special dask handling is only needed for 1D arrays. But this would need some insight from someone more experienced than myself to find a solution to the current error. I hope that some other dev has an idea on how to proceed. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-613282017,https://api.github.com/repos/pydata/xarray/issues/3922,613282017,MDEyOklzc3VlQ29tbWVudDYxMzI4MjAxNw==,5821660,2020-04-14T07:48:13Z,2020-04-14T07:48:13Z,MEMBER,"After removing the special handling of dask arrays, it works fine for the 2D cases. Only the 1D cases fail, because the reduced dask index arrays (via argmin/argmax) fail when accessing `.item()`. Any suggestions on how and where to resolve this? We could check for zero-dim dask arrays and `.load()` them, but I'm unsure if this is the correct way.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-613231709,https://api.github.com/repos/pydata/xarray/issues/3922,613231709,MDEyOklzc3VlQ29tbWVudDYxMzIzMTcwOQ==,5821660,2020-04-14T05:20:50Z,2020-04-14T05:20:50Z,MEMBER,Is there anything I can do to finalize this PR? Anyway, I'm trying to keep this in a *no-conflict* state. 
,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-609613126,https://api.github.com/repos/pydata/xarray/issues/3922,609613126,MDEyOklzc3VlQ29tbWVudDYwOTYxMzEyNg==,5821660,2020-04-06T07:27:09Z,2020-04-06T07:27:09Z,MEMBER,"This is ready from my end for final review. Should be merged before #3936, IMHO.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608437728,https://api.github.com/repos/pydata/xarray/issues/3922,608437728,MDEyOklzc3VlQ29tbWVudDYwODQzNzcyOA==,5821660,2020-04-03T13:36:57Z,2020-04-03T13:36:57Z,MEMBER,"Seems that everything goes well, besides the `upstream-dev` run. If this is ready for merge, should I extend the idxmin/idxmax section of whats-new.rst? And how should I distribute the credit among all contributors @dcherian, @keewis, @max-sixty?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608404057,https://api.github.com/repos/pydata/xarray/issues/3922,608404057,MDEyOklzc3VlQ29tbWVudDYwODQwNDA1Nw==,5821660,2020-04-03T12:23:43Z,2020-04-03T12:23:43Z,MEMBER,"@keewis OK, how should I handle this? Shall we XFAIL these tests then?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608371762,https://api.github.com/repos/pydata/xarray/issues/3922,608371762,MDEyOklzc3VlQ29tbWVudDYwODM3MTc2Mg==,5821660,2020-04-03T11:00:52Z,2020-04-03T11:00:52Z,MEMBER,"@keewis I've started by adding the dask tests to the existing `idxmin`/`idxmax` tests (test_dataarray). 
This works locally, but one test (data is datetimes) fails for the dask variant. Any immediate thoughts on this?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608340435,https://api.github.com/repos/pydata/xarray/issues/3922,608340435,MDEyOklzc3VlQ29tbWVudDYwODM0MDQzNQ==,5821660,2020-04-03T09:44:27Z,2020-04-03T09:44:27Z,MEMBER,"@keewis I checked with my datasets, works like a charm. I'll try to add dask tests for this as @shoyer suggested. Where should these tests go? Currently the idxmax/idxmin tests are in test_dataarray and test_dataset:

https://github.com/pydata/xarray/blob/b3bafeefbd6e6d70bce505ae1f0d9d5a2b015089/xarray/tests/test_dataarray.py#L4512
https://github.com/pydata/xarray/blob/b3bafeefbd6e6d70bce505ae1f0d9d5a2b015089/xarray/tests/test_dataarray.py#L4608
https://github.com/pydata/xarray/blob/1416d5ae475c0875e7a5d76fa4a8278838958162/xarray/tests/test_dataset.py#L4603-L4610

Any pointers?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608329686,https://api.github.com/repos/pydata/xarray/issues/3922,608329686,MDEyOklzc3VlQ29tbWVudDYwODMyOTY4Ng==,5821660,2020-04-03T09:21:42Z,2020-04-03T09:21:42Z,MEMBER,"@keewis Thanks a bunch for the explanation. Would we be on the safe side if we use your proposed N-D example? It also works in the 2d-case.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-608244677,https://api.github.com/repos/pydata/xarray/issues/3922,608244677,MDEyOklzc3VlQ29tbWVudDYwODI0NDY3Nw==,5821660,2020-04-03T05:55:48Z,2020-04-03T05:56:46Z,MEMBER,"@max-sixty Thanks! I really appreciate your help. 
I've tracked the possible source down to a dimension problem. I've tried to create a minimal example as follows, using the current `idxmax` implementation from above. I copied only the dask related lines of code:

```python
# create dask backed 3d array
darray = da.from_array(np.random.RandomState(0).randn(10*20*30).reshape(10, 20, 30), chunks=(10, 20, 30), name='data_arr')
array = xr.DataArray(darray, dims=[""x"", ""y"", 'z'])
array = array.assign_coords({'x': (['x'], np.arange(10)),
                             'y': (['y'], np.arange(20)),
                             'z': (['z'], np.arange(30)),
                             })

func = lambda x, *args, **kwargs: x.argmax(*args, **kwargs)

indx = func(array, dim='z', axis=None, keep_attrs=True, skipna=False)
coordarray = array['z']
res = indx.copy(
    data=indx.data.map_blocks(
        lambda ind, coord: coord[(ind,)], coordarray, dtype=coordarray.dtype
    )
)
print(res)
# the following line breaks
print(res.compute())

# using only 2dim array everything works as intended
array2d = array.sel(y=0, drop=True)
indx = func(array2d, dim='z', axis=None, keep_attrs=True, skipna=False)
coordarray = array['z']
res = indx.copy(
    data=indx.data.map_blocks(
        lambda ind, coord: coord[(ind,)], coordarray, dtype=coordarray.dtype
    )
)
print(res)
# this works for two dim data
print(res.compute())
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988
https://github.com/pydata/xarray/pull/3922#issuecomment-607066537,https://api.github.com/repos/pydata/xarray/issues/3922,607066537,MDEyOklzc3VlQ29tbWVudDYwNzA2NjUzNw==,5821660,2020-04-01T06:45:35Z,2020-04-01T06:45:35Z,MEMBER,"@dcherian This seemed to work until computation:
```
IndexError: Unlabeled multi-dimensional array cannot be used for indexing: array_bin
```
where `array_bin` is the dimension over which `idxmax` is calculated. I tried to wrap my head around this to no avail. I keep trying, but appreciate any hints... 
Full Traceback:
```python
/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in compute(self, **kwargs)
    839         """"""
    840         new = self.copy(deep=False)
--> 841         return new.load(**kwargs)
    842 
    843     def persist(self, **kwargs) -> ""DataArray"":

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in load(self, **kwargs)
    813         dask.array.compute
    814         """"""
--> 815         ds = self._to_temp_dataset().load(**kwargs)
    816         new = self._from_temp_dataset(ds)
    817         self._variable = new._variable

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataset.py in load(self, **kwargs)
    654 
    655         # evaluate all the dask arrays simultaneously
--> 656         evaluated_data = da.compute(*lazy_data.values(), **kwargs)
    657 
    658         for k, data in zip(lazy_data, evaluated_data):

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/base.py in compute(*args, **kwargs)
    435     keys = [x.__dask_keys__() for x in collections]
    436     postcomputes = [x.__dask_postcompute__() for x in collections]
--> 437     results = schedule(dsk, keys, **kwargs)
    438     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    439 

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
    74             pools[thread][num_workers] = pool
    75 
---> 76     results = get_async(
    77         pool.apply_async,
    78         len(pool._pool),

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs)
    484                         _execute_task(task, data)  # Re-execute locally
    485                     else:
--> 486                         raise_exception(exc, tb)
    487                 res, worker_id = loads(res_info)
    488                 state[""cache""][key] = res

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/local.py in reraise(exc, tb)
    314     if exc.__traceback__ is not tb:
    315         raise exc.with_traceback(tb)
--> 316     raise exc
    317 
    318 

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
    220     try:
    221         task, data = loads(task_info)
--> 222         result = _execute_task(task, data)
    223         id = get_id()
    224         result = dumps((result, id))

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
    119         # temporaries by their reference count and can execute certain
    120         # operations in-place.
--> 121         return func(*(_execute_task(a, cache) for a in args))
    122     elif not ishashable(arg):
    123         return arg

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/optimization.py in __call__(self, *args)
    980         if not len(args) == len(self.inkeys):
    981             raise ValueError(""Expected %d args, got %d"" % (len(self.inkeys), len(args)))
--> 982         return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
    983 
    984     def __reduce__(self):

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/core.py in get(dsk, out, cache)
    149     for key in toposort(dsk):
    150         task = dsk[key]
--> 151         result = _execute_task(task, cache)
    152         cache[key] = result
    153     result = _execute_task(out, cache)

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
    119         # temporaries by their reference count and can execute certain
    120         # operations in-place.
--> 121         return func(*(_execute_task(a, cache) for a in args))
    122     elif not ishashable(arg):
    123         return arg

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/computation.py in <lambda>(ind, coord)
   1387         res = indx.copy(
   1388             data=indx.data.map_blocks(
-> 1389                 lambda ind, coord: coord[(ind,)], coordarray, dtype=coordarray.dtype
   1390             )
   1391         )

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in __getitem__(self, key)
    642         else:
    643             # xarray-style array indexing
--> 644             return self.isel(indexers=self._item_key_to_dict(key))
    645 
    646     def __setitem__(self, key: Any, value: Any) -> None:

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataarray.py in isel(self, indexers, drop, **indexers_kwargs)
   1020         indexers = either_dict_or_kwargs(indexers, indexers_kwargs, ""isel"")
   1021         if any(is_fancy_indexer(idx) for idx in indexers.values()):
-> 1022             ds = self._to_temp_dataset()._isel_fancy(indexers, drop=drop)
   1023             return self._from_temp_dataset(ds)
   1024 

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataset.py in _isel_fancy(self, indexers, drop)
   1962         # Note: we need to preserve the original indexers variable in order to merge the
   1963         # coords below
-> 1964         indexers_list = list(self._validate_indexers(indexers))
   1965 
   1966         variables: Dict[Hashable, Variable] = {}

/home/kai/miniconda/envs/wradlib_38_01/lib/python3.8/site-packages/xarray/core/dataset.py in _validate_indexers(self, indexers)
   1805 
   1806                 if v.ndim > 1:
-> 1807                     raise IndexError(
   1808                         ""Unlabeled multi-dimensional array cannot be ""
   1809                         ""used for indexing: {}"".format(k)

IndexError: Unlabeled multi-dimensional array cannot be used for indexing: array_bin
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,591101988