Comments on https://github.com/pydata/xarray/issues/3698 (newest first).

---

https://github.com/pydata/xarray/issues/3698#issuecomment-690378323
user 1312546 (MEMBER), 2020-09-10T15:42:54Z

Thanks for confirming. I'll take another look at this today, then.

On Thu, Sep 10, 2020 at 10:30 AM Deepak Cherian wrote:

> Reopened #3698.

---

https://github.com/pydata/xarray/issues/3698#issuecomment-690367604
user 2448579 (MEMBER), 2020-09-10T15:30:01Z

The numpy example is fixed, but the dask rechunked example is still broken.

```python
a = dask.array.ones((10, 5), chunks=(1, 3))
dask.optimize(xr.DataArray(a))[0].compute()           # works
dask.optimize(xr.DataArray(a).chunk(5))[0].compute()  # error
```

```
IndexError                                Traceback (most recent call last)
<ipython-input-...> in <module>
----> 1 dask.optimize(xr.DataArray(a).chunk(5))[0].compute()

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/xarray/core/dataarray.py in compute(self, **kwargs)
    838         """
    839         new = self.copy(deep=False)
--> 840         return new.load(**kwargs)
    841
    842     def persist(self, **kwargs) -> "DataArray":

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/xarray/core/dataarray.py in load(self, **kwargs)
    812         dask.array.compute
    813         """
--> 814         ds = self._to_temp_dataset().load(**kwargs)
    815         new = self._from_temp_dataset(ds)
    816         self._variable = new._variable

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/xarray/core/dataset.py in load(self, **kwargs)
    656
    657         # evaluate all the dask arrays simultaneously
--> 658         evaluated_data = da.compute(*lazy_data.values(), **kwargs)
    659
    660         for k, data in zip(lazy_data, evaluated_data):

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/dask/base.py in compute(*args, **kwargs)
    445         postcomputes.append(x.__dask_postcompute__())
    446
--> 447     results = schedule(dsk, keys, **kwargs)
    448     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    449

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
     74             pools[thread][num_workers] = pool
     75
---> 76     results = get_async(
     77         pool.apply_async,
     78         len(pool._pool),

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/dask/local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs)
    484                         _execute_task(task, data)  # Re-execute locally
    485                     else:
--> 486                         raise_exception(exc, tb)
    487                 res, worker_id = loads(res_info)
    488                 state["cache"][key] = res

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/dask/local.py in reraise(exc, tb)
    314     if exc.__traceback__ is not tb:
    315         raise exc.with_traceback(tb)
--> 316     raise exc
    317
    318

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
    220     try:
    221         task, data = loads(task_info)
--> 222         result = _execute_task(task, data)
    223         id = get_id()
    224         result = dumps((result, id))

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
    119         # temporaries by their reference count and can execute certain
    120         # operations in-place.
--> 121         return func(*(_execute_task(a, cache) for a in args))
    122     elif not ishashable(arg):
    123         return arg

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/dask/array/core.py in concatenate3(arrays)
   4407     if not ndim:
   4408         return arrays
-> 4409     chunks = chunks_from_arrays(arrays)
   4410     shape = tuple(map(sum, chunks))
   4411

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/dask/array/core.py in chunks_from_arrays(arrays)
   4178
   4179     while isinstance(arrays, (list, tuple)):
-> 4180         result.append(tuple([shape(deepfirst(a))[dim] for a in arrays]))
   4181         arrays = arrays[0]
   4182         dim += 1

~/miniconda3/envs/dcpy/lib/python3.8/site-packages/dask/array/core.py in <listcomp>(.0)
   4178
   4179     while isinstance(arrays, (list, tuple)):
-> 4180         result.append(tuple([shape(deepfirst(a))[dim] for a in arrays]))
   4181         arrays = arrays[0]
   4182         dim += 1

IndexError: tuple index out of range
```
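One possible workaround while the DataArray path is broken (a sketch, not taken from this thread and not verified against the dask version above): optimize the underlying dask array directly and wrap the result afterwards, so the xarray-level graph never goes through `dask.optimize`.

```python
import dask
import dask.array
import xarray as xr

a = dask.array.ones((10, 5), chunks=(1, 3))

# Optimize only the plain dask array; the low-level graph keeps the keys
# it needs, unlike the DataArray path shown above.
optimized = dask.optimize(a.rechunk(5))[0]

# Wrap afterwards; .compute() now runs against the already-optimized graph.
result = xr.DataArray(optimized).compute()
```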
---

https://github.com/pydata/xarray/issues/3698#issuecomment-689825648
user 2448579 (MEMBER), 2020-09-09T21:14:16Z

I guess I can see that. Thanks, Tom.

> it even when the chunk size exceeds dask.config['array']['chunk-size']

FYI, the slicing behaviour is independent of chunk-size (Matt's recommendation).

---

https://github.com/pydata/xarray/issues/3698#issuecomment-689808725
user 1312546 (MEMBER), 2020-09-09T20:38:39Z

FYI, @dcherian your recent PR to dask fixed this example. Playing around with chunk sizes, it seems to have fixed it even when the chunk size exceeds `dask.config['array']['chunk-size']`.
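For reference, the chunk-size value mentioned above is the dask configuration key `array.chunk-size`; a minimal sketch of reading and temporarily overriding it (the values here are illustrative):

```python
import dask
import dask.array

# Read the current target chunk size (a byte string such as "128MiB").
print(dask.config.get("array.chunk-size"))

# Temporarily override it; arrays created with chunks="auto" inside the
# block aim for the smaller target.
with dask.config.set({"array.chunk-size": "16MiB"}):
    small = dask.array.ones((10_000, 10_000), chunks="auto")
    print(small.chunksize)
```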
---

https://github.com/pydata/xarray/issues/3698#issuecomment-592101136
user 1312546 (MEMBER), 2020-02-27T18:13:28Z

It looks like xarray is getting a bad task graph after the optimize.

```python
In [1]: import xarray as xr
   ...: import dask

In [2]: import dask

In [3]: a = dask.array.ones((10, 5), chunks=(1, 3))
   ...: a = dask.optimize(a)[0]

In [4]: da = xr.DataArray(a.compute()).chunk({"dim_0": 5})
   ...: da = dask.optimize(da)[0]

In [5]: dict(da.__dask_graph__())
Out[5]:
{('xarray--e2865aa10d476e027154771611541f99', 1, 0): (<getter>,
   'xarray--e2865aa10d476e027154771611541f99',
   (slice(5, 10, None), slice(0, 5, None))),
 ('xarray--e2865aa10d476e027154771611541f99', 0, 0): (<getter>,
   'xarray--e2865aa10d476e027154771611541f99',
   (slice(0, 5, None), slice(0, 5, None)))}
```

Notice that there are references to `xarray--e2865aa10d476e027154771611541f99` (just the string, not a tuple representing a chunk), but that key isn't in the graph. If we manually insert it, you'll see things work:

```python
In [9]: dsk['xarray--e2865aa10d476e027154771611541f99'] = da._to_temp_dataset()[xr.core.dataarray._THIS_ARRAY]

In [11]: dask.get(dsk, keys=[('xarray--e2865aa10d476e027154771611541f99', 1, 0)])
Out[11]:
(<xarray.DataArray <this-array> (dim_0: 5, dim_1: 5)>
 dask.array<...>
 Dimensions without coordinates: dim_0, dim_1,)
```
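A rough way to surface the problem described in that comment (a sketch, assuming a dask version where the bug is still present; the `"xarray-"` prefix filter and the helper function are ad hoc for this illustration): walk every task in the optimized graph and collect string references that have no matching key.

```python
import dask
import dask.array
import xarray as xr

# Reproduce the optimized-but-broken DataArray graph from the comment above.
a = dask.array.ones((10, 5), chunks=(1, 3))
da = dask.optimize(xr.DataArray(a.compute()).chunk({"dim_0": 5}))[0]
dsk = dict(da.__dask_graph__())

def iter_strings(task):
    # Walk a (possibly nested) task tuple/list and yield every plain string in it.
    if isinstance(task, (list, tuple)):
        for item in task:
            yield from iter_strings(item)
    elif isinstance(task, str):
        yield task

# Strings that look like xarray-generated key names but do not appear as keys
# in the graph; on affected dask versions this contains the missing base-array
# key, on fixed versions it should be empty.
dangling = {
    s
    for task in dsk.values()
    for s in iter_strings(task)
    if s.startswith("xarray-") and s not in dsk
}
print(dangling)
```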