id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1315111684,I_kwDOAMm_X85OYwME,6816,pandas.errors.InvalidIndexError is raised in some runs when using chunks and map_blocks(),691772,closed,0,,,5,2022-07-22T14:56:41Z,2022-09-13T09:39:48Z,2022-08-19T14:06:09Z,CONTRIBUTOR,,,,"### What is your issue?

I'm doing a lengthy computation, which involves hundreds of GB of data using chunks and map_blocks() so that things fit into RAM and can be done in parallel. From time to time, the following error is raised:

`pandas.errors.InvalidIndexError: Reindexing only valid with uniquely valued Index objects`

The line where this takes place looks pretty harmless:

    x = a * b.sel(c=d.c)

It's a line inside the function `func` which is passed to a `map_blocks()` call. In this case `a` and `b` are `xr.DataArray` or `xr.Dataset` objects shadowed from outer scope and `d` is the parameter `obj` for `map_blocks()`.

That means the line below it in the traceback looks like this:

    xr.map_blocks(
        lambda d: worker(d).compute().chunk({""time"": None}),
        d,
        template=template)

I guess it's some kind of race condition, since it's not 100% reproducible, but I have no idea how to further investigate the issue to create a proper bug report or fix my code.

Do you have any hint how I could continue building a minimal example or so in such a case?
What does the error message want to tell me?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6816/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
438389323,MDU6SXNzdWU0MzgzODkzMjM=,2928,"Dask outputs warning: ""The da.atop function has moved to da.blockwise""",691772,closed,0,,,4,2019-04-29T15:59:31Z,2019-07-12T15:56:29Z,2019-07-12T15:56:28Z,CONTRIBUTOR,,,,"#### Problem description

[dask 1.1.0](https://github.com/dask/dask/pull/4348) moved `atop()` to `blockwise()` and introduced a warning when `atop()` is used.

#### Related

* upstream ticket and PR of dask change: dask/dask#4348 dask/dask#4035
* the warning in the [dask documentation](https://examples.dask.org/xarray.html#Custom-workflows-and-automatic-parallelization) in an xarray example, probably not on purpose
* warnings have already been discussed in #2727, but not fixed there
* same issue in a different project: pytroll/satpy#608

#### Code Sample

```python
import numpy as np
import xarray as xr

xr.DataArray(np.ones(1000))
d = xr.DataArray(np.ones(1000))
d.to_netcdf('/tmp/ones.nc')
d = xr.open_dataarray('/tmp/ones.nc', chunks=10)
xr.apply_ufunc(lambda x: 42 * x, d, dask='parallelized', output_dtypes=[np.float64])
```

This outputs the warning:

```
...lib/python3.7/site-packages/dask/array/blockwise.py:204: UserWarning: The da.atop function has moved to da.blockwise
  warnings.warn(""The da.atop function has moved to da.blockwise"")
```

#### Expected Output

No warning. For a user of a recent version of dask and xarray, there shouldn't be any warnings if everything is done right. The warning should be tackled inside xarray somehow.

#### Solution

Not sure, can xarray break compatibility with dask <1.1.0 with some future version? Otherwise I guess there needs to be some legacy code in xarray which calls the right function.

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-17-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1
xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.3
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 1.2.0
distributed: 1.27.0
matplotlib: 3.0.3
cartopy: None
seaborn: 0.9.0
setuptools: 41.0.0
pip: 19.1
conda: None
pytest: 4.4.1
IPython: 7.5.0
sphinx: None
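
#### Possible shape of the legacy code

To make the idea from the Solution section concrete, here is a rough sketch of a compatibility import that prefers `blockwise()` and only falls back to `atop()` on older dask. This is not taken from xarray's code base and may differ from whatever xarray ends up doing; the fallback assumes that `atop()` accepts the same arguments as `blockwise()`, and the array `x` and the lambda are placeholders for illustration only.

```python
import numpy as np
import dask.array as da

# Prefer the current name; fall back to the old name on dask < 1.1.0.
try:
    from dask.array import blockwise
except ImportError:
    # Assumption: atop() takes the same arguments as blockwise().
    from dask.array import atop as blockwise

# Placeholder input: 1000 ones split into chunks of 10 elements.
x = da.ones(1000, chunks=10)

# Apply the function block by block; 'i' labels the single dimension
# of both the output and the input.
y = blockwise(lambda block: 42 * block, 'i', x, 'i', dtype=np.float64)
result = y.compute()
```

With an import like this, the calling code uses a single name and the deprecated one is only touched when the new one does not exist, so recent dask versions should not emit the warning.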
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2928/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue