issues: 481866516
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
481866516 | MDU6SXNzdWU0ODE4NjY1MTY= | 3225 | xr.DataArray.where sets valid points to nan when using several dask chunks | 20225454 | closed | 0 | 3 | 2019-08-17T09:09:59Z | 2022-04-18T15:58:40Z | 2022-04-18T15:58:40Z | NONE | MCVE Code Sample

I am trying to randomly delete a fraction of an xr.DataArray (see identical StackOverflow question) and subsequently access only the values from the original dataset data that were deleted. This works fine as long as the data is not stored in dask arrays, or is in only one dask chunk. As soon as I define chunks smaller than the total size of the data, the original values are set to nan.

```python
import numpy as np
import xarray as xr

data = xr.DataArray(np.arange(5*5*5.).reshape(5,5,5), dims=('time','latitude','longitude'))
data.to_netcdf('/path/to/file.nc')

data = xr.open_dataarray('/path/to/file.nc', chunks={'time':5})  # produces expected output
data = xr.open_dataarray('/path/to/file.nc', chunks={'time':2})  # produces observed output

def set_fraction_randomly_to_nan(data, frac_missing):
    # Overwrites roughly frac_missing of the entries with nan, in place
    np.random.seed(0)
    data[np.random.rand(*data.shape) < frac_missing] = np.nan
    return data

data_lost = xr.apply_ufunc(set_fraction_randomly_to_nan, data.copy(deep=True),
                           output_core_dims=[['latitude','longitude']],
                           dask='parallelized',
                           input_core_dims=[['latitude','longitude']],
                           output_dtypes=[data.dtype],
                           kwargs={'frac_missing': 0.5})

print(data[0,-4:,-4:].values)
# >>
# [[ 6.  7.  8.  9.]
#  [11. 12. 13. 14.]
#  [16. 17. 18. 19.]
#  [21. 22. 23. 24.]]
print(data.where(np.isnan(data_lost),0)[0,-4:,-4:].values)
```

Expected Output

expected output of the last line: keep all values where
Problem Description

observed output of the last line: set all values where
Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3225/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
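
The helper in the code sample above, `set_fraction_randomly_to_nan`, assigns nan into its argument in place. The following is a minimal NumPy-only sketch of a variant that returns a new array instead. The name `with_fraction_set_to_nan` and its `seed` parameter are hypothetical and not part of the issue, and whether in-place mutation is related to the reported behaviour is not established here.

```python
import numpy as np


def with_fraction_set_to_nan(values, frac_missing, seed=0):
    """Return a copy of `values` with roughly `frac_missing` of entries set to nan.

    Hypothetical helper: unlike the original, it never writes into `values`.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random(values.shape) < frac_missing
    # np.where allocates a new array, so the input stays untouched.
    return np.where(mask, np.nan, values)


original = np.arange(5 * 5 * 5.0).reshape(5, 5, 5)
lost = with_fraction_set_to_nan(original, frac_missing=0.5)

# Values of the original at the positions that were deleted, 0 elsewhere.
deleted_values = np.where(np.isnan(lost), original, 0)
```

A function of this shape could in principle be passed to `xr.apply_ufunc` in place of the original helper, since it takes and returns an array of the same shape. That substitution is an untested assumption, not a confirmed fix.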