issues: 473692721
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
473692721 | MDU6SXNzdWU0NzM2OTI3MjE= | 3165 | rolling: bottleneck still not working properly with dask arrays | 13084427 | open | 0 | 10 | 2019-07-28T00:58:10Z | 2021-03-01T20:39:43Z | NONE | MCVE Code Sample
```python
import numpy as np
import xarray as xr
from dask.distributed import Client

# 5000 x 50000 array of zeros, chunked along "y" only
temp = xr.DataArray(np.zeros((5000, 50000)), dims=("x", "y")).chunk({"y": 100})
# Rolling mean over a 100-element window along "x"
temp.rolling(x=100).mean()
```
Expected Output
Problem Description
I was thrilled to find that the new releases (both 0.12.2 and 0.12.3) fixed the rolling window issue. However, when I tried them, the problem still seems to be there. Previously, the above code ran with bottleneck installed. With the new versions, with or without bottleneck, it simply gives the memory error shown below. I have tried both old and new versions of Dask and pandas, but it made little difference. However, the Dask DataFrame version of the code (shown below) runs fine.
```python
import dask.dataframe as dd
import dask.array as da
import numpy as np

# Same shape as above, chunked along the second axis only
da_array = da.from_array(np.zeros((5000, 50000)), chunks=(5000, 100))
df = dd.from_dask_array(da_array)
df.rolling(window=100, axis=0).mean()
```
I have also tried to apply the same operation to a dataset loaded from netCDF files; it simply started consuming a very large portion of memory and gave similar errors. Any help is appreciated.
Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/3165/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
13221727 | issue |