issues
1 row where state = "closed", type = "issue" and user = 9040990 sorted by updated_at descending
This data as json, CSV (advanced)
Suggested facets: created_at (date), updated_at (date), closed_at (date)
| id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at ▲ | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 435311415 | MDU6SXNzdWU0MzUzMTE0MTU= | 2908 | More efficient rolling with large dask arrays | emaroon 9040990 | closed | 0 | 3 | 2019-04-19T21:35:59Z | 2019-10-04T17:04:37Z | 2019-10-04T17:04:37Z | NONE | Code Sample```python import xarray as xr import dask.array as da dsize=[62,12,100,192,288] array1=da.random.random(dsize,chunks=(dsize[0],dsize[1],1,dsize[3],int(dsize[4]/2))) array2=xr.DataArray(array1) rollingmean=array2.rolling(dim_1=3,center=True).mean() # <-- this kills all workers ``` Problem descriptionI'm working on NCAR's cheyenne with a 36GB netcdf using dask_jobqueue.PBSCluster, and trying to calculate the running-mean along one dimension. Despite having plenty of memory reserved (400GB), I can watch DataArray.rolling blow up the bytes stored in the dashboard until the job hangs and all the workers are killed. The above snippet reproduces the issue with the same array size and chunksize as what I'm working with. This worker-killing behavior does not occur for arrays that are 100x smaller. I've found a speedy way to calculate what I need without using rolling, but I thought I should bring this to your attention regardless. In case it's relevant, here's how I'm setting up the dask cluster on cheyenne: ```python from dask.distributed import Client from dask_jobqueue import PBSCluster #version 0.4.1 cluster=PBSCluster(cores=36, processes=9, memory='109GB', project=myproj, resource_spec='select=1:ncpus=36:mem=109G', queue='regular', walltime='02:00:00') numnodes=4 client = Client(cluster) cluster.scale(numnodes*9) ``` Output of  | {
    "url": "https://api.github.com/repos/pydata/xarray/issues/2908/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
} | completed | xarray 13221727 | issue | 
Advanced export
JSON shape: default, array, newline-delimited, object
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);