issue_comments: 417252006
| html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | issue |
|---|---|---|---|---|---|---|---|---|
| https://github.com/pydata/xarray/issues/2389#issuecomment-417252006 | https://api.github.com/repos/pydata/xarray/issues/2389 | 417252006 | MDEyOklzc3VlQ29tbWVudDQxNzI1MjAwNg== | 1882397 | 2018-08-30T09:23:20Z | 2018-08-30T09:48:40Z | NONE | 355264812 |

It seems the xarray object that is sent to the workers contains a reference to the complete graph:

```python
import pickle

import dask
import dask.array as da
import xarray as xr

vals = da.random.random((5, 1), chunks=(1, 1))
ds = xr.Dataset({'vals': (['a', 'b'], vals)})
write = ds.to_netcdf('file2.nc', compute=False)

# Find the NetCDF store wrapper that ended up in the task graph
key = [val for val in write.dask.keys()
       if isinstance(val, str) and val.startswith('NetCDF')][0]
wrapper = write.dask[key]

len(pickle.dumps(wrapper))        # 14652
delayed_store = wrapper.datastore.delayed_store
len(pickle.dumps(delayed_store))  # 14652
dask.visualize(delayed_store)
```

The wrapper pickles to the same size as `delayed_store`, because it carries the entire graph along with it.
The size jumps to 1.3 MB if I use 500 chunks again. The warning about the large object in the graph disappears if we delete that reference before we execute the graph:
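A minimal sketch of what that could look like, reusing `wrapper` and `write` from the snippet above (whether the reference should be `del`-eted or set to `None` is an assumption, not the original comment's exact code):

```python
# Sketch only: drop the back-reference so the wrapper no longer
# drags the full graph into every serialized task.
wrapper.datastore.delayed_store = None

len(pickle.dumps(wrapper))  # expected to shrink, graph reference gone
write.compute()             # per the comment, no large-object warning now
```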
