issue_comments: 417252006
html_url: https://github.com/pydata/xarray/issues/2389#issuecomment-417252006
issue_url: https://api.github.com/repos/pydata/xarray/issues/2389
id: 417252006
node_id: MDEyOklzc3VlQ29tbWVudDQxNzI1MjAwNg==
user: 1882397
created_at: 2018-08-30T09:23:20Z
updated_at: 2018-08-30T09:48:40Z
author_association: NONE
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
performed_via_github_app:
issue: 355264812

body:

It seems the xarray object that is sent to the workers contains a reference to the complete graph:

```python
import pickle
import dask
import dask.array as da
import xarray as xr

vals = da.random.random((5, 1), chunks=(1, 1))
ds = xr.Dataset({'vals': (['a', 'b'], vals)})
write = ds.to_netcdf('file2.nc', compute=False)

# Find the NetCDF writer wrapper in the task graph.
key = [val for val in write.dask.keys()
       if isinstance(val, str) and val.startswith('NetCDF')][0]
wrapper = write.dask[key]
len(pickle.dumps(wrapper))  # 14652

# The wrapper's datastore holds a reference back to the delayed store,
# i.e. to the complete graph.
delayed_store = wrapper.datastore.delayed_store
len(pickle.dumps(delayed_store))  # 14652
dask.visualize(delayed_store)
```

The size jumps to 1.3 MB if I use 500 chunks again. The warning about the large object in the graph disappears if we delete that reference before we execute the graph:
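A minimal sketch of that deletion, assuming the back-reference found above can simply be removed before the graph is executed (the attribute chain `wrapper.datastore.delayed_store` comes from the inspection above; the rest is illustrative and not necessarily the author's exact code):

```python
# Sketch: drop the back-reference so the writer wrapper no longer
# carries the complete graph into each task.
del wrapper.datastore.delayed_store

len(pickle.dumps(wrapper))  # should now be much smaller
write.compute()             # should run without the large-object warning
```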