Comment on pydata/xarray#1464 by user 1217238 (MEMBER), 2017-11-02T06:29:38Z
https://github.com/pydata/xarray/issues/1464#issuecomment-341329662

I did a little bit of digging here, using @mrocklin's `Client(processes=False)` trick.
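
For context, here's roughly what that debugging setup looks like (a minimal sketch; the dataset and file name are illustrative, not from the original report):

```python
import xarray as xr
from dask.distributed import Client

# processes=False runs the scheduler and workers in this process, so the
# serialization failure surfaces as a readable local traceback instead of
# dying inside a worker subprocess.
client = Client(processes=False)

# Illustrative chunked dataset; writing it goes through dask.array.store.
ds = xr.Dataset({"foo": ("x", [1.0, 2.0, 3.0])}).chunk({"x": 1})
ds.to_netcdf("test.nc")  # at the time, this failed while pickling the graph
```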

The problem seems to be that the arrays we add to the writer in `AbstractWritableDataStore.set_variables` are not pickleable. To be more concrete, consider these lines: https://github.com/pydata/xarray/blob/f83361c76b6aa8cdba8923080bb6b98560cf3a96/xarray/backends/common.py#L221-L232

`target` is currently a `netCDF4.Variable` object (or whatever the appropriate backend type is). Anything added to the writer eventually ends up as an argument to `dask.array.store` and hence gets put into the dask graph. When dask-distributed tries to pickle the dask graph, it fails on the `netCDF4.Variable`.
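
You can see the underlying limitation directly (a sketch, assuming a netCDF file `test.nc` containing a variable `foo`):

```python
import pickle
import netCDF4

nc = netCDF4.Dataset("test.nc")
var = nc.variables["foo"]

try:
    # netCDF4.Variable wraps an open C-level file handle and does not
    # support pickling, so anything holding one can't go in the graph.
    pickle.dumps(var)
except Exception as err:
    print("cannot pickle netCDF4.Variable:", err)
```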

What we need to do instead is wrap these target arrays in appropriate array wrappers, e.g., `NetCDF4ArrayWrapper`, adding `__setitem__` methods to the array wrappers if needed. Unlike most backend array types, our array wrappers are pickleable, which is essential for use with dask-distributed.
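
To illustrate the shape of the fix (a hypothetical sketch, not the actual `NetCDF4ArrayWrapper` implementation): the wrapper stores only picklable state, such as the file path and variable name, and opens the backend file only when a write actually happens:

```python
import netCDF4

class LazyNetCDF4Target:
    """Hypothetical picklable stand-in for a raw netCDF4.Variable target."""

    def __init__(self, filename, variable_name):
        # Only plain strings are stored, so pickling this object is trivial.
        self.filename = filename
        self.variable_name = variable_name

    def __setitem__(self, key, value):
        # Open the file on demand inside the worker; a real implementation
        # would reuse a cached handle and serialize access with a lock.
        with netCDF4.Dataset(self.filename, mode="a") as ds:
            ds.variables[self.variable_name][key] = value
```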

If anyone's curious, here's the traceback and code I used to debug this: https://gist.github.com/shoyer/4564971a4d030cd43bba8241d3b36c73

Reactions: 1 (+1: 1)