issue_comments: 257292279
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/798#issuecomment-257292279 | https://api.github.com/repos/pydata/xarray/issues/798 | 257292279 | MDEyOklzc3VlQ29tbWVudDI1NzI5MjI3OQ== | 306380 | 2016-10-31T13:24:01Z | 2016-10-31T14:49:31Z | MEMBER | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | 142498006 |

body:

I may have a solution to this in https://github.com/dask/distributed/pull/606, which allows custom serialization formats to be registered with dask.distributed. We would register serialize and deserialize functions for the various netCDF objects. Something like the following might work for h5py:

``` python
import h5py
from distributed.protocol import register_serialization  # serialization hook proposed in dask/distributed#606

def serialize_dataset(dset):
    # Send only the file path and the dataset's path within the file;
    # the HDF5 file itself is reopened on the receiving worker.
    header = {}
    frames = [dset.file.filename.encode(), dset.name.encode()]
    return header, frames

def deserialize_dataset(header, frames):
    filename, datapath = frames
    f = h5py.File(filename.decode(), "r")
    dset = f[datapath.decode()]
    return dset

register_serialization(h5py.Dataset, serialize_dataset, deserialize_dataset)
```

We would still have lingering open files, but not too many per machine. They'll move around the network, but only as necessary.
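
For context, a minimal round-trip sketch of how such a registration would be exercised, assuming the `serialize`/`deserialize` entry points in `distributed.protocol` and a hypothetical file `data.h5` containing a dataset at `/x`:

``` python
import h5py
from distributed.protocol import serialize, deserialize

# Sending side: with the registration above in effect, only the two
# path strings (file name and in-file dataset path) end up in the frames.
with h5py.File("data.h5", "r") as f:
    header, frames = serialize(f["/x"])

# Receiving side: the file is reopened from the frames and the dataset looked up.
dset = deserialize(header, frames)
print(dset.shape)
```

Each worker that deserializes the dataset ends up holding its own open file handle, which is what the closing remark about lingering open files refers to.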