issue_comments: 257292279
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/798#issuecomment-257292279 | https://api.github.com/repos/pydata/xarray/issues/798 | 257292279 | MDEyOklzc3VlQ29tbWVudDI1NzI5MjI3OQ== | 306380 | 2016-10-31T13:24:01Z | 2016-10-31T14:49:31Z | MEMBER | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | 142498006 |

body:

I may have a solution to this in https://github.com/dask/distributed/pull/606, which allows custom serialization formats to be registered with dask.distributed. We would register serialize and deserialize functions for the various netCDF objects. Something like the following might work for h5py:

``` python
import h5py
from distributed.protocol import register_serialization  # serialization hook proposed in dask/distributed#606

def serialize_dataset(dset):
    # Send only the file path and the dataset's path within the file;
    # the HDF5 file itself is reopened on the receiving worker.
    header = {}
    frames = [dset.file.filename.encode(), dset.name.encode()]
    return header, frames

def deserialize_dataset(header, frames):
    filename, datapath = frames
    f = h5py.File(filename.decode(), "r")
    dset = f[datapath.decode()]
    return dset

register_serialization(h5py.Dataset, serialize_dataset, deserialize_dataset)
```

We would still have lingering open files, but not too many per machine. They'll move around the network, but only as necessary.
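
For context, a minimal round-trip sketch of how such a registration would be exercised, assuming the `serialize`/`deserialize` entry points in `distributed.protocol` and a hypothetical file `data.h5` containing a dataset at `/x`:

``` python
import h5py
from distributed.protocol import serialize, deserialize

# Sending side: with the registration above in effect, only the two
# path strings (file name and in-file dataset path) end up in the frames.
with h5py.File("data.h5", "r") as f:
    header, frames = serialize(f["/x"])

# Receiving side: the file is reopened from the frames and the dataset looked up.
dset = deserialize(header, frames)
print(dset.shape)
```

Each worker that deserializes the dataset ends up holding its own open file handle, which is what the closing remark about lingering open files refers to.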