issue_comments: 573196874


html_url: https://github.com/pydata/xarray/issues/3668#issuecomment-573196874
issue_url: https://api.github.com/repos/pydata/xarray/issues/3668
id: 573196874
node_id: MDEyOklzc3VlQ29tbWVudDU3MzE5Njg3NA==
user: 2443309
created_at: 2020-01-10T20:40:14Z
updated_at: 2020-01-10T20:40:14Z
author_association: MEMBER

> The scenario you are describing--trying to open a file that is not accessible at all from the client--is certainly not something we ever considered when designing this. It is a miracle to me that it does work with netCDF.

True. I think it's fair to say that the behavior you are enjoying (accessing data that the client cannot see) is the exception, not the rule. I expect there are many places in our backends that do not support this functionality at present.

The motivation for implementing the parallel feature was simply to shard the file I/O time when opening large collections (>10k) of netCDF files.
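
For context, a rough sketch of how that parallel path gets used (the file pattern below is just a placeholder):

```python
# Minimal sketch of the parallel open path described above; the glob
# pattern is hypothetical. With parallel=True, each per-file open and
# metadata read is wrapped in dask.delayed and run on the workers.
import xarray as xr

ds = xr.open_mfdataset(
    "data/*.nc",          # placeholder collection of netCDF files
    combine="by_coords",  # combine files using their coordinate values
    parallel=True,        # open files concurrently via dask.delayed
)
```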

Ironically, this dask issue also popped up and has some significant overlap here: https://github.com/dask/dask/issues/5769

In both of these cases, the desire is for the worker to open the file (or zarr dataset), construct the underlying dask arrays, and return the meta object. This requires the object to be fully picklable and for any references to be maintained. It is possible, as indicated by your traceback, that the zarr backend is trying to reference the .zgroup file and it's not there. The logical place to start would be to look into why we can't pickle xarray datasets that come from zarr stores.
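
As a starting point for that investigation, something like the following sketch (the store path is just a placeholder) exercises the pickle round trip in isolation:

```python
# Small reproduction sketch for the pickling question above; the store
# path is hypothetical. If this raises, the traceback should point at
# which backend object (e.g. a zarr group reference) is not picklable.
import pickle
import xarray as xr

ds = xr.open_zarr("store.zarr")            # placeholder zarr store
restored = pickle.loads(pickle.dumps(ds))  # round-trip through pickle
print(restored)
```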

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

issue: 546562676