html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/4591#issuecomment-871611079,https://api.github.com/repos/pydata/xarray/issues/4591,871611079,MDEyOklzc3VlQ29tbWVudDg3MTYxMTA3OQ==,463809,2021-06-30T17:53:54Z,2021-06-30T17:53:54Z,CONTRIBUTOR,"I am trying to use `worker_client` that is opening xarrays, submitting further compute, and then saving xarrays. Perhaps somehow related to that?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-870777725,https://api.github.com/repos/pydata/xarray/issues/4591,870777725,MDEyOklzc3VlQ29tbWVudDg3MDc3NzcyNQ==,6042212,2021-06-29T17:20:43Z,2021-06-29T17:20:43Z,CONTRIBUTOR,"I only have vague thoughts. To be sure: you can pickle the file-system, any mapper (`.get_mapper()`) and any open file (`.open()`), right? The question here is, why msgpack is being invoked. Those items, as well as any internal xarray stuff should only be in tasks, and so pickled. Is there a high-level-graph layer encapsulating things that were previously pickled? The only things that appear in any HLG-layer should be the paths and storage options needed to open a file-system, not the file-system itself.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-870152019,https://api.github.com/repos/pydata/xarray/issues/4591,870152019,MDEyOklzc3VlQ29tbWVudDg3MDE1MjAxOQ==,463809,2021-06-29T01:10:30Z,2021-06-29T01:14:58Z,CONTRIBUTOR,"This issue appears to be back in some form, with `engine=zarr`. 
The code looks like this, using fsspec's mapper API to access Azure blob store: ``` fs = fsspec.filesystem(""az://..."") ds = xr.open_dataset(fs.get_mapper(path), engine=""zarr"", chunks=""auto""): ... ``` I have not tracked down a self-contained reproducer, as it only fails for one call but not others of a similar form. Reporting it while I dig into it further, in case you have any suggestions. ``` [2021-06-29 00:44:47] [2021-06-29 00:44:47 core.py:74 CRITICAL] Failed to Serialize [2021-06-29 00:44:47] Traceback (most recent call last): [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/distributed/protocol/core.py"", line 70, in dumps [2021-06-29 00:44:47] frames[0] = msgpack.dumps(msg, default=_encode_default, use_bin_type=True) [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/msgpack/__init__.py"", line 35, in packb [2021-06-29 00:44:47] return Packer(**kwargs).pack(o) [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 286, in msgpack._cmsgpack.Packer.pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 292, in msgpack._cmsgpack.Packer.pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 289, in msgpack._cmsgpack.Packer.pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 258, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 225, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 225, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 258, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 225, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 225, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 225, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 279, in msgpack._cmsgpack.Packer._pack [2021-06-29 
00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/distributed/protocol/core.py"", line 56, in _encode_default [2021-06-29 00:44:47] obj, serializers=serializers, on_error=on_error, context=context [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/distributed/protocol/serialize.py"", line 422, in serialize_and_split [2021-06-29 00:44:47] header, frames = serialize(x, serializers, on_error, context) [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/distributed/protocol/serialize.py"", line 256, in serialize [2021-06-29 00:44:47] iterate_collection=True, [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/distributed/protocol/serialize.py"", line 348, in serialize [2021-06-29 00:44:47] raise TypeError(msg, str(x)[:10000]) [2021-06-29 00:44:47] TypeError: ('Could not serialize object of type ImplicitToExplicitIndexingAdapter.', 'ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyIndexedArray(array=, key=BasicIndexer((slice(None, None, None), slice(None, None, None))))))') [2021-06-29 00:44:47] [2021-06-29 00:44:47 utils.py:37 ERROR] ('Could not serialize object of type ImplicitToExplicitIndexingAdapter.', 'ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyIndexedArray(array=, key=BasicIndexer((slice(None, None, None), slice(None, None, None))))))') ``` ``` pip list | grep 'dask\|distributed\|xarray\|zarr\|msgpack\|adlfs' adlfs 0.7.7 dask 2021.6.2 distributed 2021.6.2 msgpack 1.0.0 xarray 0.18.2 zarr 2.8.3 ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-730467523,https://api.github.com/repos/pydata/xarray/issues/4591,730467523,MDEyOklzc3VlQ29tbWVudDczMDQ2NzUyMw==,1197350,2020-11-19T15:54:38Z,2020-11-19T15:54:38Z,MEMBER,"This is fixed by intake/filesystem_spec#477. 
However, the existence of this issue points to the need for more ecosystem-wide integration testing of xarray / dask / zarr / fsspec. I know we discussed this on some other issue, but I can't find it...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-730031761,https://api.github.com/repos/pydata/xarray/issues/4591,730031761,MDEyOklzc3VlQ29tbWVudDczMDAzMTc2MQ==,1217238,2020-11-18T23:56:17Z,2020-11-18T23:56:17Z,MEMBER,"OK, I think I understand what's going on. Xarray serializes arguments that should suffice to recreate/open a backend-specific file object (e.g., `h5netcdf.File`). So if you pass in a file name to `open_dataset()`, that works fine. But if you pass in a file-like object (as is done here with `fsspec`) the file-like object needs to be serializable.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-729863863,https://api.github.com/repos/pydata/xarray/issues/4591,729863863,MDEyOklzc3VlQ29tbWVudDcyOTg2Mzg2Mw==,1197350,2020-11-18T18:15:16Z,2020-11-18T18:15:16Z,MEMBER,Thanks for your quick response to this Martin!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-729863434,https://api.github.com/repos/pydata/xarray/issues/4591,729863434,MDEyOklzc3VlQ29tbWVudDcyOTg2MzQzNA==,6042212,2020-11-18T18:14:28Z,2020-11-18T18:14:28Z,CONTRIBUTOR,"The `xarray.backends.h5netcdf_.H5NetCDFArrayWrapper` seems to keep a reference to the open file, which for HTTP contains the open session. 
The linked PR fixes the serialization of those files, for the HTTP case.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-729837649,https://api.github.com/repos/pydata/xarray/issues/4591,729837649,MDEyOklzc3VlQ29tbWVudDcyOTgzNzY0OQ==,1217238,2020-11-18T17:37:58Z,2020-11-18T17:37:58Z,MEMBER,`H5NetCDFArrayWrapper` is definitely supposed to be serializable with dask -- that's one of the main reasons why these array wrapper classes exist in the first place.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-729803257,https://api.github.com/repos/pydata/xarray/issues/4591,729803257,MDEyOklzc3VlQ29tbWVudDcyOTgwMzI1Nw==,6042212,2020-11-18T16:42:30Z,2020-11-18T16:42:30Z,CONTRIBUTOR,"OK, I can see a thing after all... please stand by","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-729796223,https://api.github.com/repos/pydata/xarray/issues/4591,729796223,MDEyOklzc3VlQ29tbWVudDcyOTc5NjIyMw==,1197350,2020-11-18T16:31:14Z,2020-11-18T16:31:14Z,MEMBER,Can you figure out how the http version differs from the gcs version? That might hold a clue.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-729795030,https://api.github.com/repos/pydata/xarray/issues/4591,729795030,MDEyOklzc3VlQ29tbWVudDcyOTc5NTAzMA==,6042212,2020-11-18T16:29:18Z,2020-11-18T16:29:18Z,CONTRIBUTOR,"I don't think it's fsspec, the HTTPFileSystem and file objects are known to serialise. 
However ``` >>> distributed.protocol.serialize(dsc.surface.mean().data.dask['open_dataset-27832a1f850736a8d9a11a882ad06230surface-3b6f5b6a90c2cfa65379d3bfae22126f']) ({'serializer': 'error'}, ...) ``` (that's one of the keys I picked from the graph at random, your keys may differ) I can't say why this object is in the graph where perhaps it wasn't before, but it has a reference to a ""CopyOnWriteArray"", which sounds like a buffer owned by something else and probably the non-serializable part. Digging find a contained """" which is not serializable - so maybe xarray can do something about this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-729793908,https://api.github.com/repos/pydata/xarray/issues/4591,729793908,MDEyOklzc3VlQ29tbWVudDcyOTc5MzkwOA==,1197350,2020-11-18T16:27:30Z,2020-11-18T16:27:30Z,MEMBER,"I finally found a permutation that works, which makes me think this is an fsspec error. ```python import gcsfs gcs = gcsfs.GCSFileSystem() url = 'gs://ldeo-glaciology/bedmachine/BedMachineAntarctica_2019-11-05_v01.nc' openfile = gcs.open(url, mode='rb') dsgcs = xr.open_dataset(openfile, chunks=3000) dsgcs.surface.mean().compute() ```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652