html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/4591#issuecomment-871611079,https://api.github.com/repos/pydata/xarray/issues/4591,871611079,MDEyOklzc3VlQ29tbWVudDg3MTYxMTA3OQ==,463809,2021-06-30T17:53:54Z,2021-06-30T17:53:54Z,CONTRIBUTOR,"I am trying to use `worker_client` that is opening xarrays, submitting further compute, and then saving xarrays. Perhaps somehow related to that?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4591#issuecomment-870152019,https://api.github.com/repos/pydata/xarray/issues/4591,870152019,MDEyOklzc3VlQ29tbWVudDg3MDE1MjAxOQ==,463809,2021-06-29T01:10:30Z,2021-06-29T01:14:58Z,CONTRIBUTOR,"This issue appears to be back in some form, with `engine=zarr`. The code looks like this, using fsspec's mapper API to access Azure blob store: ``` fs = fsspec.filesystem(""az://..."") ds = xr.open_dataset(fs.get_mapper(path), engine=""zarr"", chunks=""auto""): ... ``` I have not tracked down a self-contained reproducer, as it only fails for one call but not others of a similar form. Reporting it while I dig into it further, in case you have any suggestions. 
``` [2021-06-29 00:44:47] [2021-06-29 00:44:47 core.py:74 CRITICAL] Failed to Serialize [2021-06-29 00:44:47] Traceback (most recent call last): [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/distributed/protocol/core.py"", line 70, in dumps [2021-06-29 00:44:47] frames[0] = msgpack.dumps(msg, default=_encode_default, use_bin_type=True) [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/msgpack/__init__.py"", line 35, in packb [2021-06-29 00:44:47] return Packer(**kwargs).pack(o) [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 286, in msgpack._cmsgpack.Packer.pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 292, in msgpack._cmsgpack.Packer.pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 289, in msgpack._cmsgpack.Packer.pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 258, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 225, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 225, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 258, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 225, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 225, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 225, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""msgpack/_packer.pyx"", line 279, in msgpack._cmsgpack.Packer._pack [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/distributed/protocol/core.py"", line 56, in _encode_default [2021-06-29 00:44:47] obj, serializers=serializers, on_error=on_error, context=context [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/distributed/protocol/serialize.py"", line 422, in serialize_and_split [2021-06-29 00:44:47] header, frames = 
serialize(x, serializers, on_error, context) [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/distributed/protocol/serialize.py"", line 256, in serialize [2021-06-29 00:44:47] iterate_collection=True, [2021-06-29 00:44:47] File ""/deps/envs/deps/lib/python3.7/site-packages/distributed/protocol/serialize.py"", line 348, in serialize [2021-06-29 00:44:47] raise TypeError(msg, str(x)[:10000]) [2021-06-29 00:44:47] TypeError: ('Could not serialize object of type ImplicitToExplicitIndexingAdapter.', 'ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyIndexedArray(array=, key=BasicIndexer((slice(None, None, None), slice(None, None, None))))))') [2021-06-29 00:44:47] [2021-06-29 00:44:47 utils.py:37 ERROR] ('Could not serialize object of type ImplicitToExplicitIndexingAdapter.', 'ImplicitToExplicitIndexingAdapter(array=CopyOnWriteArray(array=LazilyIndexedArray(array=, key=BasicIndexer((slice(None, None, None), slice(None, None, None))))))') ``` ``` pip list | grep 'dask\|distributed\|xarray\|zarr\|msgpack\|adlfs' adlfs 0.7.7 dask 2021.6.2 distributed 2021.6.2 msgpack 1.0.0 xarray 0.18.2 zarr 2.8.3 ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,745801652 https://github.com/pydata/xarray/issues/4826#issuecomment-766973931,https://api.github.com/repos/pydata/xarray/issues/4826,766973931,MDEyOklzc3VlQ29tbWVudDc2Njk3MzkzMQ==,463809,2021-01-25T17:19:19Z,2021-01-25T17:19:19Z,CONTRIBUTOR,"Tagging a few maintainers: @dcherian @shoyer. Sorry to tag you directly, hope that's ok. I think I've found the issue here and would like to provide a PR to fix it, but need some input on what you think would be best. To summarize, the current behavior leading to the bug is: 1. When a `bool` dtype is initially written, the `maybe_encode_bool` function is used to convert the bool to an `i1` with a `vars.attr` that says it is actually a bool. 
It would appear from #2937 that it ends up as an `i8` somehow anyway. https://github.com/pydata/xarray/blob/cc53a77ff0c8aaf8686f0b0bd7f75985b74e2054/xarray/conventions.py#L119 2. When this is loaded the first time, the `i8` is correctly identified as actually being a bool using the attributes. https://github.com/pydata/xarray/blob/cc53a77ff0c8aaf8686f0b0bd7f75985b74e2054/xarray/conventions.py#L352 3. However, 2 lines above that, there is an `encoding.setdefault(""dtype"", original_dtype)`, so this Variable object now has `.encoding[""dtype""]`, which is `i8`. 4. When I try to save this again, it tries `maybe_encode_bool` from step 1 again, but this time the function is bypassed because of step 3 above. 5. The dataset I write from step 4 now does not have the attribute identifying it as a bool, and so it's an `i8` when I load it back. I can think of a few fixes: - Drop the `.encoding` dict on load for bools. Presumably these `.encoding` dicts are kept so that a dataset attempts to preserve the compressor, chunk size, etc. of its source. However, given that `.encoding` seems to be dropped when I e.g. do a `.astype(""bool"")`, maybe this is OK for bools. - Set `var.attrs[""dtype""] = np.dtype(""bool"")` on load. This would preserve what we'd get out of `maybe_encode_bool`. The challenge with this is that `.attrs` are not always preserved? - Change `maybe_encode_bool` so that it is not skipped when `.encoding[""dtype""]` exists. I suppose there would be a need to check that the compressor is still compatible, etc? 
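A minimal sketch of steps 1-5, written as a simplified pure-Python model and not as xarray's actual code. The `encode` and `decode` helpers and the dict layout are hypothetical stand-ins for the write and read paths, and the stored integer type is modelled as `i1` throughout (it does not try to reproduce the `i8` I see on disk):

```python
# Simplified pure-Python model of the bool round trip; NOT xarray's actual code.
# A variable is modelled as a dict with 'dtype', 'attrs' and 'encoding' keys.


def encode(var):
    # Hypothetical model of the write path (steps 1 and 4).
    out = {'dtype': var['dtype'], 'attrs': dict(var['attrs'])}
    enc = var['encoding']
    if var['dtype'] == 'bool' and 'dtype' not in enc and 'dtype' not in out['attrs']:
        # Bool is stored as a small integer plus a marker attribute.
        out['dtype'] = 'i1'
        out['attrs']['dtype'] = 'bool'
    elif 'dtype' in enc:
        # An existing encoding dtype wins and no marker attribute is written.
        out['dtype'] = enc['dtype']
    return out


def decode(stored):
    # Hypothetical model of the read path (steps 2 and 3).
    var = {'dtype': stored['dtype'], 'attrs': dict(stored['attrs']), 'encoding': {}}
    var['encoding'].setdefault('dtype', stored['dtype'])  # step 3
    if var['attrs'].get('dtype') == 'bool':
        del var['attrs']['dtype']
        var['dtype'] = 'bool'  # step 2
    return var


original = {'dtype': 'bool', 'attrs': {}, 'encoding': {}}
first_load = decode(encode(original))      # bool survives the first round trip
second_load = decode(encode(first_load))   # steps 4 and 5: marker attribute is lost

# Workaround: drop the stale dtype from the encoding before writing again.
patched = dict(first_load, encoding={})
fixed_load = decode(encode(patched))

print(first_load['dtype'], second_load['dtype'], fixed_load['dtype'])
# prints: bool i1 bool
```

With real xarray objects the analogous step would presumably be removing the `dtype` key from the variable's `.encoding` dict before saving, at the cost of not preserving that part of the source encoding.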
As a local fix while we consider these options, can you confirm that, as the docs state, the `.encoding` is only used for *serializing* and not *deserializing* arrays, and therefore if I drop the `.encoding` on my `DataArrays` as a temporary fix I wouldn't break anything (if I am ok with xarray not preserving my compressors, etc)?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,789410367 https://github.com/pydata/xarray/issues/4826#issuecomment-765747736,https://api.github.com/repos/pydata/xarray/issues/4826,765747736,MDEyOklzc3VlQ29tbWVudDc2NTc0NzczNg==,463809,2021-01-22T23:37:47Z,2021-01-22T23:38:22Z,CONTRIBUTOR,"OK, here's the other side of the problem. The original dtype (which is i8) is set in the encoding: https://github.com/pydata/xarray/blob/v0.16.2/xarray/conventions.py#L350","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,789410367 https://github.com/pydata/xarray/issues/4826#issuecomment-765741719,https://api.github.com/repos/pydata/xarray/issues/4826,765741719,MDEyOklzc3VlQ29tbWVudDc2NTc0MTcxOQ==,463809,2021-01-22T23:27:57Z,2021-01-22T23:27:57Z,CONTRIBUTOR,"Apparently my proposed fix broke a bunch of other things, e.g. some writing of timedeltas with units and such. Deleting the ""dtype"" key in the `.encoding` of the boolean variable also seems to do the trick. 
The issue is that bools are not encoded correctly if the `.encoding` field already has a `dtype`: https://github.com/pydata/xarray/blob/v0.16.2/xarray/conventions.py#L119","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,789410367 https://github.com/pydata/xarray/pull/3504#issuecomment-552160158,https://api.github.com/repos/pydata/xarray/issues/3504,552160158,MDEyOklzc3VlQ29tbWVudDU1MjE2MDE1OA==,463809,2019-11-10T03:58:53Z,2019-11-10T03:58:53Z,CONTRIBUTOR,"Thanks @max-sixty, I made a small update to the error message. I had added a line to `doc/whats-new.rst` in 2e91693; is that what you were referring to?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,520507183