html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/5065#issuecomment-826913149,https://api.github.com/repos/pydata/xarray/issues/5065,826913149,MDEyOklzc3VlQ29tbWVudDgyNjkxMzE0OQ==,1197350,2021-04-26T15:08:43Z,2021-04-26T15:08:43Z,MEMBER,I think this PR has received a very thorough review. I would be pleased if someone from @pydata/xarray would merge it soon.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-826888674,https://api.github.com/repos/pydata/xarray/issues/5065,826888674,MDEyOklzc3VlQ29tbWVudDgyNjg4ODY3NA==,1197350,2021-04-26T14:38:49Z,2021-04-26T14:38:49Z,MEMBER,"The pre-commit workflow is raising a blackdoc error I am not seeing in my local env ```diff diff --git a/doc/internals/duck-arrays-integration.rst b/doc/internals/duck-arrays-integration.rst index eb5c4d8..2bc3c1f 100644 --- a/doc/internals/duck-arrays-integration.rst +++ b/doc/internals/duck-arrays-integration.rst @@ -25,7 +25,7 @@ argument: ... def _repr_inline_(self, max_width): - """""" format to a single line with at most max_width characters """""" + """"""format to a single line with at most max_width characters"""""" ... ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-817990859,https://api.github.com/repos/pydata/xarray/issues/5065,817990859,MDEyOklzc3VlQ29tbWVudDgxNzk5MDg1OQ==,1197350,2021-04-12T17:27:28Z,2021-04-12T17:27:28Z,MEMBER,Any further feedback on this now reduced-scope PR? Merging this would be helpful for moving forward Pangeo forge.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-815019613,https://api.github.com/repos/pydata/xarray/issues/5065,815019613,MDEyOklzc3VlQ29tbWVudDgxNTAxOTYxMw==,1197350,2021-04-07T15:44:25Z,2021-04-07T15:44:25Z,MEMBER,"I have removed the controversial `encoding['chunks']` stuff from the PR. Now it only contains the `safe_chunks` option in `to_zarr`. If there are no further comments on this, I think this is good to go.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-814102743,https://api.github.com/repos/pydata/xarray/issues/5065,814102743,MDEyOklzc3VlQ29tbWVudDgxNDEwMjc0Mw==,1197350,2021-04-06T13:03:53Z,2021-04-06T13:03:53Z,MEMBER,"We seem to be unable to resolve the complexities around chunk encoding. I propose to remove this from the PR and reduce the scope to just the `safe_chunks` features. @aurghs should probably be the one to tackle the chunk encoding problem; unfortunately it exceeds my understanding, and I don't have time to dig deeper at the moment. In the meantime `safe_chunks` is important for pangeo-forge forward progress. Please give a 👍 or 👎 to this idea if you have an opinion.","{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-811975731,https://api.github.com/repos/pydata/xarray/issues/5065,811975731,MDEyOklzc3VlQ29tbWVudDgxMTk3NTczMQ==,1197350,2021-04-01T15:12:15Z,2021-04-01T15:12:15Z,MEMBER,"> But it seems to me that having two different definitions of chunks (dask one and encoded one), is not very intuitive and it's not easy to define a clear default in writing. My use for `encoding.chunks` is to tell Zarr what chunks to use on disk. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-811308284,https://api.github.com/repos/pydata/xarray/issues/5065,811308284,MDEyOklzc3VlQ29tbWVudDgxMTMwODI4NA==,1197350,2021-03-31T18:23:03Z,2021-03-31T18:23:03Z,MEMBER,So any ideas how to proceed? 🧐,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-811275436,https://api.github.com/repos/pydata/xarray/issues/5065,811275436,MDEyOklzc3VlQ29tbWVudDgxMTI3NTQzNg==,1197350,2021-03-31T17:31:53Z,2021-03-31T17:32:12Z,MEMBER,"A just pushed a new commit which deletes all encoding inside `variable.chunk()`. But as you will see when the CI finishes, this leads to a lot of test failures. For example: ``` =============================================================================== FAILURES ================================================================================ ____________________________________________________ TestNetCDF4ViaDaskData.test_roundtrip_string_encoded_characters ____________________________________________________ self = def test_roundtrip_string_encoded_characters(self): expected = Dataset({""x"": (""t"", [""ab"", ""cdef""])}) expected[""x""].encoding[""dtype""] = ""S1"" with self.roundtrip(expected) as actual: assert_identical(expected, actual) > assert actual[""x""].encoding[""_Encoding""] == ""utf-8"" E KeyError: '_Encoding' /Users/rpa/Code/xarray/xarray/tests/test_backends.py:485: KeyError ``` Why is `chunk` getting called here? Does it actually get called every time we load a dataset with chunks? If so, we will need a more sophisticated solution.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-811265134,https://api.github.com/repos/pydata/xarray/issues/5065,811265134,MDEyOklzc3VlQ29tbWVudDgxMTI2NTEzNA==,1197350,2021-03-31T17:17:07Z,2021-03-31T17:17:07Z,MEMBER,"> Replace `self._encoding` with `None` here? Thanks! Yeah that's what I had in mind. But I was wondering if there was an example of doing that it else I could copy. In any case, I'll give it a try now.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-811189539,https://api.github.com/repos/pydata/xarray/issues/5065,811189539,MDEyOklzc3VlQ29tbWVudDgxMTE4OTUzOQ==,1197350,2021-03-31T16:12:13Z,2021-03-31T16:12:23Z,MEMBER,"In today's dev call, we proposed to handle encoding in `chunk` the same way we handle it in indexing: by deleting all encoding. The problem is, I can't figure out where this happens. Can someone point me to the place in the code where indexing operations delete encoding? A related question: I discovered this encoding option `preferred_chunks`, which is treated specially: https://github.com/pydata/xarray/blob/57a4479fcd3ebc579cf00e0d6bf85007eda44b56/xarray/core/dataset.py#L396 Should the Zarr backend be setting this? ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-811148122,https://api.github.com/repos/pydata/xarray/issues/5065,811148122,MDEyOklzc3VlQ29tbWVudDgxMTE0ODEyMg==,1197350,2021-03-31T15:16:37Z,2021-03-31T15:16:37Z,MEMBER,"I appreciate the discussion on this PR. Does anyone have a concrete suggestion of what to do? If we are not in agreement about the encoding stuff, perhaps I should remove that and just move forward with the `safe_chunks` part of this PR?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-807128780,https://api.github.com/repos/pydata/xarray/issues/5065,807128780,MDEyOklzc3VlQ29tbWVudDgwNzEyODc4MA==,1197350,2021-03-25T17:19:15Z,2021-03-25T17:19:15Z,MEMBER,"> Perhaps a kwarg in `to_zarr` like `ignore_encoding_chunks`? I would argue that this is unnecessary. If you want to explicitly drop encoding, just `del da.encoding['chunks']` before writing. But most users don't figure out that they should do this, because the default behavior is counterintuitive. The problem here is with the default behavior of propagating chunk encoding through computations when it no longer makes sense. My example with the `dtype` encoding illustrates that we already drop encoding on certain operations, so it's not unprecedented. It's more of an implementation question: where and how to do the dropping. FWIW, I would also favor dropping `encoding['chunks']` after indexing, coarsening, interpolating, etc. Basically anything that changes the array shape or chunk structure.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-806724345,https://api.github.com/repos/pydata/xarray/issues/5065,806724345,MDEyOklzc3VlQ29tbWVudDgwNjcyNDM0NQ==,1197350,2021-03-25T13:17:03Z,2021-03-25T13:17:59Z,MEMBER,"I see your point. I guess I don't fully understand where else in the code path encoding gets dropped. Consider this example ```python import xarray as xr ds = xr.Dataset({'foo': ('time', [1, 1], {'dtype': 'int16'})}) ds = xr.decode_cf(ds).compute() assert ""dtype"" in ds.foo.encoding assert ""dtype"" not in (0.5 * ds.foo).encoding ``` Xarray knows to drop the `dtype` encoding after an arithmetic operation. How does that work? To me `.chunk` feel like a similar case: an operation that invalidates any existing encoding.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-804050169,https://api.github.com/repos/pydata/xarray/issues/5065,804050169,MDEyOklzc3VlQ29tbWVudDgwNDA1MDE2OQ==,1197350,2021-03-22T13:12:45Z,2021-03-22T13:12:45Z,MEMBER,"Thanks Anderson. Fixed by rebasing. Now RTD build is failing, but there is no obvious error in the logs...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943 https://github.com/pydata/xarray/pull/5065#issuecomment-803712024,https://api.github.com/repos/pydata/xarray/issues/5065,803712024,MDEyOklzc3VlQ29tbWVudDgwMzcxMjAyNA==,1197350,2021-03-22T01:58:23Z,2021-03-22T02:02:00Z,MEMBER,"Confused about the test error. It seems unrelated. In `test_sparse.py:test_variable_method` ``` E TypeError: no implementation found for 'numpy.allclose' on types that implement __array_function__: [, ] ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,837243943