html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/6812#issuecomment-1269235342,https://api.github.com/repos/pydata/xarray/issues/6812,1269235342,IC_kwDOAMm_X85Lpv6O,145117,2022-10-06T02:48:22Z,2022-10-06T02:48:22Z,CONTRIBUTOR,"A bit more detail about the existing tests that don't match the CF spec. Per the spec, `scale_factor` and `add_offset` should be of the same type. That causes tests throughout https://github.com/pydata/xarray/blob/main/xarray/tests/test_coding.py and https://github.com/pydata/xarray/blob/main/xarray/tests/test_backends.py to fail, because: https://github.com/pydata/xarray/blob/13c52b27b777709fc3316cf4334157f50904c02b/xarray/tests/test_coding.py#L112-L113 There is 1 test in `test_coding`, and 9 tests in `test_backends` that use mixed types. That's a tractable number I can fix. In addition, the expected `dtype` returned by many of the tests does not match (my interpretation of) the expected `dtype` per the CF spec. I am concerned that this is a significant change and I'm not sure what the process is for making this change. I would like to have some idea, even if not a guarantee, that it would be welcomed and accepted before doing all the work. I note that a recent other large PR to try to fix cf decoding has also stalled, and I'm not sure why (see #2751)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1309966595 https://github.com/pydata/xarray/pull/6812#issuecomment-1266136366,https://api.github.com/repos/pydata/xarray/issues/6812,1266136366,IC_kwDOAMm_X85Ld7Uu,145117,2022-10-03T22:29:28Z,2022-10-03T22:29:28Z,CONTRIBUTOR,"Hi @dcherian - I dropped this because I went down a rabbit hole that seemed very very deep. Xarray has written 10s (100s?) of tests that touch this decoding function that make assumptions that I believe are incorrect after a careful reading of the CF spec. I believe the path forward will take some conversation before coding, so perhaps this should be moved to an issue rather than a pull request? A big decision is if the decode option strictly follows CF guidelines. If so, then a lot of tests need to be changed (for example, to follow the simple rule of `[scale_factor and add_offset] must both be of type float or both be of type double`). Enforcing this would probably break `xarray` backward compatibility for writing files. I assume that that may be OK and there are processes to handle this (start with 'deprecation' warnings, then eventually throw errors?). There are also likely many NetCDF files that are not standard compliant and we need to decide how to read them. Furthermore, the CF conventions are themselves not very clear, and possibly ambiguous. I started a conversation here: https://github.com/cf-convention/cf-conventions/issues/374 on this, but that is also unresolved at the moment. The CF convention mentions `int` and `float`, but not how many bytes those are. What happens when a files is written & packed on one architecture and read & unpacked on another? ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1309966595 https://github.com/pydata/xarray/pull/6812#issuecomment-1189512813,https://api.github.com/repos/pydata/xarray/issues/6812,1189512813,IC_kwDOAMm_X85G5oZt,145117,2022-07-19T20:19:29Z,2022-07-19T20:19:29Z,CONTRIBUTOR,"I'm reading more in https://github.com/pydata/xarray/blob/2a5686c6fe855502523e495e43bd381d14191c7b/xarray/coding/variables.py and I'm confused about some logic: https://github.com/pydata/xarray/blob/2a5686c6fe855502523e495e43bd381d14191c7b/xarray/coding/variables.py#L271-L272 `pop_to` does a `pop` operation - it removes the key/value pair. So line 1 above will remove `add_offset` from `attrs` if it exists. The second line then checks for `""add_offset"" in attrs` which should always be False. I think this is happening based on inspecting with the debugger. Furthermore, the fix I implemented in this Pull Request which returns `np.float64` fixes my bug, but only because this bug exists. My dataset **has add_offset**, so the lines I changed: ```python if not has_offset: return np.float64 ``` should not run, but do run because of this issue. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1309966595 https://github.com/pydata/xarray/pull/6812#issuecomment-1189485451,https://api.github.com/repos/pydata/xarray/issues/6812,1189485451,IC_kwDOAMm_X85G5huL,145117,2022-07-19T19:46:23Z,2022-07-19T19:46:23Z,CONTRIBUTOR,"Note - I also have not run the ""Running the performance test suite"" code in https://xarray.pydata.org/en/stable/contributing.html - I assume changing from `float32` to `float64` would impact performance. I can run that if suggested.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1309966595