html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/5739#issuecomment-1527461082,https://api.github.com/repos/pydata/xarray/issues/5739,1527461082,IC_kwDOAMm_X85bCzTa,5821660,2023-04-28T12:00:15Z,2023-04-28T12:00:15Z,MEMBER,"@dougrichardson Sorry for the delay. If you are still interested in the source of this issue, here is what I found:
The root cause is different `scale_factor` and `add_offset` in the source files.
When merging, only the `.encoding` of the first dataset survives. This leads to a wrongly encoded file for the may dates. But why is this so?
The issue is with the packed dtype (""int16"") and the particular values of `scale_factor`/`add_offset`.
For feb the dynamic range is (228.96394336525748, 309.9690856933594) K, whereas for may it is (205.7644192729947, 311.7797088623047) K.
Since the merged data is packed with feb's `scale_factor`/`add_offset`, we can clearly see that all values above 309.969 K overflow the packed int16 range and are folded back to the lower end (just above 229 K).
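For illustration, here is a minimal sketch of the packing arithmetic (the numbers mimic the feb range above; the scale/offset computation is just the usual way of mapping a float range onto int16 and may not match your files exactly):

```python
import numpy as np

# Sketch of CF packing with feb's dynamic range.
vmin, vmax = 228.96394336525748, 309.9690856933594
scale_factor = (vmax - vmin) / (2**16 - 2)
add_offset = (vmax + vmin) / 2

# A may value above feb's maximum.
value = np.array([311.7797088623047])

# Pack: the intermediate integer (~34232) does not fit into int16
# and wraps around to a negative number.
packed = np.round((value - add_offset) / scale_factor).astype('int64').astype('int16')

# Unpack: the wrapped value decodes to roughly 230 K instead of ~311.8 K.
decoded = packed * scale_factor + add_offset
print(packed, decoded)
```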
To circumvent that, you have at least two options (both are sketched in the snippet after this list):
- change the `scale_factor` and `add_offset` values in the variable's `.encoding` before writing, so that they cover your whole dynamic range
- drop `scale_factor`/`add_offset` (and other CF-related attributes) from `.encoding` to write floating-point values
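
A rough sketch of both options (file and variable names such as `t2m` are made up for illustration; adapt them to your data):

```python
import xarray as xr

# Hypothetical file/variable names; the merged dataset keeps the
# encoding of the first file (feb) for its variables.
ds = xr.open_mfdataset(['feb.nc', 'may.nc'])

# Option 1: recompute scale_factor/add_offset so the int16 packing
# covers the full dynamic range of the merged data.
vmin = float(ds['t2m'].min())
vmax = float(ds['t2m'].max())
ds['t2m'].encoding['scale_factor'] = (vmax - vmin) / (2**16 - 2)
ds['t2m'].encoding['add_offset'] = (vmax + vmin) / 2

# Option 2: drop the packing-related keys so the data is written as floats.
# for key in ('scale_factor', 'add_offset', 'dtype', '_FillValue'):
#     ds['t2m'].encoding.pop(key, None)

ds.to_netcdf('merged.nc')
```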
It might be nice to have checks for this in the encoding step, to prevent writing erroneous values. So this is not really a bug, but it would be less impactful if encoding were dropped on operations (see the discussion in #6323).
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,979916914