html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/7039#issuecomment-1460185069,https://api.github.com/repos/pydata/xarray/issues/7039,1460185069,IC_kwDOAMm_X85XCKft,1197350,2023-03-08T13:51:06Z,2023-03-08T13:51:06Z,MEMBER,"Rather than using the scale_factor and add_offset approach, I would look into [xbitinfo](https://xbitinfo.readthedocs.io/) if you want to optimize your compression.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1373352524
https://github.com/pydata/xarray/issues/7039#issuecomment-1248302788,https://api.github.com/repos/pydata/xarray/issues/7039,1248302788,IC_kwDOAMm_X85KZ5bE,1197350,2022-09-15T16:02:17Z,2022-09-15T16:02:17Z,MEMBER,"> I am curious as to what exactly from the encoding introduces the noise (I still need to read through the documentation more thoroughly)?
The encoding says that your data should be encoded according to the following pseudocode formula:
```
encoded = int((original - add_offset) / scale_factor)
decoded = (scale_factor * float(encoded)) + add_offset
```
So the floating-point data are converted back and forth to a less precise type (integer) in order to save space. These operations cannot preserve exact floating-point accuracy; that's just how floating-point arithmetic works. If you skip the encoding, the floating-point bytes are written directly to disk, with no loss of precision.
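To make the loss concrete, here is a minimal NumPy sketch of that round trip (the scale/offset values and dtype below are made up for illustration, not taken from your file):
```python
import numpy as np

# hypothetical encoding parameters, chosen only to illustrate the round trip
scale_factor, add_offset = 0.001, 273.0
original = np.array([285.6712, 290.4301, 270.0009])

encoded = ((original - add_offset) / scale_factor).astype('int32')  # lossy cast
decoded = scale_factor * encoded.astype('float64') + add_offset

print(decoded - original)  # small but nonzero round-trip error
```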
This sort of encoding is a crude form of lossy compression that is unfortunately still in use, even though much better algorithms are available (and built into netCDF and Zarr). Differences on the order of 10^-14 should not affect any real-world calculations.
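If you want smaller files without that quantization error, one option (a sketch, assuming the netCDF4 backend and the t2m variable from your example) is to drop the scale/offset encoding and use the lossless deflate compression built into netCDF instead:
```python
# remove any scale_factor/add_offset carried over in the variable's encoding
ds.t2m.encoding.pop('scale_factor', None)
ds.t2m.encoding.pop('add_offset', None)

# lossless zlib compression via the netCDF4 backend
ds.to_netcdf('test_compressed.nc', encoding={'t2m': {'zlib': True, 'complevel': 4}})
```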
However, this seems like a much, much smaller difference than the problem you originally reported. This suggests that the MRE does not actually reproduce the bug after all. How was the plot above (https://github.com/pydata/xarray/issues/7039#issue-1373352524) generated? From your actual MRE code? Or from your earlier example with real data?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1373352524
https://github.com/pydata/xarray/issues/7039#issuecomment-1248241823,https://api.github.com/repos/pydata/xarray/issues/7039,1248241823,IC_kwDOAMm_X85KZqif,1197350,2022-09-15T15:12:34Z,2022-09-15T15:12:34Z,MEMBER,"I'm puzzled that I was not able to reproduce this error. I modified the end slightly as follows:
```python
# save dataset as netcdf
ds.to_netcdf(""test.nc"")
# load saved dataset
ds_test = xr.open_dataset('test.nc')
# verify that the two are equal within numerical precision
xr.testing.assert_allclose(ds, ds_test)
# plot
plt.plot(ds.t2m - ds_test.t2m)
```
In my case, the differences were just numerical noise (order 10^-14).
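If you want to quantify it yourself, a one-liner along these lines works, assuming the t2m variable from the MRE:
```python
# magnitude of the round-trip error; ~1e-14 in my run
print(float(abs(ds.t2m - ds_test.t2m).max()))
```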

I used the [binder environment](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb) for this.
I'm pretty stumped.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1373352524
https://github.com/pydata/xarray/issues/7039#issuecomment-1248098918,https://api.github.com/repos/pydata/xarray/issues/7039,1248098918,IC_kwDOAMm_X85KZHpm,1197350,2022-09-15T13:25:11Z,2022-09-15T13:25:11Z,MEMBER,Thanks so much for taking the time to write up this detailed bug report! 🙏 ,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1373352524