html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/7039#issuecomment-1460166694,https://api.github.com/repos/pydata/xarray/issues/7039,1460166694,IC_kwDOAMm_X85XCGAm,35741277,2023-03-08T13:37:10Z,2023-03-08T13:37:47Z,NONE,"Thanks for that note. I have a bunch of variables, like precipitation type, where that would be totally fine. Definitely looking to save on disk space, so may try to recompute the scale_factor and add_offset on other variables as suggested. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1373352524
https://github.com/pydata/xarray/issues/7039#issuecomment-1460116369,https://api.github.com/repos/pydata/xarray/issues/7039,1460116369,IC_kwDOAMm_X85XB5uR,35741277,2023-03-08T13:00:21Z,2023-03-08T13:00:21Z,NONE,Thanks for the alternative @veenstrajelmer. I'll give it a try on my end. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1373352524
https://github.com/pydata/xarray/issues/7039#issuecomment-1433334559,https://api.github.com/repos/pydata/xarray/issues/7039,1433334559,IC_kwDOAMm_X85VbvMf,35741277,2023-02-16T16:09:59Z,2023-02-16T16:09:59Z,NONE,Thanks for flagging the issue again. I've been using the same workaround of removing the dtype before writing to a zarr/netcdf. It's an extra step but has worked for me so far. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1373352524
https://github.com/pydata/xarray/issues/7039#issuecomment-1248429163,https://api.github.com/repos/pydata/xarray/issues/7039,1248429163,IC_kwDOAMm_X85KaYRr,35741277,2022-09-15T18:01:43Z,2022-09-19T22:20:15Z,NONE,"One last observation: if I remove dtype from the original encoding and apply the rest to the dataset before writing to netCDF, it works fine. If I leave dtype in, the issue comes back.
```python
# This works
encoding = {
    'original_shape': (720, 109, 245),
    'missing_value': -32767,
    '_FillValue': -32767,
    'scale_factor': 0.0009673806360857793,
    'add_offset': 282.08577424425226,
}
```
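The working case above amounts to dropping the dtype key from the stored encoding. A minimal sketch of that step in plain Python, where drop_encoding_keys is a hypothetical helper name (not part of xarray) and the dict is the encoding quoted in this comment:

```python
# Hypothetical helper, not an xarray function: return a copy of an
# encoding dict without the listed keys, leaving the input untouched.
def drop_encoding_keys(encoding, keys=('dtype',)):
    return {k: v for k, v in encoding.items() if k not in keys}


original_encoding = {
    'original_shape': (720, 109, 245),
    'missing_value': -32767,
    'dtype': 'int16',
    '_FillValue': -32767,
    'scale_factor': 0.0009673806360857793,
    'add_offset': 282.08577424425226,
}

cleaned = drop_encoding_keys(original_encoding)
print(sorted(cleaned))

# With a real dataset the cleaned dict would be assigned back before
# writing, e.g. ERA5_t2m.t2m.encoding = cleaned (not run here).
```

The copy is deliberate: the original dict stays available if the dtype is needed again later.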
```python
# This does not work
encoding = {
    'original_shape': (720, 109, 245),
    'missing_value': -32767,
    # The original encoding stores dtype('int16'). Typing that literally raises
    # an error for me, while the string form below works for changing data types.
    'dtype': 'int16',
    '_FillValue': -32767,
    'scale_factor': 0.0009673806360857793,
    'add_offset': 282.08577424425226,
}
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1373352524
https://github.com/pydata/xarray/issues/7039#issuecomment-1248293772,https://api.github.com/repos/pydata/xarray/issues/7039,1248293772,IC_kwDOAMm_X85KZ3OM,35741277,2022-09-15T15:54:17Z,2022-09-19T22:19:36Z,NONE,"That figure is basically what I am getting. Perhaps I designed the MRE poorly. Still, I am curious which part of the encoding introduces the noise (I need to read through the documentation more thoroughly). If I don't apply the original encoding, the difference plot is a straight line at 0.
That said, if you are willing to try a test with the actual ERA5 data, I've shared it via the Box link below. I went back and found that I need several files to get large differences. Oddly enough, with only 2 files the difference looks more like noise (+/- 0.0005). With a single file there is no difference. With a couple more files, the differences become quite large.
Data: https://epri.box.com/s/spw9plf77lrjj1xz2spmwd34b5ls9dea
```python
import xarray as xr
import matplotlib.pyplot as plt

# Open the original time series (4 files)
ERA5_t2m = xr.open_mfdataset(r'...\Test\T2m_*.nc')

# Save the combined time series as a single netCDF file
ERA5_t2m.to_netcdf(r'...\Test\Phx_Temperature_to_netcdf.nc')

# Reopen the file that was just written (the bad one)
ERA5_t2m_bad = xr.open_dataset(r'...\Test\Phx_Temperature_to_netcdf.nc')

# Lat and lon for Phx
lat, lon = 33.35, -112.86

# Plot the difference between the same point in the two datasets
original = ERA5_t2m.t2m.sel(latitude=lat, longitude=lon, method='nearest')
rewritten = ERA5_t2m_bad.t2m.sel(latitude=lat, longitude=lon, method='nearest')
plt.plot(original - rewritten)
plt.show()
```
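
For the question of which part of the encoding could introduce noise, here is a rough sketch of the scale_factor / add_offset packing arithmetic in plain Python. It is not xarray's actual code path. The two constants are the t2m encoding values quoted elsewhere in this thread, the test temperatures are made up, and pack, unpack and fits_int16 are hypothetical helper names.

```python
# Packing arithmetic only; an illustration, not xarray's implementation.
SCALE = 0.0009673806360857793   # scale_factor from the t2m encoding
OFFSET = 282.08577424425226     # add_offset from the t2m encoding


def pack(value, scale=SCALE, offset=OFFSET):
    # Quantize a float to the nearest integer step.
    return round((value - offset) / scale)


def unpack(packed, scale=SCALE, offset=OFFSET):
    # Recover an approximate float from the packed integer.
    return packed * scale + offset


def fits_int16(packed):
    # True if the packed integer is representable as int16.
    return -32768 <= packed <= 32767


# A made-up value inside the range this encoding covers: the round trip
# is off by at most about half a quantization step.
inside = 300.123456
roundtrip_error = abs(unpack(pack(inside)) - inside)
print(roundtrip_error <= SCALE / 2 + 1e-9)

# A made-up value outside that range: the packed integer no longer fits
# in int16, so it cannot be stored with this scale and offset.
outside = 320.0
print(fits_int16(pack(inside)), fits_int16(pack(outside)))
```

If each input file carries its own scale_factor and add_offset and only one pair is reused when writing the combined dataset (my assumption about what happens here, not something I have confirmed), values from the other files could fall outside the packable range, which would be consistent with the differences growing as more files are added.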

","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1373352524
https://github.com/pydata/xarray/issues/7039#issuecomment-1248366193,https://api.github.com/repos/pydata/xarray/issues/7039,1248366193,IC_kwDOAMm_X85KaI5x,35741277,2022-09-15T16:55:08Z,2022-09-15T16:55:08Z,NONE,"Thanks for the explanation. Makes a lot more sense now! All figures I've attached are from the real ERA5 data. The figure I attached in my most recent comment with the alternative MRE (with the ERA5 data) is what I get when I run that code with the data I provided in the test folder.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1373352524
https://github.com/pydata/xarray/issues/7029#issuecomment-1247020185,https://api.github.com/repos/pydata/xarray/issues/7029,1247020185,IC_kwDOAMm_X85KVASZ,35741277,2022-09-14T16:30:12Z,2022-09-14T16:30:12Z,NONE,"Ryan answered this in the discussions: https://github.com/pydata/xarray/discussions/7025#discussion-4385791
The solution, in this case, is to set ERA5_t2m.t2m.encoding = {} before saving to netCDF.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1371711272