id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
924676925,MDU6SXNzdWU5MjQ2NzY5MjU=,5490,"Nan/ changed values in output when only reading data, saving and reading again",56541075,closed,0,,,9,2021-06-18T08:35:09Z,2023-09-13T13:38:33Z,2023-09-13T13:38:32Z,NONE,,,,"**What happened**:

When combining monthly ERA5 data and saving it individually for single locations, different values/nan values appear when reading the single location file back in.

**What you expected to happen**:

Both should be the same. This works, e.g. when only one month is read.

**Minimal Complete Verifiable Example**:

```python
import xarray as xr  # using version 0.18.2
import numpy as np
import dask

# only as many threads as requested CPUs | only one to be requested, more threads don't seem to be used
dask.config.set(scheduler='synchronous')  # this is used only because of the Cluster I work on, but keeping it here in case it is relevant

model_level_file_name_format = ""{:d}_europe_{:d}_130_131_132_133_135.nc""
ml_files = [model_level_file_name_format.format(2012, 9),
            model_level_file_name_format.format(2012, 10)]
ds = xr.open_mfdataset(ml_files, decode_times=True)

# Select single location data
lons = ds['longitude'].values
lats = ds['latitude'].values
i_lat, i_lon = 27, 30
ds_loc = ds.sel(latitude=lats[i_lat], longitude=lons[i_lon])

# Save to file
ds_loc.to_netcdf('europe_i_lat_{i_lat}_i_lon_{i_lon}.nc'.format(i_lat=i_lat, i_lon=i_lon))

# Read in again
ds_loc_1 = xr.open_dataset('europe_i_lat_{i_lat}_i_lon_{i_lon}.nc'.format(i_lat=i_lat, i_lon=i_lon), decode_times=True)

print('Test all q values same: ', np.all(ds_loc.q.values == ds_loc_1.q.values))
```

**Anything else we need to know?**:

I tested this using these two months - many times saving the output works, or the values are slightly different (in the 6th digit).
Using a larger timespan (2010-2012), even NaN values appear. The issue is not clearly restricted to the q variable; I have not yet found the pattern.

I've included a more detailed assessment (output, data, code) at https://uni-bonn.sciebo.de/s/OLHhid8zJg65IFB

- only one month: no discrepancies
- two months: discrepancies (in the second month)
- 2010-2013: discrepancies and NaN values

I'm not sure where the issue comes from, but since the data is read in correctly at first, the reading side does not seem to be at fault, which would leave only the process of writing the netCDF output in xarray.

I've tested this for a few years, and for two months I always get the result that not all q values are the same.

I'm not sure where the problem might be, so I'm not sure where to start for a more minimal example. Hope this is ok.

Cheers,
Lavinia

**Environment**:

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.4 | packaged by conda-forge | (default, May 10 2021, 22:13:33) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.25.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.8.0

xarray: 0.18.2
pandas: 1.2.4
numpy: 1.20.3
scipy: 1.6.3
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.06.0
distributed: 2021.06.0
matplotlib: 3.4.2
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20210108
pip: 21.1.2
conda: None
pytest: None
IPython: None
sphinx: None","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5490/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue