html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3686#issuecomment-576422784,https://api.github.com/repos/pydata/xarray/issues/3686,576422784,MDEyOklzc3VlQ29tbWVudDU3NjQyMjc4NA==,15016780,2020-01-20T20:35:47Z,2020-01-20T20:35:47Z,NONE,Closing as using `mask_and_scale=False` produced precise results,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,548475127
https://github.com/pydata/xarray/issues/3686#issuecomment-573458081,https://api.github.com/repos/pydata/xarray/issues/3686,573458081,MDEyOklzc3VlQ29tbWVudDU3MzQ1ODA4MQ==,15016780,2020-01-12T21:17:11Z,2020-01-12T21:17:11Z,NONE,"Thanks @rabernat. I would like to use [assert_allclose](http://xarray.pydata.org/en/stable/generated/xarray.testing.assert_allclose.html) to test the output, but at first pass it seems that might be prohibitively slow for large datasets. Do you recommend sampling or other good testing strategies (e.g.
to assert the xarray datasets are equal to some precision)?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,548475127
https://github.com/pydata/xarray/issues/3686#issuecomment-573444233,https://api.github.com/repos/pydata/xarray/issues/3686,573444233,MDEyOklzc3VlQ29tbWVudDU3MzQ0NDIzMw==,15016780,2020-01-12T18:37:59Z,2020-01-12T18:37:59Z,NONE,"@dmedv Thanks for this, it all makes sense to me and I see the same results. However, I wasn't able to ""convert back"" using `scale_factor` and `add_offset`:

```
from netCDF4 import Dataset

d = Dataset(fileObjs[0])
v = d.variables['analysed_sst']

print(""Result with mask_and_scale=True"")
ds_unchunked = xr.open_dataset(fileObjs[0])
print(ds_unchunked.analysed_sst.sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)

print(""Result with mask_and_scale=False"")
ds_unchunked = xr.open_dataset(fileObjs[0], mask_and_scale=False)
scaled = ds_unchunked.analysed_sst * v.scale_factor + v.add_offset
scaled.sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values
```

That returns a different result than what I expect. I wonder if this is because the `_FillValue` is not masked out when converting back.

_However_, this led me to another seemingly related issue: https://github.com/pydata/xarray/issues/2304

Loss of precision seems to be the key here: coercing the `float32`s to `float64`s gets the same results from both chunked and unchunked versions, but still not the precise value obtained with `mask_and_scale=False`.

```
print(""results from unchunked dataset"")
ds_unchunked = xr.open_mfdataset(fileObjs, combine='by_coords')
ds_unchunked['analysed_sst'] = ds_unchunked['analysed_sst'].astype(np.float64)
print(ds_unchunked.analysed_sst[1,:,:].sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)

print(f""results from chunked dataset using {chunks}"")
ds_chunked = xr.open_mfdataset(fileObjs, chunks=chunks, combine='by_coords')
ds_chunked['analysed_sst'] = ds_chunked['analysed_sst'].astype(np.float64)
print(ds_chunked.analysed_sst[1,:,:].sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)

print(""results from chunked dataset using 'auto'"")
ds_chunked = xr.open_mfdataset(fileObjs, chunks={'time': 'auto', 'lat': 'auto', 'lon': 'auto'}, combine='by_coords')
ds_chunked['analysed_sst'] = ds_chunked['analysed_sst'].astype(np.float64)
print(ds_chunked.analysed_sst[1,:,:].sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)
```

returns:

```
results from unchunked dataset
290.1375818862207
results from chunked dataset using {'time': 1, 'lat': 1799, 'lon': 3600}
290.1375818862207
results from chunked dataset using 'auto'
290.1375818862207
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,548475127
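
The testing question in comment 573458081 (tolerance-based equality, optionally on a sample) can be sketched as follows. This is a minimal sketch on synthetic data — the thread's `fileObjs` GHRSST files aren't available here, so the two datasets below are stand-ins that differ only by rounding-scale noise:

```python
import numpy as np
import xarray as xr

# Two stand-in datasets that differ only by float rounding noise,
# playing the role of the chunked and unchunked SST results.
rng = np.random.default_rng(0)
data = rng.random((10, 180, 360))
a = xr.Dataset({"analysed_sst": (("time", "lat", "lon"), data)})
b = xr.Dataset({"analysed_sst": (("time", "lat", "lon"), data + 1e-9)})

# Full comparison "equal to some precision": raises AssertionError on failure.
xr.testing.assert_allclose(a, b, rtol=1e-6)

# Sampling strategy for large datasets: compare a random subset of
# latitude indices instead of every element.
idx = rng.integers(0, 180, size=50)
xr.testing.assert_allclose(
    a.analysed_sst.isel(lat=idx), b.analysed_sst.isel(lat=idx), rtol=1e-6
)
```

Sampling trades completeness for speed; for a one-off regression check, comparing a reduced statistic (e.g. `mean()` per time step) is another cheap option.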
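
The "convert back" discrepancy in the last comment is consistent with the fill value being scaled along with valid data: with `mask_and_scale=False`, `_FillValue` cells hold the raw sentinel integer, and multiplying them by `scale_factor` pollutes any subsequent mean. A hedged sketch on synthetic data (the attribute values below are made up, not read from the original files) of masking before scaling:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for a raw (mask_and_scale=False) packed integer
# variable with CF-style attributes, like the GHRSST analysed_sst field.
scale_factor, add_offset, fill = 0.001, 298.15, -32768
raw = np.array([[100, 200, fill], [300, fill, 400]], dtype=np.int16)
da = xr.DataArray(raw, dims=("lat", "lon"))

# Naive conversion scales the fill sentinel too, skewing the mean low.
naive = da * scale_factor + add_offset

# Mask the fill value first, then scale -- this mirrors what
# mask_and_scale=True does on decode (fill cells become NaN).
masked = da.where(da != fill)
scaled = masked * scale_factor + add_offset

print(float(naive.mean()))   # polluted by the two scaled fill cells
print(float(scaled.mean()))  # mean over the four valid cells only
```

Since `mean()` skips NaN by default, the masked version reproduces the decoded-data statistic, which matches the observation that the naive unscaled computation "returns a different result".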