
issue_comments: 573444233


html_url: https://github.com/pydata/xarray/issues/3686#issuecomment-573444233
issue_url: https://api.github.com/repos/pydata/xarray/issues/3686
id: 573444233
node_id: MDEyOklzc3VlQ29tbWVudDU3MzQ0NDIzMw==
user: 15016780
created_at: 2020-01-12T18:37:59Z
updated_at: 2020-01-12T18:37:59Z
author_association: NONE

@dmedv Thanks for this. It all makes sense to me and I see the same results. However, I wasn't able to "convert back" using `scale_factor` and `add_offset`:

```
from netCDF4 import Dataset
import xarray as xr

# fileObjs is defined outside this snippet
d = Dataset(fileObjs[0])
v = d.variables['analysed_sst']

print("Result with mask_and_scale=True")
ds_unchunked = xr.open_dataset(fileObjs[0])
print(ds_unchunked.analysed_sst.sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)

print("Result with mask_and_scale=False")
ds_unchunked = xr.open_dataset(fileObjs[0], mask_and_scale=False)
# Apply the packing attributes by hand; note that fill values are not masked here
scaled = ds_unchunked.analysed_sst * v.scale_factor + v.add_offset
scaled.sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values
```

^^ That returns a different result than I expect. I wonder if this is because the `_FillValue` masking is missing when I try to convert back.
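To illustrate the `_FillValue` suspicion, here is a minimal NumPy-only sketch. The packed values, `scale_factor`, `add_offset` and fill value are made up for illustration and are not taken from the real file, and `decode_packed` is a hypothetical helper, not an xarray or netCDF4 function. It only shows that scaling without masking lets the fill value leak into the mean.

```python
import numpy as np

def decode_packed(raw, scale_factor, add_offset, fill_value):
    """Hypothetical helper: mask fill values, then apply raw * scale_factor + add_offset."""
    raw = np.asarray(raw)
    # Promote to float64 so the arithmetic is not done in the packed integer dtype.
    values = raw.astype(np.float64)
    # Replace fill values with NaN before scaling so they cannot leak into the mean.
    values[raw == fill_value] = np.nan
    return values * scale_factor + add_offset

# Made-up packed data: int16 with -32768 standing in for the fill value.
raw = np.array([1000, 2000, -32768, 3000], dtype=np.int16)
scale_factor, add_offset, fill_value = 0.001, 298.15, -32768

# Naive conversion: the fill value is scaled as if it were real data.
naive = raw * scale_factor + add_offset
# Masked conversion: the fill value becomes NaN and is skipped by nanmean.
masked = decode_packed(raw, scale_factor, add_offset, fill_value)

naive_mean = float(naive.mean())
masked_mean = float(np.nanmean(masked))
print("naive mean: ", naive_mean)
print("masked mean:", masked_mean)
```

If this is what is happening with the real data, masking before scaling (for example with `DataArray.where` on the raw values, compared against the variable's `_FillValue` attribute) should bring the manual conversion closer to the `mask_and_scale=True` result. That is an assumption about the file, not something verified here.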

However this led me to another seemingly related issue: https://github.com/pydata/xarray/issues/2304

Loss of precision seems to be the key here, so coercing the float32s to float64s appears to get the same results from both chunked and unchunked versions - but still not

```
import numpy as np
import xarray as xr

# fileObjs and chunks are defined outside this snippet

print("results from unchunked dataset")
ds_unchunked = xr.open_mfdataset(fileObjs, combine='by_coords')
ds_unchunked['analysed_sst'] = ds_unchunked['analysed_sst'].astype(np.float64)
print(ds_unchunked.analysed_sst[1,:,:].sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)

print(f"results from chunked dataset using {chunks}")
ds_chunked = xr.open_mfdataset(fileObjs, chunks=chunks, combine='by_coords')
ds_chunked['analysed_sst'] = ds_chunked['analysed_sst'].astype(np.float64)
print(ds_chunked.analysed_sst[1,:,:].sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)

print("results from chunked dataset using 'auto'")
ds_chunked = xr.open_mfdataset(fileObjs, chunks={'time': 'auto', 'lat': 'auto', 'lon': 'auto'}, combine='by_coords')
ds_chunked['analysed_sst'] = ds_chunked['analysed_sst'].astype(np.float64)
print(ds_chunked.analysed_sst[1,:,:].sel(lat=slice(20,50),lon=slice(-170,-110)).mean().values)
```

returns:

results from unchunked dataset
290.1375818862207
results from chunked dataset using {'time': 1, 'lat': 1799, 'lon': 3600}
290.1375818862207
results from chunked dataset using 'auto'
290.1375818862207
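As a rough illustration of why casting to float64 can make chunked and unchunked means agree, the sketch below uses synthetic data (random values around 290, not the real SST field) and a hypothetical `chunked_mean` helper that combines per-chunk partial sums. It is not how dask actually schedules its reductions; it only shows that a float32 accumulator makes the result sensitive to how the additions are grouped, while float64 makes that sensitivity negligible.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for a temperature field around 290 K (not the real dataset).
data32 = (290.0 + rng.standard_normal(1_000_000)).astype(np.float32)

def chunked_mean(arr, n_chunks):
    """Hypothetical helper: mean from per-chunk partial sums, kept in arr's dtype."""
    parts = np.array_split(arr, n_chunks)
    # Each partial sum, and the sum of partial sums, stays in the input dtype.
    partial = np.array([part.sum() for part in parts], dtype=arr.dtype)
    return partial.sum() / arr.dtype.type(arr.size)

# float32: rounding depends on how additions are grouped, so these two
# values may differ in their last digits.
whole32 = data32.mean()
chunk32 = chunked_mean(data32, 7)

# float64: same values with a wider accumulator; grouping effects become negligible.
data64 = data32.astype(np.float64)
whole64 = data64.mean()
chunk64 = chunked_mean(data64, 7)

print("float32 whole vs chunked:", whole32, chunk32)
print("float64 whole vs chunked:", whole64, chunk64)
```

The float64 pair should agree to many more digits than the float32 pair, which is consistent with the identical values printed above after `.astype(np.float64)`.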

reactions: none (total_count 0)
issue: 548475127