html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3686#issuecomment-573455048,https://api.github.com/repos/pydata/xarray/issues/3686,573455048,MDEyOklzc3VlQ29tbWVudDU3MzQ1NTA0OA==,1197350,2020-01-12T20:41:53Z,2020-01-12T20:41:53Z,MEMBER,"Thanks for the useful issue @abarciauskas-bgse and the valuable test @dmedv.

I believe this is fundamentally a Dask issue. In general, Dask's algorithms do not guarantee numerically identical results for different chunk sizes. Roundoff errors accrue slightly differently depending on how the array is split up. These errors are usually acceptable to users. For example, for 290.13754 vs. 290.13757, the error is in the 8th significant digit, about 1 part in 10,000,000. Since there are only 65,536 16-bit integers (the original data type in the netCDF file), this seems like more than adequate precision to me.

Calling `.mean()` on a dask array is not the same as a checksum. As with all numerical calculations, equality should be verified with a precision appropriate to the data type and algorithm, e.g. using [`assert_allclose`](http://xarray.pydata.org/en/stable/generated/xarray.testing.assert_allclose.html).

There appears to be a second issue here related to fill values, but I haven't quite grasped whether we think there is a bug.

> I think it would be nice if it were possible to control the mask application in `open_dataset` separately from scale/offset.

There may be a reason why these operations are coupled. I would have to look more closely at the code to know for sure.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,548475127
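
For reference, a minimal sketch of the tolerance-based comparison suggested in the comment, using `xarray.testing.assert_allclose`; the file name, variable name, chunk sizes, and tolerance are placeholders for illustration, not taken from the original report:

```python
import xarray as xr

# Hypothetical file and variable names, chosen only for illustration.
ds_full = xr.open_dataset("sst.nc")                            # plain NumPy-backed load
ds_chunked = xr.open_dataset("sst.nc", chunks={"time": 10})    # dask-backed, chunked load

mean_full = ds_full["analysed_sst"].mean()
mean_chunked = ds_chunked["analysed_sst"].mean().compute()

# Exact equality is fragile: the chunked reduction accumulates roundoff
# error in a different order, so the two means can differ in the last few
# significant digits even though both are "correct".
#
# Compare with a tolerance appropriate to the data and algorithm instead:
xr.testing.assert_allclose(mean_full, mean_chunked, rtol=1e-5)
```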