
issue_comments: 573455048


html_url: https://github.com/pydata/xarray/issues/3686#issuecomment-573455048
issue_url: https://api.github.com/repos/pydata/xarray/issues/3686
id: 573455048
node_id: MDEyOklzc3VlQ29tbWVudDU3MzQ1NTA0OA==
user: 1197350
created_at: 2020-01-12T20:41:53Z
updated_at: 2020-01-12T20:41:53Z
author_association: MEMBER

Thanks for the useful issue @abarciauskas-bgse and valuable test @dmedv.

I believe this is fundamentally a Dask issue. In general, Dask's algorithms do not guarantee numerically identical results for different chunk sizes: roundoff errors accrue slightly differently depending on how the array is split up. These errors are usually acceptable to users. In the example above (290.13754 vs. 290.13757), the error is in the 8th significant digit, i.e. 1 part in 100,000,000. Since there are only 65,536 distinct 16-bit integers (the original data type in the netCDF file), this seems more than adequate precision to me.
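To illustrate the chunk-size effect without Dask at all, here is a minimal sketch (plain NumPy, made-up random data) of how summing the same float32 values in different groupings can shift the low-order bits, which is exactly what different Dask chunkings do:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(1_000_000, dtype=np.float32)

# Summing everything at once vs. in two uneven "chunks" accumulates
# float32 roundoff differently; the results may differ in the last bits.
total_one_chunk = float(x.sum())
total_two_chunks = float(x[:300_000].sum() + x[300_000:].sum())

# Bitwise equality is not guaranteed, but the two results agree to
# high relative precision:
print(total_one_chunk, total_two_chunks)
assert np.isclose(total_one_chunk, total_two_chunks, rtol=1e-5)
```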

Calling .mean() on a dask array is not the same as a checksum. As with all numerical calculations, equality should be verified with a precision appropriate to the data type and algorithm, e.g. using assert_allclose.
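As a concrete sketch, using NumPy's `numpy.testing.assert_allclose` on the two example values above (treated here as hypothetical results):

```python
import numpy as np
from numpy.testing import assert_allclose

# The two means from the example above
a = 290.13754
b = 290.13757

assert a != b                     # bitwise equality fails...
assert_allclose(a, b, rtol=1e-6)  # ...but they agree within a 1e-6 relative tolerance
```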

There appears to be a second issue here related to fill values, but I haven't quite grasped whether we think there is a bug.

I think it would be nice if it were possible to control the mask application in open_dataset separately from scale/offset.

There may be a reason why these operations are coupled. Would have to look more closely at the code to know for sure.
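One possible reason the coupling is inherent: under CF conventions, the `_FillValue` comparison has to happen on the raw packed integers *before* `scale_factor`/`add_offset` are applied, so the two decoding steps naturally travel together (this is what xarray's `mask_and_scale` option toggles as a unit). A minimal NumPy sketch with made-up values:

```python
import numpy as np

# Hypothetical CF-style packed 16-bit variable and attributes
raw = np.array([100, 200, -32767], dtype=np.int16)
fill_value = np.int16(-32767)   # _FillValue, defined in the packed dtype
scale_factor = 0.01
add_offset = 280.0

# Masking is applied to the raw integers first; scale/offset only make
# sense on the unmasked values, which is why the steps are coupled.
masked = np.where(raw == fill_value, np.nan, raw.astype(np.float64))
decoded = masked * scale_factor + add_offset
print(decoded)  # 281.0, 282.0, and NaN for the fill value
```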

issue: 548475127