issues: 351000813


  • id: 351000813
  • node_id: MDU6SXNzdWUzNTEwMDA4MTM=
  • number: 2370
  • title: Inconsistent results when calculating sums on float32 arrays w/ bottleneck installed
  • user: 5179430
  • state: closed
  • locked: 0
  • comments: 6
  • created_at: 2018-08-15T23:18:41Z
  • updated_at: 2020-08-17T00:07:12Z
  • closed_at: 2020-08-17T00:07:12Z
  • author_association: CONTRIBUTOR

Code Sample, a copy-pastable example if possible

Data file used is here: test.nc.zip. Output from each statement is commented out.

```python
import xarray as xr

ds = xr.open_dataset('test.nc')

ds.cold_rad_cnts.min()   # 13038.
ds.cold_rad_cnts.max()   # 13143.
ds.cold_rad_cnts.mean()  # 12640.583984
ds.cold_rad_cnts.std()   # 455.035156
ds.cold_rad_cnts.sum()   # 4.472997e+10
```

Problem description

As you can see above, the mean falls outside the range of the data, and the standard deviation is nearly two orders of magnitude higher than it should be. This is because a significant loss of precision is occurring when using bottleneck's nansum() on data with a float32 dtype. I demonstrated this effect here: https://github.com/kwgoodman/bottleneck/issues/193.
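The precision loss can be reproduced without the data file. A minimal sketch, assuming (per the linked bottleneck issue) that bottleneck accumulates sequentially in the array's own dtype, while numpy's reductions use pairwise summation; `np.cumsum` stands in here for that sequential float32 accumulator, not for bottleneck's actual code:

```python
import numpy as np

# 20 million float32 ones: the exact sum is 20_000_000
a = np.ones(20_000_000, dtype=np.float32)

# cumsum accumulates sequentially in float32, mimicking a naive nansum:
# once the running total reaches 2**24 (16777216), adding 1.0 no longer
# changes it, so the sum stalls far below the true value
naive = a.cumsum()[-1]

# numpy's sum/nansum use pairwise summation and stay exact here
pairwise = a.sum()

print(naive)     # 16777216.0
print(pairwise)  # 20000000.0
```

The stall at 2\*\*24 is the extreme case; with values around 13100, as in this dataset, the error shows up as the kind of drift seen in the sums above.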

Naturally, this means that converting the data to float64 (or any int dtype) gives the correct result; so does using numpy's built-in functions directly, or uninstalling bottleneck. An example is shown below.

Expected Output

```python
In [8]: import numpy as np

In [9]: np.nansum(ds.cold_rad_cnts)
Out[9]: 46357123000.0

In [10]: np.nanmean(ds.cold_rad_cnts)
Out[10]: 13100.413

In [11]: np.nanstd(ds.cold_rad_cnts)
Out[11]: 8.158843
```
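Both workarounds can also be checked on synthetic data when test.nc is not at hand; a sketch using made-up float32 values:

```python
import numpy as np

# Synthetic stand-in for the float32 variable in test.nc
a = np.ones(20_000_000, dtype=np.float32)

# Workaround 1: upcast before reducing, so the accumulator is float64
upcast = a.astype(np.float64).sum()

# Workaround 2: numpy's nansum uses pairwise summation, which stays
# accurate even with a float32 accumulator
pairwise = np.nansum(a)

print(upcast)    # 20000000.0
print(pairwise)  # 20000000.0
```

In xarray the upcast is a one-liner, e.g. `ds.cold_rad_cnts.astype('float64').sum()`.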

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.8
pandas: 0.23.4
numpy: 1.15.0
scipy: 1.1.0
netCDF4: 1.4.1
h5netcdf: 0.6.1
h5py: 2.8.0
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.18.2
distributed: 1.22.1
matplotlib: None
cartopy: None
seaborn: None
setuptools: 40.0.0
pip: 10.0.1
conda: None
pytest: None
IPython: 6.5.0
sphinx: None
```

Unfortunately this will probably not be fixed downstream anytime soon, so I think it would be nice if xarray provided some sort of automatic workaround rather than my having to remember to manually convert float32 data. Making float64 the default (as discussed in #2304) would be nice, but at a minimum it might be good to emit a warning whenever bottleneck's nansum() is used on float32 arrays.
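A warn-and-upcast guard along those lines could look like the following sketch; `guarded_nansum` is a hypothetical helper, not an existing xarray or bottleneck API:

```python
import warnings

import numpy as np


def guarded_nansum(arr):
    """Warn and upcast when a float32 array is about to be summed.

    Hypothetical sketch of the safeguard proposed above; not part of
    xarray's API.
    """
    arr = np.asarray(arr)
    if arr.dtype == np.float32:
        warnings.warn(
            "nansum on float32 data may lose precision with bottleneck "
            "installed; upcasting to float64",
            UserWarning,
        )
        arr = arr.astype(np.float64)
    return np.nansum(arr)


# Triggers the warning and returns the exact sum in float64
total = guarded_nansum(np.ones(1_000_000, dtype=np.float32))
print(total)  # 1000000.0
```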

  • state_reason: completed · repo: 13221727 · type: issue
