html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1346#issuecomment-464338041,https://api.github.com/repos/pydata/xarray/issues/1346,464338041,MDEyOklzc3VlQ29tbWVudDQ2NDMzODA0MQ==,691772,2019-02-16T11:20:20Z,2019-02-16T11:20:20Z,CONTRIBUTOR,"Oh yes, of course! I underestimated the low precision of float32 values above 2**24. Thanks for the hint.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,218459353
https://github.com/pydata/xarray/issues/1346#issuecomment-463324373,https://api.github.com/repos/pydata/xarray/issues/1346,463324373,MDEyOklzc3VlQ29tbWVudDQ2MzMyNDM3Mw==,691772,2019-02-13T19:02:52Z,2019-02-16T10:53:51Z,CONTRIBUTOR,"I think (!) xarray is not affected any longer, but pandas is. Bisecting the Git history leads to commit 0b9ab2d1, which means that xarray >= v0.10.9 should not be affected. Uninstalling bottleneck is also a valid workaround. Bottleneck's documentation explicitly mentions that [no error is raised in case of an overflow](https://kwgoodman.github.io/bottleneck-doc/reference.html?highlight=overflow#bottleneck.nanmean). But it seems to be very evil behavior, so it might be worth reporting upstream. What do you think? (I think kwgoodman/bottleneck#164 is something different, isn't it?)

**Edit:** This is not an overflow. It's a numerical error caused by not applying [pairwise summation](https://en.wikipedia.org/wiki/Pairwise_summation).
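
To illustrate the edit above, here is a minimal sketch of why a plain running sum in float32 stalls at 2**24 while a pairwise sum does not. It assumes the mean is built from a single float32 accumulator, which is an assumption about the mechanism and not taken from bottleneck's source. `naive_sum` and `pairwise_sum` are hypothetical helper names used only for this illustration.

```python
import numpy as np

def naive_sum(values, start=np.float32(0.0)):
    # Sequential accumulation in a single float32 accumulator
    # (assumed mechanism, not bottleneck's actual code).
    total = np.float32(start)
    for v in values:
        total = np.float32(total + np.float32(v))
    return total

def pairwise_sum(values):
    # Recursive pairwise summation: sum each half, then add the halves.
    n = len(values)
    if n == 0:
        return np.float32(0.0)
    if n == 1:
        return np.float32(values[0])
    mid = n // 2
    return np.float32(pairwise_sum(values[:mid]) + pairwise_sum(values[mid:]))

limit = np.float32(2**24)

# Above 2**24 the spacing between float32 values is 2, so adding 1.0 is lost.
saturated = (limit + np.float32(1.0)) == limit

ones = np.ones(16, dtype=np.float32)

# Adding the ones one at a time to an accumulator already at 2**24 changes nothing.
stuck = naive_sum(ones, start=limit)

# Summing the ones first (pairwise) and then adding the result keeps them.
kept = np.float32(limit + pairwise_sum(ones))

print(saturated, stuck, kept)
```

Under that assumption, a sequential sum over 2**25 ones would stall at 2**24, and 2**24 / 2**25 is exactly the 0.5 discussed in this issue.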
A couple of minimal examples:

```python
>>> import numpy as np
>>> import pandas as pd
>>> import xarray as xr
>>> import bottleneck as bn
>>> bn.nanmean(np.ones(2**25, dtype=np.float32))
0.5
>>> pd.Series(np.ones(2**25, dtype=np.float32)).mean()
0.5
>>> xr.DataArray(np.ones(2**25, dtype=np.float32)).mean()  # not affected for this version
array(1., dtype=float32)
```

Done with the following versions:

```bash
$ pip3 freeze
Bottleneck==1.2.1
numpy==1.16.1
pandas==0.24.1
xarray==0.11.3
...
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,218459353
https://github.com/pydata/xarray/issues/1346#issuecomment-464016154,https://api.github.com/repos/pydata/xarray/issues/1346,464016154,MDEyOklzc3VlQ29tbWVudDQ2NDAxNjE1NA==,691772,2019-02-15T11:41:36Z,2019-02-15T11:41:36Z,CONTRIBUTOR,"Oh hm, I think I didn't really understand what happens in `bottleneck.nanmean()`. I understand that integers can overflow and that float32 values have varying absolute precision. The max float32 value, 3.4E+38, is not hit here. So how can the mean of a list of ones be 0.5?

Isn't this what bottleneck is doing? Summing up a bunch of float32 values and then dividing by the length?

```
>>> d = np.ones(2**25, dtype=np.float32)
>>> d.sum()/np.float32(len(d))
1.0
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,218459353