html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1346#issuecomment-464115604,https://api.github.com/repos/pydata/xarray/issues/1346,464115604,MDEyOklzc3VlQ29tbWVudDQ2NDExNTYwNA==,1217238,2019-02-15T16:39:08Z,2019-02-15T16:39:08Z,MEMBER,"The difference is that Bottleneck does the sum in the naive way, whereas NumPy uses the more numerically stable [pairwise summation](https://en.wikipedia.org/wiki/Pairwise_summation).","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,218459353
https://github.com/pydata/xarray/issues/1346#issuecomment-456173428,https://api.github.com/repos/pydata/xarray/issues/1346,456173428,MDEyOklzc3VlQ29tbWVudDQ1NjE3MzQyOA==,1217238,2019-01-21T19:09:43Z,2019-01-21T19:09:43Z,MEMBER,"> Would it be worth adding a warning (until the right solution is found) if someone is doing `.mean()` on a `DataArray` which is `float32`?
I would rather pick option (1) above, that is, ""Stop using bottleneck on float32 arrays""","{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,218459353
https://github.com/pydata/xarray/issues/1346#issuecomment-290851733,https://api.github.com/repos/pydata/xarray/issues/1346,290851733,MDEyOklzc3VlQ29tbWVudDI5MDg1MTczMw==,1217238,2017-03-31T22:55:18Z,2017-03-31T22:55:18Z,MEMBER,"@matteodefelice you didn't decide on float32, but your data is stored that way. It's really hard to make choices about numerical precision for computations automatically: if we converted automatically to float64, somebody else would be complaining about unexpected memory usage :).
Looking at our options, we could:
1. Stop using bottleneck on float32 arrays, or provide a flag or option to disable using bottleneck. This is not ideal, because bottleneck is much faster.
2. Automatically convert float32 arrays to float64 before doing aggregations. This is not ideal, because it could significantly increase memory requirements.
3. Add a `dtype` option for aggregations (like NumPy) and consider defaulting to `dtype=np.float64` when doing aggregations on float32 arrays. I would generally be happy with this, but bottleneck doesn't currently provide the option.
4. Write a higher precision algorithm for bottleneck's `mean`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,218459353
https://github.com/pydata/xarray/issues/1346#issuecomment-290760342,https://api.github.com/repos/pydata/xarray/issues/1346,290760342,MDEyOklzc3VlQ29tbWVudDI5MDc2MDM0Mg==,1217238,2017-03-31T16:24:04Z,2017-03-31T16:24:04Z,MEMBER,"Yes, this is probably related to the fact that `.mean()` in xarray uses bottleneck if available, and bottleneck has a slightly different mean implementation, quite possibly with a less numerically stable algorithm.
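To see the difference between the two kinds of algorithm, here is a quick NumPy-only sketch (not bottleneck's actual code): `cumsum` accumulates strictly left to right, mimicking a naive single-accumulator loop, while NumPy's own reductions use pairwise summation.

```python
import numpy as np

n = 10_000_000
x = np.full(n, 0.1, dtype=np.float32)

# cumsum accumulates strictly left to right in float32,
# like a naive single-accumulator loop would:
naive = x.cumsum()[-1] / n

# NumPy reductions use pairwise summation internally,
# so rounding error grows far more slowly:
pairwise = x.mean()

print(naive, pairwise)  # the naive result drifts well away from 0.1
```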
The fact that the dtype is float32 is a sign that this is probably a numerical precision issue. Try casting with `.astype(np.float64)` and see if the problem goes away.
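For example (an illustrative NumPy-only sketch, with `cumsum` again standing in for a naive accumulation loop):

```python
import numpy as np

n = 10_000_000
x = np.full(n, 0.1, dtype=np.float32)

# Naive left-to-right accumulation in float32 drifts:
mean32 = x.cumsum()[-1] / n

# Upcasting first gives the accumulator ~16 significant digits,
# so even naive accumulation stays accurate:
mean64 = x.astype(np.float64).cumsum()[-1] / n

print(mean32, mean64)
```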
If you really care about float32 performance, the other way to improve conditioning is to subtract and then re-add a number close to the mean, e.g., `(ds.var167 - 270).mean() + 270`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,218459353