html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/5358#issuecomment-851300846,https://api.github.com/repos/pydata/xarray/issues/5358,851300846,MDEyOklzc3VlQ29tbWVudDg1MTMwMDg0Ng==,30388627,2021-05-31T08:12:22Z,2021-05-31T08:12:22Z,NONE,@dcherian Has this method been improved in [dask_groupby](https://github.com/dcherian/dask_groupby)? Could you provide a simple example we can follow? I got lost in the dask_groupby documentation ...,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,897689314
https://github.com/pydata/xarray/issues/5358#issuecomment-845924589,https://api.github.com/repos/pydata/xarray/issues/5358,845924589,MDEyOklzc3VlQ29tbWVudDg0NTkyNDU4OQ==,30388627,2021-05-21T12:44:39Z,2021-05-21T12:44:39Z,NONE,"@dcherian Thanks! That's simple ;) However, `groupby_bins` treats bin edges differently from `binned_statistic`.
`binned_statistic`:
> All but the last (righthand-most) bin is half-open. In other words, if bins is [1, 2, 3, 4], then the first bin is [1, 2) (including 1, but excluding 2) and the second [2, 3). The last bin, however, is [3, 4], which includes 4.
`groupby_bins`:
> right (bool, default: True) – Indicates whether the bins include the rightmost edge or not. If right == True (the default), then the bins [1,2,3,4] indicate (1,2], (2,3], (3,4].
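To see the two edge rules side by side, here is a minimal NumPy-only sketch. It is an illustration under assumptions: `np.histogram` stands in for the `binned_statistic` rule (its last bin is also closed), and `np.digitize(..., right=True)` stands in for the `right=True` rule quoted for `groupby_bins`. The names `edges`, `points`, `counts` and `counts_right` are made up for this example.

```python
import numpy as np

edges = np.array([1, 2, 3, 4])
points = np.array([1, 2, 3, 4])

# Stand-in for the binned_statistic rule: [1, 2), [2, 3), [3, 4]
counts, _ = np.histogram(points, bins=edges)

# Stand-in for the right=True rule: (1, 2], (2, 3], (3, 4]
# digitize returns i when edges[i-1] < x <= edges[i]; 0 means left of the first bin
idx = np.digitize(points, edges, right=True)
counts_right = np.bincount(idx, minlength=len(edges))[1:]

print('half-open, last bin closed:', counts)
print('right-closed:', counts_right)
print('points outside every right-closed bin:', points[idx == 0])
```

With these edges, the value 1 lands in the first bin under the first rule but in no bin under the second, while 4 is kept by both.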
So, let's check this shorter example:
```
from scipy.stats import binned_statistic
import numpy as np
import xarray as xr
# --- scipy method ---
x = np.arange(10)
values = x*5
statistics, _, _ = binned_statistic(x, values, statistic='min', bins=10, range=(0, 10))
# --- xarray method ---
x = xr.DataArray(x)
values = xr.DataArray(values)
bin_res = values.groupby_bins('dim_0', bins=np.linspace(0, 10, 10), right=False, include_lowest=True).min()
print('scipy: \n', statistics)
print('xarray: \n', bin_res)
```
Output:
```
scipy:
[ 0. 5. 10. 15. 20. 25. 30. 35. 40. 45.]
xarray:
array([ 0, 10, 15, 20, 25, 30, 35, 40, 45])
Coordinates:
* dim_0_bins (dim_0_bins) object [0.0, 1.111) ... [8.889, 10.0)
```
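The scipy result has one more value: `bins=10` asks for 10 bins, while `np.linspace(0, 10, 10)` returns 10 edges, which define only 9 bins.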
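The count mismatch can be checked with NumPy alone. This is a small sketch, assuming `np.histogram_bin_edges` as a stand-in for how an integer `bins=10` with `range=(0, 10)` is expanded into edges; the variable names are only for illustration.

```python
import numpy as np

x = np.arange(10)

# An integer bin count of 10 over (0, 10) expands to 11 edges
int_bins_edges = np.histogram_bin_edges(x, bins=10, range=(0, 10))

# linspace's third argument is the number of edges, not the number of bins
nine_bins = np.linspace(0, 10, 10)
ten_bins = np.linspace(0, 10, 11)

print('bins from bins=10:', len(int_bins_edges) - 1)
print('bins from linspace(0, 10, 10):', len(nine_bins) - 1)
print('bins from linspace(0, 10, 11):', len(ten_bins) - 1)
print('edges match:', np.allclose(int_bins_edges, ten_bins))
```

So passing explicit edges needs one more point than the number of bins wanted.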
## Summary
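These two calls produce the same results for this data. No value falls exactly on the last edge (10), which `binned_statistic` would include in its last bin but `right=False` excludes: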
```
binned_statistic(x, values, statistic='min', bins=10, range=(0, 10))
values.groupby_bins('dim_0', bins=np.linspace(0, 10, 11), right=False, include_lowest=True).min()
```
Output:
```
scipy:
[ 0. 5. 10. 15. 20. 25. 30. 35. 40. 45.]
xarray:
array([ 0, 5, 10, 15, 20, 25, 30, 35, 40, 45])
Coordinates:
* dim_0_bins (dim_0_bins) object [0.0, 1.0) [1.0, 2.0) ... [9.0, 10.0)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,897689314