html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6610#issuecomment-1523666774,https://api.github.com/repos/pydata/xarray/issues/6610,1523666774,IC_kwDOAMm_X85a0U9W,2448579,2023-04-26T15:59:06Z,2023-04-26T16:06:17Z,MEMBER,"We voted to move forward with this API:
```python
data.groupby(
    {
        ""x0"": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
        ""y"": xr.UniqueGrouper(labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
        ""time"": xr.TimeResampleGrouper(freq=""MS""),
    },
)
```
We won't break backwards compatibility for `da.groupby(other_data_array)`, but for any complicated use cases with `Grouper` the user must add the `by` variable to the xarray object and refer to it by name in the dictionary, as above.","{""total_count"": 4, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 1, ""eyes"": 1}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1498463195,https://api.github.com/repos/pydata/xarray/issues/6610,1498463195,IC_kwDOAMm_X85ZULvb,2448579,2023-04-06T04:07:05Z,2023-04-26T15:52:21Z,MEMBER,"Here's a question.
In #7561, I implement `Grouper` objects that carry no information about the variable we're grouping by. So the future API would be:
``` python
data.groupby(
    {
        ""x0"": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
        ""y"": xr.UniqueGrouper(labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
        ""time"": xr.TimeResampleGrouper(freq=""MS""),
    },
)
```
Does this look OK or do we want to support passing the DataArray or variable name as a `by` kwarg:
```python
xr.BinGrouper(by=""x0"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""]))
```
This syntax would support passing a `DataArray` as `by`, for example `xr.UniqueGrouper(by=data.y)`. Is that an important use case to support? In #7561, I create new `ResolvedGrouper` objects that always contain `by` as a DataArray, so it's really a question of whether to expose that to the user.
PS: [Pandas](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Grouper.html) has a `key` kwarg for a column name, so following that would mean:
``` python
data.groupby(
    [
        xr.BinGrouper(""x0"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
        xr.UniqueGrouper(""y"", labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
        xr.TimeResampleGrouper(""time"", freq=""MS""),
    ],
)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1329680642,https://api.github.com/repos/pydata/xarray/issues/6610,1329680642,IC_kwDOAMm_X85PQVEC,2448579,2022-11-28T19:58:29Z,2022-11-28T23:23:42Z,MEMBER,"In https://github.com/xarray-contrib/flox/issues/191 @keewis proposes a much nicer API for multiple variables:
``` python
data.groupby(
    xr.Grouper(by=""x"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
    xr.Grouper(by=data.y, labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
    xr.Grouper(by=""time"", freq=""MS""),  # resample
)
```
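For comparison, here is a rough pandas-only sketch of how a grouper that names its own variable behaves there. The DataFrame and its column names (`time`, `y`, `value`) are made up for illustration and are not part of the proposal:
```python
import pandas as pd

# Hypothetical toy data, only to illustrate the pandas behaviour.
df = pd.DataFrame(
    {
        'time': pd.to_datetime(['2023-01-05', '2023-01-20', '2023-02-03']),
        'y': ['a', 'b', 'a'],
        'value': [1.0, 2.0, 3.0],
    }
)

# The grouper itself names the column it applies to.
monthly = df.groupby(pd.Grouper(key='time', freq='MS'))['value'].sum()

# Several groupers (or plain column names) can be combined in a list.
by_both = df.groupby([pd.Grouper(key='time', freq='MS'), 'y'])['value'].sum()
```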
Note that [`pd.Grouper`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Grouper.html) uses `key` instead of `by`, so that's a possibility too.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701