html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6610#issuecomment-1523666774,https://api.github.com/repos/pydata/xarray/issues/6610,1523666774,IC_kwDOAMm_X85a0U9W,2448579,2023-04-26T15:59:06Z,2023-04-26T16:06:17Z,MEMBER,"We voted to move forward with this API:

```python
data.groupby({
    ""x0"": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
    ""y"": xr.UniqueGrouper(labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
    ""time"": xr.TimeResampleGrouper(freq=""MS"")
    },
)
```

We won't break backwards-compatibility for `da.groupby(other_data_array)` but for any complicated use-cases with `Grouper` the user must add the `by` variable to the xarray object, and refer to it by name in the dictionary as above.","{""total_count"": 4, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 1, ""eyes"": 1}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1498463195,https://api.github.com/repos/pydata/xarray/issues/6610,1498463195,IC_kwDOAMm_X85ZULvb,2448579,2023-04-06T04:07:05Z,2023-04-26T15:52:21Z,MEMBER,"Here's a question. In #7561, I implement `Grouper` objects that don't have any information about the variable we're grouping by. So the future API would be:

```python
data.groupby({
    ""x0"": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
    ""y"": xr.UniqueGrouper(labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
    ""time"": xr.TimeResampleGrouper(freq=""MS"")
    },
)
```

Does this look OK or do we want to support passing the DataArray or variable name as a `by` kwarg:

```python
xr.BinGrouper(by=""x0"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""]))
```

This syntax would support passing `DataArray` in `by` so `xr.UniqueGrouper(by=data.y)` for example. Is that an important use case to support?
In #7561, I create new `ResolvedGrouper` objects that always contain `by` as a DataArray, so it's really a question of exposing that to the user.

PS: [Pandas](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Grouper.html) has a `key` kwarg for a column name. So following that would mean

```python
data.groupby([
    xr.BinGrouper(""x0"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
    xr.UniqueGrouper(""y"", labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
    xr.TimeResampleGrouper(""time"", freq=""MS"")
    ],
)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1341296800,https://api.github.com/repos/pydata/xarray/issues/6610,1341296800,IC_kwDOAMm_X85P8pCg,1217238,2022-12-07T17:12:05Z,2022-12-07T17:12:05Z,MEMBER,"I also like the idea of creating specific Grouper objects for different types of selection, e.g., `UniqueGrouper` (the default), `BinGrouper`, `TimeResampleGrouper`, etc.","{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1341289782,https://api.github.com/repos/pydata/xarray/issues/6610,1341289782,IC_kwDOAMm_X85P8nU2,35968931,2022-12-07T17:07:08Z,2022-12-07T17:07:08Z,MEMBER,Using `xr.Grouper` has the advantage that you don't have to start guessing about whether or not the user wanted some complicated behaviour (especially if their input is slightly wrong somehow and you have to raise an informative error). Simple defaults would get left as is and complex use cases can be explicit and opt-in.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1329680642,https://api.github.com/repos/pydata/xarray/issues/6610,1329680642,IC_kwDOAMm_X85PQVEC,2448579,2022-11-28T19:58:29Z,2022-11-28T23:23:42Z,MEMBER,"In https://github.com/xarray-contrib/flox/issues/191 @keewis proposes a much nicer API for multiple variables:

```python
data.groupby(
    xr.Grouper(by=""x"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
    xr.Grouper(by=data.y, labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
    xr.Grouper(by=""time"", freq=""MS""),  # resample
)
```

Note [`pd.Grouper`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Grouper.html) uses `key` instead of `by` so that's a possibility too.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701