html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6610#issuecomment-1523666774,https://api.github.com/repos/pydata/xarray/issues/6610,1523666774,IC_kwDOAMm_X85a0U9W,2448579,2023-04-26T15:59:06Z,2023-04-26T16:06:17Z,MEMBER,"We voted to move forward with this API:

```python
data.groupby({
    ""x0"": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
    ""y"": xr.UniqueGrouper(labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
    ""time"": xr.TimeResampleGrouper(freq=""MS"")
},
)
```

We won't break backwards-compatibility for `da.groupby(other_data_array)` but for any complicated use cases with `Grouper` the user must add the `by` variable to the xarray object, and refer to it by name in the dictionary as above.","{""total_count"": 4, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 1, ""eyes"": 1}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1498463195,https://api.github.com/repos/pydata/xarray/issues/6610,1498463195,IC_kwDOAMm_X85ZULvb,2448579,2023-04-06T04:07:05Z,2023-04-26T15:52:21Z,MEMBER,"Here's a question. In #7561, I implement `Grouper` objects that don't have any information about the variable we're grouping by. So the future API would be:

```python
data.groupby({
    ""x0"": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
    ""y"": xr.UniqueGrouper(labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
    ""time"": xr.TimeResampleGrouper(freq=""MS"")
},
)
```

Does this look OK or do we want to support passing the DataArray or variable name as a `by` kwarg:

```python
xr.BinGrouper(by=""x0"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""]))
```

This syntax would support passing `DataArray` in `by` so `xr.UniqueGrouper(by=data.y)` for example. Is that an important use case to support?
In #7561, I create new `ResolvedGrouper` objects that do contain `by` as a DataArray always, so it's really a question of exposing that to the user.

PS: [Pandas](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Grouper.html) has a `key` kwarg for a column name. So following that would mean

```python
data.groupby([
    xr.BinGrouper(""x0"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
    xr.UniqueGrouper(""y"", labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
    xr.TimeResampleGrouper(""time"", freq=""MS"")
],
)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1329680642,https://api.github.com/repos/pydata/xarray/issues/6610,1329680642,IC_kwDOAMm_X85PQVEC,2448579,2022-11-28T19:58:29Z,2022-11-28T23:23:42Z,MEMBER,"In https://github.com/xarray-contrib/flox/issues/191 @keewis proposes a much nicer API for multiple variables:

```python
data.groupby(
    xr.Grouper(by=""x"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
    xr.Grouper(by=data.y, labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
    xr.Grouper(by=""time"", freq=""MS""),  # resample
)
```

Note [`pd.Grouper`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Grouper.html) uses `key` instead of `by` so that's a possibility too.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701