html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6610#issuecomment-1523666774,https://api.github.com/repos/pydata/xarray/issues/6610,1523666774,IC_kwDOAMm_X85a0U9W,2448579,2023-04-26T15:59:06Z,2023-04-26T16:06:17Z,MEMBER,"We voted to move forward with this API:
```python
data.groupby(
    {
        ""x0"": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
        ""y"": xr.UniqueGrouper(labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
        ""time"": xr.TimeResampleGrouper(freq=""MS""),
    },
)
```
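
A toy, pure-Python sketch of what referring to a variable by name means in this design: each grouper holds only its own parameters, and the dictionary key selects the variable it is applied to. The `ToyBinGrouper`, `ToyUniqueGrouper`, and `resolve` names are hypothetical and are not xarray API; the left-closed bins are a simplification chosen for brevity.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class ToyBinGrouper:
    '''Hypothetical stand-in: stores bin edges only, not the variable to group by.'''

    edges: list

    def codes(self, values):
        # Map each value to the index of the bin [edges[i], edges[i + 1]);
        # values that fall outside every bin get the code -1.
        out = []
        for v in values:
            i = bisect_right(self.edges, v) - 1
            out.append(i if 0 <= i < len(self.edges) - 1 else -1)
        return out


@dataclass
class ToyUniqueGrouper:
    '''Hypothetical stand-in: stores the expected labels only.'''

    labels: list

    def codes(self, values):
        # Map each value to the position of its label; unknown labels get -1.
        lookup = {label: i for i, label in enumerate(self.labels)}
        return [lookup.get(v, -1) for v in values]


def resolve(variables, groupers):
    '''Pair each grouper with the variable named by its dictionary key.'''
    return {name: grouper.codes(variables[name]) for name, grouper in groupers.items()}


# The variables live on the (toy) data object; the groupers never see them directly.
variables = {'x0': [0.5, 1.5, 2.5, 9.0], 'y': ['a', 'c', 'b', 'a']}
codes = resolve(
    variables,
    {
        'x0': ToyBinGrouper(edges=[0, 1, 2, 3]),
        'y': ToyUniqueGrouper(labels=['a', 'b', 'c']),
    },
)
print(codes)
```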

We won't break backwards compatibility for `da.groupby(other_data_array)`, but for any complicated use cases with `Grouper` the user must add the `by` variable to the xarray object and refer to it by name in the dictionary, as above.","{""total_count"": 4, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 1, ""eyes"": 1}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1498463195,https://api.github.com/repos/pydata/xarray/issues/6610,1498463195,IC_kwDOAMm_X85ZULvb,2448579,2023-04-06T04:07:05Z,2023-04-26T15:52:21Z,MEMBER,"Here's a question.

In #7561, I implement `Grouper` objects that don't carry any information about the variable we're grouping by. So the future API would be:

``` python
data.groupby(
    {
        ""x0"": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
        ""y"": xr.UniqueGrouper(labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
        ""time"": xr.TimeResampleGrouper(freq=""MS""),
    },
)
```

Does this look OK, or do we want to support passing the DataArray or variable name as a `by` kwarg:
```python
xr.BinGrouper(by=""x0"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""]))
``` 

This syntax would support passing a `DataArray` as `by`, for example `xr.UniqueGrouper(by=data.y)`. Is that an important use case to support? In #7561, I create new `ResolvedGrouper` objects that always contain `by` as a DataArray, so it's really a question of whether to expose that to the user.

PS: [Pandas](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Grouper.html) has a `key` kwarg for a column name. Following that convention would mean:

``` python
data.groupby(
    [
        xr.BinGrouper(""x0"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
        xr.UniqueGrouper(""y"", labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
        xr.TimeResampleGrouper(""time"", freq=""MS""),
    ],
)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1341296800,https://api.github.com/repos/pydata/xarray/issues/6610,1341296800,IC_kwDOAMm_X85P8pCg,1217238,2022-12-07T17:12:05Z,2022-12-07T17:12:05Z,MEMBER,"I also like the idea of creating specific Grouper objects for different types of selection, e.g., `UniqueGrouper` (the default), `BinGrouper`, `TimeResampleGrouper`, etc.","{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1341289782,https://api.github.com/repos/pydata/xarray/issues/6610,1341289782,IC_kwDOAMm_X85P8nU2,35968931,2022-12-07T17:07:08Z,2022-12-07T17:07:08Z,MEMBER,Using `xr.Grouper` has the advantage that you don't have to start guessing about whether or not the user wanted some complicated behaviour (especially if their input is slightly wrong somehow and you have to raise an informative error). Simple defaults would get left as is and complex use cases can be explicit and opt-in.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1329680642,https://api.github.com/repos/pydata/xarray/issues/6610,1329680642,IC_kwDOAMm_X85PQVEC,2448579,2022-11-28T19:58:29Z,2022-11-28T23:23:42Z,MEMBER,"In https://github.com/xarray-contrib/flox/issues/191 @keewis proposes a much nicer API for multiple variables:

``` python
data.groupby(
    xr.Grouper(by=""x"", bins=pd.IntervalIndex.from_breaks(coords[""x_vertices""])),  # binning
    xr.Grouper(by=data.y, labels=[""a"", ""b"", ""c""]),  # categorical, data.y is dask-backed
    xr.Grouper(by=""time"", freq=""MS""),  # resample
)
```

Note [`pd.Grouper`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Grouper.html) uses `key` instead of `by` so that's a possibility too.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701
https://github.com/pydata/xarray/issues/6610#issuecomment-1128588208,https://api.github.com/repos/pydata/xarray/issues/6610,1128588208,IC_kwDOAMm_X85DROOw,22245117,2022-05-17T08:40:04Z,2022-05-17T15:04:04Z,CONTRIBUTOR,"I'm getting errors with multi-indexes and `flox`. Is this expected and related to this issue, or should I open a separate issue?

```python
import numpy as np

import xarray as xr

ds = xr.Dataset(
    dict(a=((""z"",), np.ones(10))),
    coords=dict(b=((""z""), np.arange(2).repeat(5)), c=((""z""), np.arange(5).repeat(2))),
).set_index(bc=[""b"", ""c""])
grouped = ds.groupby(""bc"")

with xr.set_options(use_flox=False):
    grouped.sum()  # OK

with xr.set_options(use_flox=True):
    grouped.sum()  # Error
```
```
Traceback (most recent call last):
  File ""/Users/mattia/MyGit/test.py"", line 15, in <module>
    grouped.sum()
  File ""/Users/mattia/MyGit/xarray/xarray/core/_reductions.py"", line 2763, in sum
    return self._flox_reduce(
  File ""/Users/mattia/MyGit/xarray/xarray/core/groupby.py"", line 661, in _flox_reduce
    result = xarray_reduce(
  File ""/Users/mattia/mambaforge/envs/sarsen_dev/lib/python3.10/site-packages/flox/xarray.py"", line 373, in xarray_reduce
    actual[k] = v.expand_dims(missing_group_dims)
  File ""/Users/mattia/MyGit/xarray/xarray/core/dataset.py"", line 1427, in __setitem__
    self.update({key: value})
  File ""/Users/mattia/MyGit/xarray/xarray/core/dataset.py"", line 4432, in update
    merge_result = dataset_update_method(self, other)
  File ""/Users/mattia/MyGit/xarray/xarray/core/merge.py"", line 1070, in dataset_update_method
    return merge_core(
  File ""/Users/mattia/MyGit/xarray/xarray/core/merge.py"", line 722, in merge_core
    aligned = deep_align(
  File ""/Users/mattia/MyGit/xarray/xarray/core/alignment.py"", line 824, in deep_align
    aligned = align(
  File ""/Users/mattia/MyGit/xarray/xarray/core/alignment.py"", line 761, in align
    aligner.align()
  File ""/Users/mattia/MyGit/xarray/xarray/core/alignment.py"", line 550, in align
    self.assert_unindexed_dim_sizes_equal()
  File ""/Users/mattia/MyGit/xarray/xarray/core/alignment.py"", line 450, in assert_unindexed_dim_sizes_equal
    raise ValueError(
ValueError: cannot reindex or align along dimension 'bc' because of conflicting dimension sizes: {10, 6} (note: an index is found along that dimension with size=10)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1236174701