html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/324#issuecomment-1054569287,https://api.github.com/repos/pydata/xarray/issues/324,1054569287,IC_kwDOAMm_X84-23NH,2448579,2022-02-28T19:03:17Z,2022-02-28T19:03:17Z,MEMBER,"I have this almost ready in [flox](https://github.com/dcherian/flox/pull/76/) (needs more tests). So we should be able to do this soon.
In the meantime, note that we can view grouping over multiple variables as a ""factorization"" (group identification) problem for aggregations. That means you can
1. use `pd.factorize`, `pd.cut`, `np.searchsorted` or `np.bincount` to convert each `by` variable to an integer code,
2. then use `np.ravel_multi_index` to combine the codes to a single variable `idx`
3. Group by `idx` and accumulate
4. use `np.unravel_index` (or just a simple `np.reshape`) to convert the single grouped dimension to multiple dimensions.
5. Construct output coordinate arrays.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,58117200
https://github.com/pydata/xarray/issues/324#issuecomment-531964854,https://api.github.com/repos/pydata/xarray/issues/324,531964854,MDEyOklzc3VlQ29tbWVudDUzMTk2NDg1NA==,1217238,2019-09-16T21:26:21Z,2019-09-16T21:26:21Z,MEMBER,Still relevant.,"{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,58117200
https://github.com/pydata/xarray/issues/324#issuecomment-336983333,https://api.github.com/repos/pydata/xarray/issues/324,336983333,MDEyOklzc3VlQ29tbWVudDMzNjk4MzMzMw==,1217238,2017-10-16T18:24:33Z,2017-10-16T18:24:33Z,MEMBER,"> Is use case 1 (Multiple groupby arguments along a single dimension) being held back for use case 2 (Multiple groupby arguments along different dimensions)? Use case 1 would be very useful by itself.
No, I think the biggest issue is that grouping variables into a `MultiIndex` on the result sort of works (with the current PR https://github.com/pydata/xarray/pull/924), but it's very easy to end up with weird conflicts between coordinates / MultiIndex levels that are hard to resolve right now within the xarray data model. Probably it would be best to resolve https://github.com/pydata/xarray/issues/1603 first, which will make this much easier.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,58117200
https://github.com/pydata/xarray/issues/324#issuecomment-131891348,https://api.github.com/repos/pydata/xarray/issues/324,131891348,MDEyOklzc3VlQ29tbWVudDEzMTg5MTM0OA==,5356122,2015-08-17T17:04:44Z,2015-08-17T17:04:44Z,MEMBER,"For (2) I think it makes sense to extend the existing groupby to deal with multiple dimensions, i.e., let it take an iterable of dimension names.
```
>>> darray.groupby(['lat', 'lon'])
```
Then we'd have something similar to the [SQL](http://www.w3schools.com/sql/sql_groupby.asp) groupby, which is a good thing.
By the way, in #527 we were considering using this approach to make the faceted plots on both rows and columns.
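For comparison, here is a rough sketch of the SQL-style behaviour using pandas rather than xray. The `records` table and its `lat`, `lon` and `temp` columns are made up for illustration; this only shows what grouping by two keys returns, not how xray would implement it.

```python
import pandas as pd

# Hypothetical long-format table: one row per observation.
records = pd.DataFrame({
    'lat': [10, 10, 20, 20, 10],
    'lon': [0, 5, 0, 5, 0],
    'temp': [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Similar in spirit to:
#   SELECT lat, lon, AVG(temp) FROM records GROUP BY lat, lon
means = records.groupby(['lat', 'lon'])['temp'].mean()

# One value per distinct (lat, lon) pair, indexed by a two-level MultiIndex.
print(means)
```

Each distinct `(lat, lon)` pair becomes one group, which is the behaviour an iterable of dimension names would presumably give.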
","{""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,58117200
https://github.com/pydata/xarray/issues/324#issuecomment-131878081,https://api.github.com/repos/pydata/xarray/issues/324,131878081,MDEyOklzc3VlQ29tbWVudDEzMTg3ODA4MQ==,2443309,2015-08-17T16:20:14Z,2015-08-17T16:20:14Z,MEMBER,"Agreed, we have two use cases here.
For (1), can we just use the pandas grouping infrastructure? We just need to allow `xray.DataArray.groupby` to support an iterable and `pandas.Grouper` objects. I personally don't like the MultiIndex format and prefer to unstack the grouper operations when possible. In xray, I think we can justify going that route since we support N-D labeled dimensions much better than pandas.
For (2), I'll need to think a bit more about how this would work. Do we add a groupby method to `DataArrayGroupBy`? That sounds messy. Maybe we need to write an N-D grouper object?
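To make the pandas side of (1) concrete, here is a small sketch of mixing a `pandas.Grouper` with a second key and then unstacking the result. The `frame`, `station` and `value` names are invented for the example, and nothing here is xray API.

```python
import numpy as np
import pandas as pd

# Hypothetical data: four days of hourly values alternating between two stations.
index = pd.Timestamp('2015-01-01') + pd.to_timedelta(np.arange(96), unit='h')
frame = pd.DataFrame(
    {'station': ['a', 'b'] * 48, 'value': np.arange(96.0)},
    index=index,
)

# A Grouper (daily bins on the index) combined with a column name.
daily = frame.groupby([pd.Grouper(freq='D'), 'station'])['value'].mean()

# pandas returns a 1-D result with a MultiIndex; unstacking gives a 2-D table.
wide = daily.unstack('station')
print(wide)
```

The unstacked `wide` table (days by stations) is the shape I would want xray to return directly.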
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,58117200
https://github.com/pydata/xarray/issues/324#issuecomment-131599877,https://api.github.com/repos/pydata/xarray/issues/324,131599877,MDEyOklzc3VlQ29tbWVudDEzMTU5OTg3Nw==,2443309,2015-08-16T18:51:05Z,2015-08-17T16:07:41Z,MEMBER,"@shoyer -
I want to look into putting a PR together for this. I'm looking for the same functionality that you get with a pandas Series or DataFrame:
``` Python
data.groupby([lambda x: x.hour, lambda x: x.timetuple().tm_yday]).mean()
```
The motivation comes from making a [Hovmoller diagram](https://en.wikipedia.org/wiki/Hovm%C3%B6ller_diagram). What we need is this functionality:
``` Python
da.groupby(['time.hour', 'time.dayofyear']).mean().plot()
```
If you can point me in the right direction, I'll see if I can put something together.
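For reference, a self-contained version of the pandas behaviour described above, run on made-up hourly data (the values are arbitrary). It is only a sketch of the target result, not a proposal for the implementation.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three days of hourly values starting 2015-01-01.
index = pd.Timestamp('2015-01-01') + pd.to_timedelta(np.arange(72), unit='h')
data = pd.Series(np.arange(72, dtype=float), index=index)

# Two grouping keys along the same time axis, as in the pandas snippet above.
grouped = data.groupby(
    [lambda x: x.hour, lambda x: x.timetuple().tm_yday]
).mean()

# The result is 1-D with a two-level MultiIndex (hour, day of year).
# Unstacking the last level gives the 2-D hour-by-day table to plot.
table = grouped.unstack()
print(table.shape)
```

The unstacked `table` has one row per hour and one column per day of year, which is the layout the plot needs.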
","{""total_count"": 7, ""+1"": 7, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,58117200
https://github.com/pydata/xarray/issues/324#issuecomment-131644079,https://api.github.com/repos/pydata/xarray/issues/324,131644079,MDEyOklzc3VlQ29tbWVudDEzMTY0NDA3OQ==,1217238,2015-08-17T00:13:47Z,2015-08-17T00:13:47Z,MEMBER,"@jhamman For your use case, both hour and dayofyear are along the time dimension, so arguably the result should be 1D with a MultiIndex instead of 2D. So it might make more sense to start with that, and then layer on stack/unstack or pivot functionality.
I guess there are two related use cases here:
1. Multiple groupby arguments along a single dimension (pandas does this one already)
2. Multiple groupby arguments along different dimensions (pandas doesn't do this one).
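To illustrate use case 2, here is a minimal NumPy-only sketch that groups a 2-D array by one label per row and one label per column. The array, the label names and the choice of a mean are all hypothetical, and this is not how xray does or would do it.

```python
import numpy as np

# Hypothetical 4x6 field with one label per row and one label per column.
data = np.arange(24, dtype=float).reshape(4, 6)
row_labels = np.array(['north', 'north', 'south', 'south'])
col_labels = np.array(['west', 'west', 'west', 'east', 'east', 'east'])

# Turn each set of labels into integer codes (np.unique sorts the groups).
row_groups, row_codes = np.unique(row_labels, return_inverse=True)
col_groups, col_codes = np.unique(col_labels, return_inverse=True)
shape = (row_groups.size, col_groups.size)

# Expand the codes to one (row code, col code) pair per cell,
# then combine each pair into a single flat group id.
rr, cc = np.meshgrid(row_codes, col_codes, indexing='ij')
flat_ids = np.ravel_multi_index((rr.ravel(), cc.ravel()), shape)

# Accumulate per flat group, then reshape back to (row group, col group).
n_groups = shape[0] * shape[1]
sums = np.bincount(flat_ids, weights=data.ravel(), minlength=n_groups)
counts = np.bincount(flat_ids, minlength=n_groups)
means = (sums / counts).reshape(shape)

print(row_groups, col_groups)
print(means)
```

Here the result is naturally 2-D, with one axis per grouped dimension, so no MultiIndex is involved.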
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,58117200