id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1888576440,I_kwDOAMm_X85wkWO4,8162,Update group by multi index,2448579,open,0,,,0,2023-09-09T04:50:29Z,2023-09-09T04:50:39Z,,MEMBER,,,,"Ideally `GroupBy._infer_concat_args()` would return a `xr.Coordinates` object that contains both the coordinate(s) and their (multi-)index to assign to the result (combined) object.

The goal is to avoid calling `create_default_index_implicit(coord)` below where `coord` is a `pd.MultiIndex` or a single `IndexVariable` wrapping a multi-index. If `coord` is a `Coordinates` object, we could do `combined = combined.assign_coords(coord)` instead.

https://github.com/pydata/xarray/blob/e2b6f3468ef829b8a83637965d34a164bf3bca78/xarray/core/groupby.py#L1573-L1587

There are actually more general issues:

- Since the `group` parameter of `Dataset.groupby` is a single variable or variable name, it won't be possible to do groupby on a full pandas multi-index once we drop its dimension coordinate (#8143). How can we still support it? Maybe by passing a dimension name to `group` and checking that there's only one index for that dimension?
- How can we support custom, multi-coordinate indexes with groupby? I don't have any practical example in mind, but in theory just passing a single coordinate name as `group` will invalidate the index. Should we drop the index in the result? Or, as suggested above, pass a dimension name as `group` and check the index?

_Originally posted by @benbovy in https://github.com/pydata/xarray/issues/8140#issuecomment-1709775666_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8162/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue