html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/2852#issuecomment-1101512178,https://api.github.com/repos/pydata/xarray/issues/2852,1101512178,IC_kwDOAMm_X85Bp73y,2448579,2022-04-18T15:45:41Z,2022-04-18T15:45:41Z,MEMBER,"You can do this with [flox](https://flox.readthedocs.io/en/latest/generated/flox.xarray.xarray_reduce.html) now. Eventually we can update xarray to support grouping by a dask variable. The limitation will be that the user will have to provide ""expected groups"" so that we can construct the output coordinate.","{""total_count"": 2, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 2, ""rocket"": 0, ""eyes"": 0}",,425320466 https://github.com/pydata/xarray/issues/2852#issuecomment-653016746,https://api.github.com/repos/pydata/xarray/issues/2852,653016746,MDEyOklzc3VlQ29tbWVudDY1MzAxNjc0Ng==,1197350,2020-07-02T13:48:39Z,2020-07-02T13:48:39Z,MEMBER,👀 cc @chiaral ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,425320466 https://github.com/pydata/xarray/issues/2852#issuecomment-478621867,https://api.github.com/repos/pydata/xarray/issues/2852,478621867,MDEyOklzc3VlQ29tbWVudDQ3ODYyMTg2Nw==,1217238,2019-04-01T15:16:30Z,2019-04-01T15:16:30Z,MEMBER,Roughly how many unique labels do you have?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,425320466 https://github.com/pydata/xarray/issues/2852#issuecomment-478563375,https://api.github.com/repos/pydata/xarray/issues/2852,478563375,MDEyOklzc3VlQ29tbWVudDQ3ODU2MzM3NQ==,2448579,2019-04-01T12:43:03Z,2019-04-01T12:43:03Z,MEMBER,It sounds like there is an apply_ufunc solution to your problem but I don't know how to write it! 
;),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,425320466 https://github.com/pydata/xarray/issues/2852#issuecomment-478415169,https://api.github.com/repos/pydata/xarray/issues/2852,478415169,MDEyOklzc3VlQ29tbWVudDQ3ODQxNTE2OQ==,1217238,2019-04-01T02:31:58Z,2019-04-01T02:31:58Z,MEMBER,"The current design of `GroupBy.apply()` in xarray is entirely ignorant of dask: it simply uses a `for` loop over the grouped variable to build up a computation with high level array operations. This makes operations that group over large keys stored in dask inefficient. This *could* be done efficiently (`dask.dataframe` does this, and might be worth trying in your case) but it's a more challenging distributed computing problem, and xarray's current data model would not know how large of a dimension to create for the returned arrays (doing this properly would require supporting arrays with unknown dimension sizes).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,425320466 https://github.com/pydata/xarray/issues/2852#issuecomment-476678007,https://api.github.com/repos/pydata/xarray/issues/2852,476678007,MDEyOklzc3VlQ29tbWVudDQ3NjY3ODAwNw==,1197350,2019-03-26T14:41:59Z,2019-03-26T14:41:59Z,MEMBER,"``` label (y, x) uint16 dask.array ... geoms_ds.groupby('label')` ``` It is very hard to make this sort of groupby lazy, because you are grouping over the variable `label` itself. Groupby uses a split-apply-combine paradigm to transform the data. The apply and combine steps can be lazy. But the split step cannot. Xarray uses the group variable to determine how to index the array, i.e. which items belong in which group. To do this, it needs to read the _whole variable_ into memory. In this specific example, it sounds like what you want is to compute the histogram of labels. That could be accomplished without groupby. 
For example, you could use apply_ufunc together with [`dask.array.histogram`](http://docs.dask.org/en/latest/array-api.html#dask.array.histogram). So my recommendation is to think of a way to accomplish what you want that does not involve groupby.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,425320466