html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/5734#issuecomment-1126852038,https://api.github.com/repos/pydata/xarray/issues/5734,1126852038,IC_kwDOAMm_X85DKmXG,2448579,2022-05-15T03:31:50Z,2022-05-15T03:31:50Z,MEMBER,and @andersy005 !,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-1125236627,https://api.github.com/repos/pydata/xarray/issues/5734,1125236627,IC_kwDOAMm_X85DEb-T,2448579,2022-05-12T17:14:13Z,2022-05-12T17:14:13Z,MEMBER,"> a global/context option that changes the default value of method
Unfortunately, the optimal method depends on the distribution of group labels across chunks, so a global option doesn't make sense. It would make sense to add a `method=""auto""` option and use that, but it doesn't exist yet (""cohorts"" is the closest).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-1124194834,https://api.github.com/repos/pydata/xarray/issues/5734,1124194834,IC_kwDOAMm_X85DAdoS,2448579,2022-05-11T19:14:24Z,2022-05-11T19:15:44Z,MEMBER,"Thanks for testing it out! I was going to ping xclim when this finally got merged. Presumably you haven't found any bugs?
---
You can pass `method` as `.mean(..., method=...)`. Clearly this needs docs :)
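A rough, illustrative sketch of that (not part of this PR's diff; it assumes flox is installed and the data are dask-backed):

```
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical example data: two years of daily values, chunked so that
# each year spans several 100-element chunks along ""time"".
time = pd.date_range(""2000-01-01"", periods=730, freq=""D"")
da = xr.DataArray(
    np.random.rand(time.size), coords={""time"": time}, dims=""time""
).chunk({""time"": 100})

# Forward flox's `method` through the reduction:
monthly = da.groupby(""time.month"").mean(method=""cohorts"")
```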
We could actually consider adding `flox_kwargs` to the groupby constructor, since the method really only depends on the distribution of group labels across the chunks. Right now, though, I'd just like this to get merged :)
For resampling-type operations, we use ""cohorts"" by default, which generalizes to ""blockwise"" when applicable but is slower at graph-construction time. Note that you can only use ""blockwise"" if all members of a group are in a single block. So if you are resampling to yearly frequency but a year of data occupies multiple chunks, you want ""cohorts"", not ""blockwise"".","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-1117613319,https://api.github.com/repos/pydata/xarray/issues/5734,1117613319,IC_kwDOAMm_X85CnW0H,2448579,2022-05-04T17:26:08Z,2022-05-04T17:26:08Z,MEMBER,Thanks @Illviljan! I'm having trouble getting the inheritance order right and keeping mypy happy. Help is very welcome!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-1117497457,https://api.github.com/repos/pydata/xarray/issues/5734,1117497457,IC_kwDOAMm_X85Cm6hx,2448579,2022-05-04T15:33:30Z,2022-05-04T15:34:06Z,MEMBER,"@pydata/xarray This is ready to go. It's mostly one adaptor function and a lot of new tests. It does need docs; I can add those in a future PR.
By default, we use a strategy (""split-reduce"") that is very similar to our current one with dask arrays, so users will have to [explicitly choose a new strategy](https://flox.readthedocs.io/en/latest/implementation.html) to see much improvement. For resampling, we can choose a sensible default (""cohorts"") that should show only improvements and no regressions.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-1092097037,https://api.github.com/repos/pydata/xarray/issues/5734,1092097037,IC_kwDOAMm_X85BGBQN,2448579,2022-04-07T19:00:28Z,2022-04-07T19:00:28Z,MEMBER,@pydata/xarray this is blocked by https://github.com/pydata/xarray/issues/6430 but is ready for review.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-966624963,https://api.github.com/repos/pydata/xarray/issues/5734,966624963,IC_kwDOAMm_X845nYbD,2448579,2021-11-11T21:05:54Z,2021-11-11T21:05:54Z,MEMBER,This builds on #5950 so that should be reviewed and merged first.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-933160264,https://api.github.com/repos/pydata/xarray/issues/5734,933160264,IC_kwDOAMm_X843nuVI,2448579,2021-10-04T05:42:51Z,2021-11-11T20:58:53Z,MEMBER,"!!!
The only failures are in `test_units.py`, so now I think we can figure out how to implement this cleanly.
```
FAILED xarray/tests/test_units.py::TestDataArray::test_computation_objects[float64-method_groupby-data]
FAILED xarray/tests/test_units.py::TestDataArray::test_computation_objects[float64-method_groupby_bins-data]
FAILED xarray/tests/test_units.py::TestDataArray::test_computation_objects[int64-method_groupby-data]
FAILED xarray/tests/test_units.py::TestDataArray::test_computation_objects[int64-method_groupby_bins-data]
FAILED xarray/tests/test_units.py::TestDataArray::test_resample[float64] - pi...
FAILED xarray/tests/test_units.py::TestDataArray::test_resample[int64] - pint...
FAILED xarray/tests/test_units.py::TestDataset::test_computation_objects[float64-data-method_groupby_bins]
FAILED xarray/tests/test_units.py::TestDataset::test_computation_objects[int64-data-method_groupby_bins]
FAILED xarray/tests/test_units.py::TestDataset::test_resample[float64-data]
FAILED xarray/tests/test_units.py::TestDataset::test_resample[int64-data] - p...
```
I like @max-sixty's suggestion of generating the reductions like `generate_ops.py`. It seems like a good first step would be to refactor the existing reductions in a separate PR.
","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-965574480,https://api.github.com/repos/pydata/xarray/issues/5734,965574480,IC_kwDOAMm_X845jX9Q,2448579,2021-11-10T17:31:22Z,2021-11-10T21:52:17Z,MEMBER,OK CI isn't using the numpy_groupies code path for reasons I don't understand. Does anyone see a reason why this might happen?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-964409224,https://api.github.com/repos/pydata/xarray/issues/5734,964409224,IC_kwDOAMm_X845e7eI,2448579,2021-11-09T18:14:38Z,2021-11-09T18:14:38Z,MEMBER,"Benchmarks are looking good (npg=True means numpy_groupies was used). Big gains (10-20x) for a large number of groups (100), especially with dask.
```
[ 2.78%] ··· groupby.GroupBy.time_agg_large_num_groups ok
[ 2.78%] ··· ======== ========== =========== ========== ===========
-- ndim / npg
-------- ---------------------------------------------
method 1 / True 1 / False 2 / True 2 / False
======== ========== =========== ========== ===========
sum 8.38±0ms 101±0ms 9.54±0ms 136±0ms
mean 7.12±0ms 101±0ms 9.74±0ms 148±0ms
======== ========== =========== ========== ===========
[ 5.56%] ··· groupby.GroupBy.time_agg_small_num_groups ok
[ 5.56%] ··· ======== ========== =========== ========== ===========
-- ndim / npg
-------- ---------------------------------------------
method 1 / True 1 / False 2 / True 2 / False
======== ========== =========== ========== ===========
sum 8.27±0ms 4.55±0ms 9.07±0ms 8.46±0ms
mean 7.19±0ms 4.50±0ms 9.24±0ms 8.36±0ms
======== ========== =========== ========== ===========
[ 8.33%] ··· groupby.GroupBy.time_init ok
[ 8.33%] ··· ====== ==========
ndim
------ ----------
1 1.72±0ms
2 4.06±0ms
====== ==========
[ 11.11%] ··· groupby.GroupByDask.time_agg_large_num_groups ok
[ 11.11%] ··· ======== ========== =========== ========== ===========
-- ndim / npg
-------- ---------------------------------------------
method 1 / True 1 / False 2 / True 2 / False
======== ========== =========== ========== ===========
sum 8.41±0ms 202±0ms 9.93±0ms 226±0ms
mean 7.83±0ms 197±0ms 10.7±0ms 213±0ms
======== ========== =========== ========== ===========
[ 13.89%] ··· groupby.GroupByDask.time_agg_small_num_groups ok
[ 13.89%] ··· ======== ========== =========== ========== ===========
-- ndim / npg
-------- ---------------------------------------------
method 1 / True 1 / False 2 / True 2 / False
======== ========== =========== ========== ===========
sum 8.41±0ms 8.99±0ms 10.5±0ms 12.5±0ms
mean 7.98±0ms 8.67±0ms 10.1±0ms 12.2±0ms
======== ========== =========== ========== ===========
[ 16.67%] ··· groupby.GroupByDask.time_init ok
[ 16.67%] ··· ====== ==========
ndim
------ ----------
1 1.77±0ms
2 4.06±0ms
====== ==========
```
```
[ 36.11%] ··· groupby.Resample.time_agg_large_num_groups ok
[ 36.11%] ··· ======== ========== =========== ========== ===========
-- ndim / npg
-------- ---------------------------------------------
method 1 / True 1 / False 2 / True 2 / False
======== ========== =========== ========== ===========
sum 17.2±0ms 83.3±0ms 17.0±0ms 93.5±0ms
mean 15.5±0ms 91.0±0ms 17.4±0ms 101±0ms
======== ========== =========== ========== ===========
[ 38.89%] ··· groupby.Resample.time_agg_small_num_groups ok
[ 38.89%] ··· ======== ========== =========== ========== ===========
-- ndim / npg
-------- ---------------------------------------------
method 1 / True 1 / False 2 / True 2 / False
======== ========== =========== ========== ===========
sum 16.7±0ms 12.3±0ms 16.7±0ms 13.3±0ms
mean 15.2±0ms 12.5±0ms 19.3±0ms 13.9±0ms
======== ========== =========== ========== ===========
[ 41.67%] ··· groupby.Resample.time_init ok
[ 41.67%] ··· ====== ==========
ndim
------ ----------
1 7.46±0ms
2 7.26±0ms
====== ==========
[ 44.44%] ··· groupby.ResampleDask.time_agg_large_num_groups ok
[ 44.44%] ··· ======== ========== =========== ========== ===========
-- ndim / npg
-------- ---------------------------------------------
method 1 / True 1 / False 2 / True 2 / False
======== ========== =========== ========== ===========
sum 22.3±0ms 561±0ms 28.3±0ms 607±0ms
mean 22.2±0ms 344±0ms 27.3±0ms 371±0ms
======== ========== =========== ========== ===========
[ 47.22%] ··· groupby.ResampleDask.time_agg_small_num_groups ok
[ 47.22%] ··· ======== ========== =========== ========== ===========
-- ndim / npg
-------- ---------------------------------------------
method 1 / True 1 / False 2 / True 2 / False
======== ========== =========== ========== ===========
sum 17.7±0ms 31.2±0ms 20.0±0ms 34.2±0ms
mean 17.2±0ms 24.4±0ms 19.9±0ms 26.6±0ms
======== ========== =========== ========== ===========
[ 50.00%] ··· groupby.ResampleDask.time_init ok
[ 50.00%] ··· ====== ==========
ndim
------ ----------
1 7.43±0ms
2 6.91±0ms
                   ====== ==========
```","{""total_count"": 3, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 3, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-963568052,https://api.github.com/repos/pydata/xarray/issues/5734,963568052,IC_kwDOAMm_X845buG0,2448579,2021-11-08T21:00:04Z,2021-11-08T21:00:04Z,MEMBER,"> Maybe it's also on the pint side? Even if numpy_groupies supports the like argument it will crash because pint doesn't support asanyarray.
cc @keewis ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-954290212,https://api.github.com/repos/pydata/xarray/issues/5734,954290212,IC_kwDOAMm_X8444VAk,2448579,2021-10-28T23:11:08Z,2021-10-28T23:11:08Z,MEMBER,"> appears numpy_groupies is forcing the duck arrays to numpy arrays.
yes; this will require upstream changes","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586
https://github.com/pydata/xarray/pull/5734#issuecomment-913070347,https://api.github.com/repos/pydata/xarray/issues/5734,913070347,IC_kwDOAMm_X842bFkL,2448579,2021-09-05T01:52:49Z,2021-09-05T01:52:49Z,MEMBER,We don't have any asv benchmarks for groupby currently. It would be good to add some! ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,978356586