issues: 711626733
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
711626733 | MDU6SXNzdWU3MTE2MjY3MzM= | 4473 | Wrap numpy-groupies to speed up Xarray's groupby aggregations | 1217238 | closed | 0 | 8 | 2020-09-30T04:43:04Z | 2022-05-15T02:38:29Z | 2022-05-15T02:38:29Z | MEMBER | Is your feature request related to a problem? Please describe. Xarray's groupby aggregations (e.g., Describe the solution you'd like We could speed things up considerably (easily 100x) by wrapping the numpy-groupies package. Additional context One challenge is how to handle dask arrays (and other duck arrays). In some cases it might make sense to apply the numpy-groupies function (using apply_ufunc), but in other cases it might be better to stick with the current indexing + concatenate solution. We could either pick some simple heuristics for choosing the algorithm to use on dask arrays, or we could just stick with the current algorithm for now. In particular, it might make sense to stick with the current algorithm if there are many chunks in the arrays to be aggregated along the "grouped" dimension (depending on the number of unique group values). |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4473/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
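
The two strategies contrasted in the issue body can be sketched in plain NumPy. This is only an illustration under stated assumptions: the function names `grouped_mean_loop` and `grouped_mean_vectorized` are hypothetical, the input is a 1-D in-memory array, and neither function is xarray's or numpy-groupies' actual implementation. The first mimics the "indexing + concatenate" approach; the second computes the same result in a single pass over integer group codes, which is the general idea behind a vectorized grouped reduction.

```python
import numpy as np


def grouped_mean_loop(values, labels):
    """Baseline: index out each group, reduce it, then concatenate the results."""
    groups = np.unique(labels)
    pieces = [values[labels == g].mean(keepdims=True) for g in groups]
    return groups, np.concatenate(pieces)


def grouped_mean_vectorized(values, labels):
    """Vectorized: map labels to integer codes, then accumulate per code."""
    groups, codes = np.unique(labels, return_inverse=True)
    # Per-group sums and counts, each computed in one pass over the data.
    sums = np.bincount(codes, weights=values, minlength=groups.size)
    counts = np.bincount(codes, minlength=groups.size)
    return groups, sums / counts


values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
labels = np.array([0, 1, 0, 1, 2])

groups_loop, means_loop = grouped_mean_loop(values, labels)
groups_vec, means_vec = grouped_mean_vectorized(values, labels)
```

The loop version does one boolean mask and one reduction per group, so its cost grows with the number of groups, while the vectorized version does a fixed number of passes regardless of group count. How this trade-off plays out for chunked dask arrays is the open question raised in the issue and is not addressed by this sketch.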