issue_comments: 1423144049
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | performed_via_github_app |
---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/4610#issuecomment-1423144049 | https://api.github.com/repos/pydata/xarray/issues/4610 | 1423144049 | IC_kwDOAMm_X85U03Rx | 35968931 | 2023-02-08T19:40:58Z | 2023-02-08T20:25:04Z | MEMBER | |

body:

Q: Use xhistogram approach or flox-powered approach?

@dcherian recently showed how his flox package can perform histograms as groupby-like reductions. This raises the question of which approach would be better to use for a histogram function in xarray. (This is related to, but better than, what we had tried previously with xarray groupby and numpy_groupies.) Here's a WIP notebook comparing the two approaches; a rough sketch of each call pattern is also given below, after the xrefs.

Both approaches can feasibly do:

- Histograms which leave some dimensions excluded (broadcast over),
- Multi-dimensional histograms (e.g. binning two different variables into one 2D histogram),
- Normalized histograms (returning PDFs instead of counts),
- Weighted histograms,
- Multi-dimensional bins (as @aaronspring asks for above, though this requires extra work; see how to do it with flox, and my stalled PR to xhistogram).

Pros of using flox-powered reductions:

Pros of using xhistogram's blockwise bincount approach:

Other thoughts:

xref https://github.com/xgcm/xhistogram/issues/60, https://github.com/xgcm/xhistogram/issues/28
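
Roughly, the two call patterns for the same 1D histogram might look something like the sketch below. This is only a sketch: the keyword names (`bins`/`dim` in xhistogram's `histogram`, and `func`/`expected_groups`/`isbin`/`dim` in flox's `xarray_reduce`) reflect my reading of the current APIs and may change.

```python
import numpy as np
import xarray as xr

from xhistogram.xarray import histogram  # xhistogram: blockwise bincount approach
import flox.xarray                        # flox: histogram as a groupby-like binned reduction

# Toy data: histogram a (time, x) variable along "time" only, broadcasting over "x".
da = xr.DataArray(
    np.random.randn(100, 5),
    dims=["time", "x"],
    name="temperature",  # xhistogram uses the name to label the new bins dimension
)
bins = np.linspace(-3, 3, 31)

# xhistogram-style call: one bin-edges array per input variable, reduced dims given explicitly.
h_xhist = histogram(da, bins=[bins], dim=["time"])

# flox-style call: group the variable by its own values, passing the bin edges as
# expected_groups with isbin=True, then count the members of each bin.
h_flox = flox.xarray.xarray_reduce(
    da,                  # values to reduce (each non-NaN value counts as 1)
    da,                  # variable whose values determine bin membership
    func="count",
    expected_groups=bins,
    isbin=True,
    dim="time",
)
```

Both calls should return a DataArray of counts with a new bins dimension while leaving `x` untouched, i.e. the broadcast-over behaviour in the list above; weights and normalization would then layer on top of whichever pattern we pick.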
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
750985364 |