issue_comments: 1425198851
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/4610#issuecomment-1425198851 | https://api.github.com/repos/pydata/xarray/issues/4610 | 1425198851 | IC_kwDOAMm_X85U8s8D | 2448579 | 2023-02-10T05:38:13Z | 2023-02-10T05:38:31Z | MEMBER |
Nah, in my experience, the overhead is "factorizing" (pd.cut/np.digitize) or converting to integer bins, and then converting the nD problem to a 1D problem for bincount. numba doesn't really help.
3-4x is a lot bigger than I expected. I was hoping for under 2x because flox is more general.
I think the problem is
We could swap that out easily here: https://github.com/xarray-contrib/flox/blob/daebc868c13dad74a55d74f3e5d24e0f6bbbc118/flox/core.py#L473
I think the one special case to consider is binning datetimes, and that digitize and pd.cut have different defaults for
Ideally
As a workaround, we could replace
Yup. unlikely to help here. |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
750985364 |
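
For readers unfamiliar with the "factorize, then bincount" pattern the comment describes, here is a minimal sketch in plain NumPy. It is not flox's implementation: the function name `histogram2d_bincount` is hypothetical, and it assumes `np.digitize` with its default `right=False`, so a value exactly equal to the last edge is dropped rather than counted.

```python
import numpy as np


def histogram2d_bincount(x, y, x_edges, y_edges):
    """Hypothetical helper: 2D histogram via digitize + ravel_multi_index + bincount."""
    nx = len(x_edges) - 1
    ny = len(y_edges) - 1

    # "Factorize": map each value to an integer bin index.
    # np.digitize returns 1..n for values inside the edges, so shift to 0-based.
    ix = np.digitize(x, x_edges) - 1
    iy = np.digitize(y, y_edges) - 1

    # Drop points that fall outside the bin edges in either dimension.
    valid = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)

    # Convert the nD (here 2D) bin index into a single 1D index.
    flat = np.ravel_multi_index((ix[valid], iy[valid]), (nx, ny))

    # Count occurrences of each flat index, then restore the 2D shape.
    counts = np.bincount(flat, minlength=nx * ny)
    return counts.reshape(nx, ny)


x = np.array([0.1, 0.5, 0.9, 1.5])
y = np.array([0.2, 0.7, 1.1, 1.9])
edges = np.array([0.0, 1.0, 2.0])
result = histogram2d_bincount(x, y, edges, edges)
```

In this sketch the counting step is a single `np.bincount` call; the two `np.digitize` calls and the index flattening are the steps the comment identifies as the overhead.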