issues: 1295939038

id: 1295939038
node_id: I_kwDOAMm_X85NPnXe
number: 6758
title: simple groupby_bins 10x slower than numpy
user: 731499
state: closed
locked: 0
comments: 8
created_at: 2022-07-06T14:36:26Z
updated_at: 2022-07-07T08:26:26Z
closed_at: 2022-07-06T17:24:27Z
author_association: CONTRIBUTOR
assignee, milestone, active_lock_reason, draft, pull_request: (empty)

I am finding that groupby_bins is 10x slower than numpy in what I consider to be a simple implementation.

In the screenshot below, you can see me opening a netCDF file containing two variables that share the same single dimension. One variable is the latitude. I want to aggregate (sum) the other variable in bins of latitude. The xarray approach using groupby_bins takes ~314 ms per loop; the numpy approach takes less than 30 ms per loop.
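The screenshot is not reproduced in this record. As a rough sketch of the two approaches described above, and not the original code, the comparison might look like the following. The synthetic data, the variable names (`lat`, `data`, `obs`), the 10-degree bin edges, and the use of `np.histogram` for the numpy side are all assumptions made for illustration. Note also that `np.histogram` uses left-closed bins while `groupby_bins` uses right-closed bins by default, so values lying exactly on an edge may land in different bins.

```python
import numpy as np

# Synthetic stand-in for the netCDF file (sizes and names are assumptions).
rng = np.random.default_rng(0)
n = 100_000
lat = rng.uniform(-90.0, 90.0, size=n)     # latitude of each observation
data = rng.random(n)                       # variable to be summed per bin
bin_edges = np.arange(-90.0, 91.0, 10.0)   # 19 edges -> 18 bins of 10 degrees


def binned_sum_numpy(lat, data, edges):
    """Sum `data` per latitude bin using a weighted histogram."""
    sums, _ = np.histogram(lat, bins=edges, weights=data)
    return sums


def binned_sum_xarray(lat, data, edges):
    """Same aggregation via groupby_bins (xarray is imported lazily)."""
    import xarray as xr

    # Two variables sharing one dimension, as described in the issue.
    ds = xr.Dataset({"lat": ("obs", lat), "data": ("obs", data)})
    return ds["data"].groupby_bins(ds["lat"], bins=edges).sum().values


numpy_sums = binned_sum_numpy(lat, data, bin_edges)
```

Each function can be timed with `timeit` (or `%timeit` in a notebook) to reproduce the kind of per-loop comparison quoted above; the actual ratio will depend on the data size and the installed versions.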

I need to do this kind of computation on many more variables, on data spanning several years, and following the xarray approach leads to many more hours of processing :-/

Am I doing something wrong here?

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6758/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue