issues: 202423683

title: fast weighted sum
repo: pydata/xarray
number: 1224
id: 202423683
node_id: MDU6SXNzdWUyMDI0MjM2ODM=
user: 6213168
state: closed (completed)
locked: 0
comments: 5
created_at: 2017-01-23T00:29:19Z
updated_at: 2019-08-09T08:36:11Z
closed_at: 2019-08-09T08:36:11Z
author_association: MEMBER

In my project I'm struggling with weighted sums of 2000-4000 dask-based xarrays. The time to reach the final dask-based array, the size of the final dask dict, and the time to compute the actual result are horrendous.
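
As a rough illustration of why the dask dict gets so large, here is a back-of-the-envelope model of the key count. It is only a sketch under stated assumptions: every elementwise operation adds one task per chunk, all arrays share the same chunking, and the chunk count of 100 is made up for the example. The function names are hypothetical, and the "fused" variant is just one conceivable alternative for comparison, not necessarily what the linked gist does.

```python
def naive_weighted_sum_keys(n_arrays: int, n_chunks: int) -> int:
    """Estimate graph keys for sum(w_i * a_i) built one operation at a time.

    Assumption: each elementwise operation adds one task per chunk.
    """
    source = n_arrays * n_chunks       # one task per chunk to produce each input
    multiply = n_arrays * n_chunks     # w_i * a_i: one task per chunk per array
    add = (n_arrays - 1) * n_chunks    # pairwise accumulation of the products
    return source + multiply + add


def fused_weighted_sum_keys(n_arrays: int, n_chunks: int) -> int:
    """Estimate graph keys if a single task per output chunk did all the work.

    Assumption: one task consumes the matching chunk of every input array.
    """
    source = n_arrays * n_chunks       # the inputs still need one task per chunk
    combine = n_chunks                 # one weighted-sum task per output chunk
    return source + combine


if __name__ == "__main__":
    n_chunks = 100  # hypothetical chunk count per array
    for n_arrays in (2000, 4000):
        naive = naive_weighted_sum_keys(n_arrays, n_chunks)
        fused = fused_weighted_sum_keys(n_arrays, n_chunks)
        print(f"{n_arrays} arrays: naive={naive} keys, fused={fused} keys")
```

Under these assumptions the naive construction grows roughly as three keys per input chunk, which is why thousands of arrays quickly lead to millions of keys. Real graphs will differ, since dask may add further tasks and optimizes the graph before computing.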

So I wrote the code below which, as laborious as it may look, gives a performance boost nothing short of miraculous. At the bottom you'll find some benchmarks as well.

https://gist.github.com/crusaderky/62832a5ffc72ccb3e0954021b0996fdf

In my project, this shrank the final dask dict from 5.2 million keys to 3.3 million and cut the time required to define it by 30%.

I think it's generic enough to be a good addition to the core xarray module. Impressions?

