
issue_comments: 469861382


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/2799#issuecomment-469861382 https://api.github.com/repos/pydata/xarray/issues/2799 469861382 MDEyOklzc3VlQ29tbWVudDQ2OTg2MTM4Mg== 5635139 2019-03-05T21:19:31Z 2019-03-05T21:19:31Z MEMBER

To put the relative speed of numpy access into perspective, I found this insightful: https://jakevdp.github.io/blog/2012/08/08/memoryview-benchmarks/ (it's now a few years out of date, but I think the fundamentals still stand)

Pasted from there:

Summary

Here are the timing results we've seen above:

  • Python + numpy: 6510 ms
  • Cython + numpy: 668 ms
  • Cython + memviews (slicing): 22 ms
  • Cython + raw pointers: 2.47 ms
  • Cython + memviews (no slicing): 2.45 ms

So if we're running an inner loop on an array, accessing it through numpy from Python is about an order of magnitude slower than accessing it through numpy from Cython. That in turn is more than an order of magnitude slower than using memoryview slices, which is again about an order of magnitude slower than using raw pointers or unsliced memoryviews.
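As a quick sanity check on the "order of magnitude" steps, here is a minimal sketch that computes the ratio between each adjacent pair of timings quoted above. The numbers are copied from the blog post summary; the names `timings_ms` and `step_ratios` are made up for this illustration.

```python
# Timings quoted from the blog post summary, in milliseconds.
timings_ms = [
    ("Python + numpy", 6510.0),
    ("Cython + numpy", 668.0),
    ("Cython + memviews (slicing)", 22.0),
    ("Cython + raw pointers", 2.47),
    ("Cython + memviews (no slicing)", 2.45),
]


def step_ratios(rows):
    """Return (slower, faster, ratio) for each adjacent pair of rows."""
    return [
        (slow_name, fast_name, slow_ms / fast_ms)
        for (slow_name, slow_ms), (fast_name, fast_ms) in zip(rows, rows[1:])
    ]


ratios = step_ratios(timings_ms)
for slow_name, fast_name, ratio in ratios:
    print(f"{slow_name} -> {fast_name}: {ratio:.1f}x")
```

The first and third steps come out a little under 10x, the middle step is about 30x, and the last two rows are effectively tied.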

So - let's definitely speed xarray up (your benchmarks are excellent, thank you again, and I think you're right that there are opportunities for significant speedups). But where speed is paramount above all else, we shouldn't do any element-by-element access in Python, let alone use the niceties of xarray access.
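To make the cost of per-element access from Python concrete, here is a rough sketch comparing a Python-level loop over a numpy array with a single vectorized call. This is not the benchmark from the blog post (which computes pairwise distances); the summation task, the array size, and the function names are all assumptions chosen for illustration, and the absolute timings will vary by machine.

```python
import timeit

import numpy as np


def python_loop_sum(values):
    """Sum a 1-D array by indexing one element at a time from Python."""
    total = 0.0
    for i in range(values.shape[0]):
        total += values[i]  # each access goes through numpy's indexing machinery
    return total


def vectorized_sum(values):
    """Sum the same array with one call that loops in compiled code."""
    return float(values.sum())


# Hypothetical workload: 100,000 float64 values.
data = np.arange(100_000, dtype=np.float64)

loop_seconds = timeit.timeit(lambda: python_loop_sum(data), number=5)
vector_seconds = timeit.timeit(lambda: vectorized_sum(data), number=5)

print(f"Python loop: {loop_seconds:.4f} s for 5 runs")
print(f"Vectorized:  {vector_seconds:.4f} s for 5 runs")
```

Both functions return the same value; the difference is only where the loop runs. Access through xarray adds further Python-level work on top of each numpy access, which is why the inner loop is the wrong place for it when speed matters most.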

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  416962458