
**Comment on pydata/xarray#2922** — https://github.com/pydata/xarray/pull/2922#issuecomment-601612380
MEMBER · created 2020-03-20 · updated 2020-10-27

tl;dr: if someone knows how to do memory profiling with reasonable effort, this can still be changed.

It's certainly not too late to change the "backend" of the weighting functions. I once tried to profile the memory usage but gave up at some point (I think I would have needed to annotate a ton of functions, including in numpy).

@fujiisoup suggested using `xr.dot(a, b)` (instead of `(a * b).sum()`) to reduce part of the memory footprint. ~~This is done; however, it comes at a performance penalty, so an alternative code path might actually be beneficial.~~ (edit: I now think `xr.dot` is actually faster, except for very small arrays).
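For concreteness, a minimal sketch of the two equivalent formulations (the array shapes are arbitrary, and the `dims` keyword spelling matches the xarray releases of the time; newer versions spell it `dim`):

```python
import numpy as np
import xarray as xr

# hypothetical example data: a 2D DataArray and 1D weights along "time"
da = xr.DataArray(np.random.rand(1000, 100), dims=("time", "x"))
weights = xr.DataArray(np.random.rand(1000), dims="time")

# (a * b).sum(): materializes the full product array before reducing
sum_mul = (da * weights).sum(dim="time")

# xr.dot(): delegates to np.einsum, avoiding the intermediate product
sum_dot = xr.dot(da, weights, dims="time")

xr.testing.assert_allclose(sum_mul, sum_dot)
```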

Also, `mask` is an array of dtype `bool` (1 byte per element instead of 8 for `float64`), which should help with memory.
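As an illustration (a sketch, not the actual internal code): where `da` is NaN the weight should not count towards the sum of weights, and `np.einsum` promotes the boolean mask during the contraction, so no float copy of the mask is needed up front.

```python
# continuing the example above: a bool mask marking valid entries
mask = da.notnull()  # dtype bool

# sum of the weights that actually contribute (used for normalization)
sum_of_weights = xr.dot(mask, weights, dims="time")
```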

I think it should not be very difficult to write something that can be passed to `apply_ufunc`, probably similar to:

https://github.com/pydata/xarray/blob/e8a284f341645a63a4d83676a6b268394c721bbc/xarray/tests/test_weighted.py#L161
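A minimal sketch of such a wrapper, assuming a hypothetical numpy-level kernel `_weighted_sum` (the names and keyword choices are illustrative, not the actual implementation):

```python
import numpy as np
import xarray as xr

def _weighted_sum(data, weights):
    # hypothetical numpy-level kernel: reduce over the (last) core axis
    return np.nansum(data * weights, axis=-1)

def weighted_sum(da, weights, dim):
    # hand the kernel to apply_ufunc; ``dim`` becomes the core dimension
    # that is moved to the last axis and consumed by the kernel
    return xr.apply_ufunc(
        _weighted_sum,
        da,
        weights,
        input_core_dims=[[dim], [dim]],
        dask="parallelized",      # lets the same code path work on dask arrays
        output_dtypes=[da.dtype],
    )
```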

So there would be three possibilities:

1. the current implementation (using `xr.dot(a, b)`)
2. something similar to `expected_weighted` (using `(a * b).sum()`)
3. `xr.apply_ufunc(expected_weighted, a, b, ...)`

I assume (2) is fastest but has the largest memory footprint; I cannot tell about (1) and (3). A rough timing comparison is sketched below.
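Which of the three wins would have to be measured; here is a rough timing sketch (array sizes and repeat counts are arbitrary, and `weighted_sum` is the hypothetical `apply_ufunc` wrapper sketched above):

```python
import timeit

import numpy as np
import xarray as xr

da_big = xr.DataArray(np.random.rand(2000, 2000), dims=("time", "x"))
w_big = xr.DataArray(np.random.rand(2000), dims="time")

# (1) current implementation: xr.dot
t1 = timeit.timeit(lambda: xr.dot(da_big, w_big, dims="time"), number=20)
# (2) multiply-then-sum, like expected_weighted
t2 = timeit.timeit(lambda: (da_big * w_big).sum(dim="time"), number=20)
# (3) the apply_ufunc wrapper sketched above
t3 = timeit.timeit(lambda: weighted_sum(da_big, w_big, "time"), number=20)

print(f"(1) xr.dot:       {t1:.3f} s")
print(f"(2) mul-then-sum: {t2:.3f} s")
print(f"(3) apply_ufunc:  {t3:.3f} s")
```

This only measures runtime; comparing peak memory is the part the comment says is still open.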
