issue_comments: 601496897


html_url: https://github.com/pydata/xarray/pull/2922#issuecomment-601496897
issue_url: https://api.github.com/repos/pydata/xarray/issues/2922
id: 601496897
node_id: MDEyOklzc3VlQ29tbWVudDYwMTQ5Njg5Nw==
user: 7441788
created_at: 2020-03-20T02:11:53Z
updated_at: 2020-03-20T02:12:24Z
author_association: CONTRIBUTOR

I realize this is a bit late, but I'm still concerned about memory usage, specifically in https://github.com/pydata/xarray/blob/master/xarray/core/weighted.py#L130 and https://github.com/pydata/xarray/blob/master/xarray/core/weighted.py#L143. If da.sizes = {'dim_0': 100000, 'dim_1': 100000}, the two lines above will cause da.weighted(weights).mean('dim_0') to create two simultaneous temporary 100000x100000 arrays, which could be problematic.

I would have implemented this using apply_ufunc, so that these temporaries are created only over as small an array as absolutely necessary -- in this case just one of size sizes['dim_0'] = 100000. (Much as I would like to, I'm afraid I'm not able to contribute code.) Of course this won't help when summing over all dimensions, but we might as well minimize memory usage in the cases where we can, even if not in all.
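To illustrate the memory concern in plain NumPy (a hypothetical small-scale sketch, not xarray's actual implementation; the array sizes and variable names here are made up): broadcasting the weights against the full array materializes a temporary the size of `da`, whereas fusing the multiply and the reduction into a single matrix-vector product only ever allocates the reduced result. An `apply_ufunc`-based implementation could dispatch to something like the fused form.

```python
import numpy as np

# Small stand-ins for the 100000x100000 case discussed above.
rng = np.random.default_rng(0)
da = rng.random((1000, 50))   # dims: ('dim_0', 'dim_1')
weights = rng.random(1000)    # weights along 'dim_0'

# Naive approach: the broadcast product materializes a full
# da-sized temporary before the reduction runs.
naive = (da * weights[:, None]).sum(axis=0) / weights.sum()

# Fused alternative: a matrix-vector product reduces over 'dim_0'
# without ever allocating the broadcast temporary; only the
# length-50 result is created.
fused = weights @ da / weights.sum()

print(np.allclose(naive, fused))  # → True
```

Both expressions compute the same weighted mean over `dim_0`; they differ only in peak memory, which is exactly the distinction the comment is drawing.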

reactions: none (total_count 0)
issue: 437765416