
issue_comments


5 rows where issue = 496809167 sorted by updated_at descending


user 3

  • shoyer 2
  • dcherian 2
  • jbphyswx 1

author_association 2

  • MEMBER 4
  • NONE 1

issue 1

  • Memory usage of `da.rolling().construct` · 5 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
779892262 https://github.com/pydata/xarray/issues/3332#issuecomment-779892262 https://api.github.com/repos/pydata/xarray/issues/3332 MDEyOklzc3VlQ29tbWVudDc3OTg5MjI2Mg== dcherian 2448579 2021-02-16T14:59:11Z 2021-02-16T14:59:11Z MEMBER

So it pads first and then takes a view, which means the copy that gets made is of the original (padded) array, not of the strided array.

https://github.com/pydata/xarray/blob/735a3590ea4df70e1e5be729162df2f8774b3879/xarray/core/nputils.py#L149-L151
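A minimal NumPy sketch of the "pad then view" pattern described above. This is an illustration only, not the code at the linked lines: the use of `np.pad` and `sliding_window_view` (NumPy 1.20+) and the variable names are assumptions made for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(6, dtype=float)
window = 3

# Step 1: padding allocates a new array (a copy of x plus the fill values).
padded = np.pad(x, (window - 1, 0), mode="constant", constant_values=np.nan)

# Step 2: the windowed result is a strided view of that padded copy.
windows = sliding_window_view(padded, window)

# The view shares memory with the padded copy, not with the original array.
views_padded = np.shares_memory(windows, padded)
views_original = np.shares_memory(windows, x)
```

In this sketch the extra allocation is the size of the padded input, not `len(x) * window`. The window dimension only costs real memory once something forces the strided view to be copied.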

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory usage of `da.rolling().construct` 496809167
705098106 https://github.com/pydata/xarray/issues/3332#issuecomment-705098106 https://api.github.com/repos/pydata/xarray/issues/3332 MDEyOklzc3VlQ29tbWVudDcwNTA5ODEwNg== shoyer 1217238 2020-10-07T17:54:32Z 2020-10-07T17:54:32Z MEMBER

The loop via slicing is not a terrible option. The trick construct() uses with views only really makes sense with NumPy arrays, not with dask.

There are also true streaming moving window algorithms that work very well for computing various statistics (e.g., mean and variance). These are implemented in bottleneck (e.g., move_mean) and could be wrapped in xarray if desired for methods like rolling(...).mean(). These aren't implemented in dask yet, though.
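To illustrate what a streaming moving-window algorithm means here, the following is a rough pure-Python sketch of a running-sum moving mean. It is not bottleneck's implementation, and the function name `streaming_move_mean` is hypothetical. It assumes a 1-D input with no NaNs and leaves the first `window - 1` outputs as NaN.

```python
import numpy as np

def streaming_move_mean(values, window):
    """Trailing moving mean using a running sum (hypothetical helper).

    Only one scalar accumulator is kept, so no window dimension is
    ever materialized.
    """
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape[0], np.nan)
    running = 0.0
    for i, v in enumerate(values):
        running += v                      # add the newest element
        if i >= window:
            running -= values[i - window]  # drop the element that left the window
        if i >= window - 1:
            out[i] = running / window
    return out

result = streaming_move_mean(np.arange(6.0), 3)
```

Memory use is O(n) for the output plus a constant for the accumulator, independent of the window size.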

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory usage of `da.rolling().construct` 496809167
705068971 https://github.com/pydata/xarray/issues/3332#issuecomment-705068971 https://api.github.com/repos/pydata/xarray/issues/3332 MDEyOklzc3VlQ29tbWVudDcwNTA2ODk3MQ== jbphyswx 29147682 2020-10-07T17:00:35Z 2020-10-07T17:00:35Z NONE

Is there any way to get around this? The window dimension, combined with the "For window size x, every chunk should be larger than x//2" requirement, means that for a large moving window I'm getting O(100GB) chunks that do not fit in memory at compute time. I can, of course, rechunk other dimensions, but that is expensive and substantially slower. I also suspect this becomes practically infeasible on machines that have little memory. Regardless, mandatory O(n^2) memory usage with window size seems less than ideal.

My workaround has been to implement my own slicing via a for loop and then call reduction operations on the resulting dask arrays as normal. Perhaps I missed something along the way, but I couldn't find anything in open or past issues to help resolve this. Thanks!
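A sketch of what such a slicing loop might look like, shown with NumPy for simplicity. The function name `rolling_mean_by_slicing` is hypothetical and this is not the commenter's actual code. It only uses slicing, addition, and division, so the same pattern should in principle carry over to dask arrays, though that is an assumption that has not been checked here.

```python
import numpy as np

def rolling_mean_by_slicing(arr, window):
    """Rolling mean along axis 0 over 'valid' positions (hypothetical helper).

    Sums `window` shifted slices instead of building an array with an
    extra window dimension.
    """
    n = arr.shape[0] - window + 1
    if n <= 0:
        raise ValueError("window is longer than the array")
    total = arr[0:n]
    for offset in range(1, window):
        # Each step adds one shifted slice; the result stays the size of the output.
        total = total + arr[offset:offset + n]
    return total / window

means = rolling_mean_by_slicing(np.arange(6.0), 3)
```

The trade-off is `window` passes over the data in exchange for temporaries no larger than the output.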

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory usage of `da.rolling().construct` 496809167
534709955 https://github.com/pydata/xarray/issues/3332#issuecomment-534709955 https://api.github.com/repos/pydata/xarray/issues/3332 MDEyOklzc3VlQ29tbWVudDUzNDcwOTk1NQ== shoyer 1217238 2019-09-24T19:21:22Z 2019-09-24T19:21:22Z MEMBER

It uses a view for allocating the initial result, but I think applying boundary conditions means that we end up doing a copy.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory usage of `da.rolling().construct` 496809167
533908429 https://github.com/pydata/xarray/issues/3332#issuecomment-533908429 https://api.github.com/repos/pydata/xarray/issues/3332 MDEyOklzc3VlQ29tbWVudDUzMzkwODQyOQ== dcherian 2448579 2019-09-22T19:02:07Z 2019-09-22T19:02:07Z MEMBER

It should be returning a view.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Memory usage of `da.rolling().construct` 496809167


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);