
issue_comments


2 rows where issue = 202423683 and user = 1217238 sorted by updated_at descending

id: 274380660
html_url: https://github.com/pydata/xarray/issues/1224#issuecomment-274380660
issue_url: https://api.github.com/repos/pydata/xarray/issues/1224
node_id: MDEyOklzc3VlQ29tbWVudDI3NDM4MDY2MA==
user: shoyer (1217238)
created_at: 2017-01-23T02:04:23Z
updated_at: 2017-01-23T02:04:23Z
author_association: MEMBER
body:

Was concat slow at graph construction or compute time?

On Sun, Jan 22, 2017 at 6:02 PM, crusaderky (notifications@github.com) wrote:

> (arrays * weights).sum('stacked') was my first attempt. It performed
> considerably worse than sum(a * w for a, w in zip(arrays, weights)) -
> mostly because xarray.concat() is not terribly performant (I did not
> look deeper into it).
>
> I did not try dask.array.sum() - worth some playing with.


reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: fast weighted sum (202423683)
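The exchange above asks whether the cost sits in graph construction or in compute. As a minimal sketch (not from the thread, with hypothetical shapes and chunking), the two approaches being compared could be timed phase by phase like this:

import dask.array as da
import xarray as xr

# Hypothetical inputs: ten lazy 1000x1000 arrays and scalar weights.
arrays = [xr.DataArray(da.ones((1000, 1000), chunks=500), dims=('x', 'y'))
          for _ in range(10)]
weights = range(10)

# Approach 1: builtin sum over pairwise products. Building lazy1 is the
# graph-construction phase; .compute() is the compute phase.
lazy1 = sum(a * w for a, w in zip(arrays, weights))
result1 = lazy1.compute()

# Approach 2: concatenate along a new dimension, then reduce. The
# xarray.concat() call is the step reported above as slow.
stacked = xr.concat(arrays, dim='stacked')
w = xr.DataArray(list(weights), dims='stacked')
lazy2 = (stacked * w).sum('stacked')
result2 = lazy2.compute()

Timing the construction of lazy1 and lazy2 separately from the .compute() calls would separate concat's graph-building overhead from the cost of the reduction itself.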
id: 274375810
html_url: https://github.com/pydata/xarray/issues/1224#issuecomment-274375810
issue_url: https://api.github.com/repos/pydata/xarray/issues/1224
node_id: MDEyOklzc3VlQ29tbWVudDI3NDM3NTgxMA==
user: shoyer (1217238)
created_at: 2017-01-23T01:09:49Z
updated_at: 2017-01-23T01:09:49Z
author_association: MEMBER
body:

Interesting -- thanks for sharing! I am interested in performance improvements, but also a little reluctant to add specialized optimizations directly to xarray.

You write that this is equivalent to sum(a * w for a, w in zip(arrays, weights)). How does this compare to stacking and doing the sum in xarray, e.g., (arrays * weights).sum('stacked'), where arrays and weights are now DataArray objects with a 'stacked' dimension? Or maybe arrays.dot(weights)?

Using vectorized operations feels a bit more idiomatic (though also maybe more verbose). It may also be more performant. Note that the builtin sum is not optimized well by dask, because it is basically equivalent to a loop:

def sum(xs):
    result = 0
    for x in xs:
        result += x
    return result

In contrast, dask.array.sum builds up a tree, so it can do the sum in parallel.

There has also been discussion in https://github.com/pydata/xarray/issues/422 about adding a dedicated method for weighted means.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: fast weighted sum (202423683)
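To illustrate the loop-versus-tree point from the comment above: Python's builtin sum chains the additions serially, while dask.array.sum over a stacked axis reduces pairwise, so partial sums can run in parallel. A hedged sketch with hypothetical array names and sizes:

import dask.array as da

xs = [da.ones((1000,), chunks=250) for _ in range(8)]

# Builtin sum is equivalent to the loop shown in the comment:
# ((((x0 + x1) + x2) + x3) + ...), a serial chain of additions whose
# graph depth grows with len(xs).
chained = sum(xs)

# dask.array.sum over a new axis performs a tree reduction, combining
# partial sums so that independent branches can execute in parallel.
tree = da.stack(xs).sum(axis=0)

assert (chained.compute() == tree.compute()).all()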


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
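For reference, the filtered view above ("2 rows where issue = 202423683 and user = 1217238 sorted by updated_at descending") corresponds to a straightforward query against this schema. A sketch using Python's stdlib sqlite3, with the database filename as an assumption:

import sqlite3

conn = sqlite3.connect("github.db")  # hypothetical filename for this database
rows = conn.execute(
    """
    SELECT id, user, created_at, updated_at, body
    FROM issue_comments
    WHERE issue = 202423683 AND user = 1217238
    ORDER BY updated_at DESC
    """
).fetchall()
for comment_id, user, created, updated, body in rows:
    print(comment_id, updated)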