issue_comments

2 rows where issue = 168469112 and user = 1217238 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
236737435 https://github.com/pydata/xarray/issues/926#issuecomment-236737435 https://api.github.com/repos/pydata/xarray/issues/926 MDEyOklzc3VlQ29tbWVudDIzNjczNzQzNQ== shoyer 1217238 2016-08-01T23:18:05Z 2016-08-01T23:18:18Z MEMBER

Given that we will need to be aware of the chunks layout in order to stack correctly by block, it might make sense to actually put all this logic in xarray itself, by using dask.array.map_blocks with np.reshape.

I'm not sure when I'll have the chance to work on this, so it might be a good relatively self-contained project for a new contributor.
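A minimal sketch of the block-wise idea from the comment above, assuming the simplest case where only the first dimension is chunked. The helper name `flatten_block` and the example shapes are illustrative and are not taken from xarray or dask. In this restricted case, reshaping each block independently gives the same element order as a C-order flatten of the whole array:

```python
import numpy as np
import dask.array as da

# Example array: 6 x 4, chunked only along the first dimension (3 blocks of 2 x 4).
x = da.from_array(np.arange(24).reshape(6, 4), chunks=(2, 4))


def flatten_block(block):
    # Hypothetical helper: collapse one 2-D block into a 1-D block.
    return np.reshape(block, -1)


# Each (rows, 4) block becomes a 1-D block of length rows * 4.
out_chunks = (tuple(c * x.shape[1] for c in x.chunks[0]),)

stacked = x.map_blocks(
    flatten_block,
    drop_axis=1,        # the second dimension disappears from the output
    chunks=out_chunks,  # tell dask the new block sizes
    dtype=x.dtype,
)

result = stacked.compute()
```

When more than one dimension is chunked, a per-block reshape no longer matches C order, which is why the chunk layout has to be taken into account.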

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  stack() on dask array produces inefficient chunking 168469112
236389759 https://github.com/pydata/xarray/issues/926#issuecomment-236389759 https://api.github.com/repos/pydata/xarray/issues/926 MDEyOklzc3VlQ29tbWVudDIzNjM4OTc1OQ== shoyer 1217238 2016-07-30T21:02:09Z 2016-07-30T21:02:09Z MEMBER

Yes, this is unfortunate, but dask.array can't efficiently flatten (in C order) along chunked dimensions (see https://github.com/dask/dask/pull/758 for discussion).

One thing we could do is add an option to dask to flatten arrays in "block contiguous" order. Then we could add an option (or maybe even change the default) to xarray's stack to do block contiguous stacking. If you're interested in working on this, the place to start is the code for dask.array's ravel operation (see the link above for details).
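To illustrate what "block contiguous" order could mean, here is a small NumPy-only sketch. The function `ravel_block_contiguous` is hypothetical and is not an existing dask or xarray API. It flattens a 2-D array one block at a time, so the elements of each block stay adjacent in the output, unlike a C-order flatten:

```python
import numpy as np


def ravel_block_contiguous(arr, block_shape):
    """Flatten a 2-D array block by block (hypothetical helper).

    Blocks are visited in row-major order, and each block is
    flattened in C order before being appended to the result.
    """
    rows, cols = arr.shape
    block_rows, block_cols = block_shape
    pieces = []
    for i in range(0, rows, block_rows):
        for j in range(0, cols, block_cols):
            pieces.append(arr[i:i + block_rows, j:j + block_cols].ravel())
    return np.concatenate(pieces)


a = np.arange(16).reshape(4, 4)

c_order = a.ravel()                              # interleaves rows from different blocks
block_order = ravel_block_contiguous(a, (2, 2))  # keeps each 2 x 2 block together
```

In C order, the first output row draws from two different blocks, so a chunked flatten needs data from several chunks per output chunk. In block contiguous order, each output segment depends on exactly one input block.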

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  stack() on dask array produces inefficient chunking 168469112

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 193.498ms · About: xarray-datasette