issue_comments

3 rows where author_association = "MEMBER" and issue = 172291585 sorted by updated_at descending


457281032 · shoyer (MEMBER) · 2019-01-24T17:19:30Z
https://github.com/pydata/xarray/issues/979#issuecomment-457281032

I think dask.array handles differing chunk sizes better these days, so perhaps this is no longer necessary.
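As a quick illustration of that point (a sketch assuming a recent dask is installed), elementwise operations now unify mismatched chunks on the fly:

```python
import dask.array as da

# Same 10 elements, two different chunkings.
a = da.ones(10, chunks=5)   # blocks of 5
b = da.ones(10, chunks=2)   # blocks of 2

# Elementwise addition rechunks to a common blocking automatically,
# so mismatched chunks no longer need manual alignment.
c = a + b
print(c.chunks)                    # one unified chunking covering all 10 elements
print(float(c.sum().compute()))    # → 20.0
```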

Reactions: none
264510264 · clarkfitzg (MEMBER) · 2016-12-02T17:23:46Z
https://github.com/pydata/xarray/issues/979#issuecomment-264510264

As an end user, it would be really nice not to have to worry about chunks at all. I'd like to write the same code in xarray against NumPy and have it do the right thing with dask transparently.

It seems like dask is moving in this direction (see Automatic blocksize for read_csv dask/dask#1147).

Agree with @shoyer that these features belong in dask.

Reactions: +1 × 1
241232491 · shoyer (MEMBER) · 2016-08-21T00:55:11Z
https://github.com/pydata/xarray/issues/979#issuecomment-241232491

I agree that it would make sense for xarray.align to unify chunks in dask arrays, but the documentation is actually a little out of date here: dask.array does now do some minimal automatic rechunking (see unify_chunks for details). Also, dask array functions, at least those that use elemwise, do automatically coerce NumPy arrays into dask arrays. So adding a tiny numpy array to a huge dask array does currently do the right thing.

As you can see, the automatic rechunking algorithm that dask.array currently uses is super simple: it only reconciles chunks when one array is unchunked. I'm certainly open to more sophisticated options for automatic rechunking (see https://github.com/dask/dask/issues/111), but either way I'd prefer to keep as much of this logic on the dask side as possible. Ideally, we'd simply call dask.array.unify_chunks passing in the named dimensions for each array.
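The boundary-union idea behind unify_chunks can be sketched in a few lines of plain Python (a toy 1-D model under a hypothetical name, not dask's actual implementation): every block edge from every input is kept, so each input's chunks can be reassembled from the unified ones.

```python
from itertools import accumulate

def unify_chunks_1d(*chunk_tuples):
    """Toy sketch of 1-D chunk unification: take the union of all
    block boundaries so every input chunk edge is preserved."""
    boundaries = set()
    for chunks in chunk_tuples:
        boundaries.update(accumulate(chunks))  # running totals = block edges
    edges = [0] + sorted(boundaries)
    return tuple(b - a for a, b in zip(edges, edges[1:]))

# One array chunked as 5+5, another as 2+2+2+2+2; the unified
# chunking honours the boundaries of both.
print(unify_chunks_1d((5, 5), (2, 2, 2, 2, 2)))  # → (2, 2, 1, 1, 2, 2)
```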

Reactions: none

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
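The schema above, together with the row filter described at the top of the page, can be exercised end to end with the standard-library sqlite3 module. This is a sketch: the inserted rows are abbreviated stand-ins for the three comments shown, and the REFERENCES clauses are omitted since the users and issues tables are not created here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
""")

# Abbreviated stand-ins for the three comments on this page.
rows = [
    (457281032, 1217238, "2019-01-24T17:19:30Z", "MEMBER", 172291585),
    (264510264, 5356122, "2016-12-02T17:23:46Z", "MEMBER", 172291585),
    (241232491, 1217238, "2016-08-21T00:55:11Z", "MEMBER", 172291585),
]
conn.executemany(
    "INSERT INTO issue_comments (id, user, updated_at, author_association, issue)"
    " VALUES (?, ?, ?, ?, ?)",
    rows,
)

# The filter behind: 3 rows where author_association = "MEMBER"
# and issue = 172291585 sorted by updated_at descending.
ids = [r[0] for r in conn.execute(
    "SELECT id FROM issue_comments"
    " WHERE author_association = 'MEMBER' AND issue = 172291585"
    " ORDER BY updated_at DESC"
)]
print(ids)  # → [457281032, 264510264, 241232491]
```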
Powered by Datasette · About: xarray-datasette