home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

1 row where author_association = "MEMBER", issue = 397063221 and user = 1217238 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • shoyer · 1 ✖

issue 1

  • open_mfdataset in v.0.11.1 is very slow · 1 ✖

author_association 1

  • MEMBER · 1 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
454351420 https://github.com/pydata/xarray/issues/2662#issuecomment-454351420 https://api.github.com/repos/pydata/xarray/issues/2662 MDEyOklzc3VlQ29tbWVudDQ1NDM1MTQyMA== shoyer 1217238 2019-01-15T10:56:03Z 2019-01-15T10:56:03Z MEMBER

@malmans2 thanks for this reproducible test case!

From xarray's perspective, the difference is the order in which the arrays are concatenated/processed. This is determined by sorting the (globbed) file names: ``` In [16]: sorted(glob.glob('rep/.nc')) Out[16]: ['rep0/dsA0.nc', 'rep0/dsB0.nc', 'rep1/dsA1.nc', 'rep1/dsB1.nc']

In [17]: sorted(glob.glob('*.nc')) Out[17]: ['dsA0.nc', 'dsA1.nc', 'dsB0.nc', 'dsB1.nc'] ```

It appears that the slow case [A0, B0, A1, B1] now requires computing data with dask, whereas [A0, A1, B0, B1] does not.

I suspect the issue is that we're now using some different combination of merge/concat. In particular it looks like the compute is being triggered from within merge. This sort of makes sense: if we're using merge instead of concat for joining along the dimension T, that is super slow because that goes through a path that checks arrays for conflicting values by loading data into memory (even though in this case that isn't possible, because the original coordinates were not overlapping).

We could (and should) optimize this path in merge to avoid eagerly loading data, but the immediate fix here is probably to make sure we're using concat instead of merge.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  open_mfdataset in v.0.11.1 is very slow 397063221

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 229.311ms · About: xarray-datasette