issue_comments


1 row where issue = 276688437 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
346912338 https://github.com/pydata/xarray/issues/1742#issuecomment-346912338 https://api.github.com/repos/pydata/xarray/issues/1742 MDEyOklzc3VlQ29tbWVudDM0NjkxMjMzOA== shoyer 1217238 2017-11-25T01:49:12Z 2017-11-25T01:49:12Z MEMBER

I also notice a roughly 50% slow-down for this indexing behavior in 0.10 relative to 0.9.6 on my laptop. For the smaller loop:

```
xarray 0.9.6 -> 7.79 ms ± 145 µs
xarray 0.10 -> 11.6 ms ± 102 µs
```

The difference appears to be that in 0.9.6, we index a numpy array with two integer numpy arrays, whereas in 0.10 we do indexing with a slice object and a numpy array. In general, we would expect using the slice to be faster, but for whatever reason, using the slice is slower here. Here is the same behavior illustrated with just NumPy:

```
In [7]: x = np.random.randn(2500, 2000)

In [8]: y = (np.random.randn(2000) > -0.2).astype(bool)

In [12]: %timeit x[50:1250, np.flatnonzero(y)]
10.6 ms ± 73.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [13]: %timeit x[np.arange(50, 1250)[:, np.newaxis], np.flatnonzero(y)[np.newaxis, :]]
7.17 ms ± 42.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

So potentially it's worth trying to get this optimization upstream into NumPy.
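For anyone who wants to reproduce the NumPy comparison outside IPython, here is a minimal sketch. It checks that the two index forms select the same values and then times each with `timeit`. The function names, the fixed seed, and the repeat counts are illustrative choices, not part of the original benchmark, and the timings will vary by machine and NumPy version.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)  # fixed seed is an illustrative choice
x = rng.standard_normal((2500, 2000))
y = rng.standard_normal(2000) > -0.2  # boolean mask over the second axis
cols = np.flatnonzero(y)  # integer positions where the mask is True


def index_with_slice(a):
    # Slice on axis 0, integer array on axis 1.
    return a[50:1250, cols]


def index_with_arrays(a):
    # Two integer arrays that broadcast to a (1200, len(cols)) index grid.
    rows = np.arange(50, 1250)
    return a[rows[:, np.newaxis], cols[np.newaxis, :]]


# Both forms should pick out exactly the same elements.
same = np.array_equal(index_with_slice(x), index_with_arrays(x))
print("same result:", same)

for fn in (index_with_slice, index_with_arrays):
    # Best of 3 repeats, 10 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(x), number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```

The equality check matters because the comparison is only meaningful if both expressions return the same array; only the time taken should differ.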

We could potentially switch the internal implementation of xarray back, but I'm not sure it would be a clear win. For example, we see an opposite change in performance for indexing on the transposed arrays:

```
xarray 0.9.6

In [3]: %timeit ds[0].T.sel(dim_0=slice(50, 1250), dim_1=mask)
16.1 ms ± 600 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

xarray 0.10

In [3]: %timeit ds[0].T.sel(dim_0=slice(50, 1250), dim_1=mask)
12.3 ms ± 459 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```
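The transposed case can also be approximated in plain NumPy. The sketch below assumes (this is not confirmed by the timings above) that the transposed selection reduces to an integer array on the first axis and a slice on the second axis of a transposed view. The array shapes mirror the earlier NumPy example, and all names here are hypothetical. No particular ordering of the two timings is implied.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)  # fixed seed is an illustrative choice
x = rng.standard_normal((2500, 2000))
mask = rng.standard_normal(2000) > -0.2
cols = np.flatnonzero(mask)  # integer positions where the mask is True
rows = np.arange(50, 1250)

xt = x.T  # transposed view with shape (2000, 2500); no data is copied


def transposed_with_slice(a):
    # Integer array on axis 0, slice on axis 1.
    return a[cols, 50:1250]


def transposed_with_arrays(a):
    # Two integer arrays that broadcast to a (len(cols), 1200) index grid.
    return a[cols[:, np.newaxis], rows[np.newaxis, :]]


# Both forms should pick out exactly the same elements.
same_transposed = np.array_equal(
    transposed_with_slice(xt), transposed_with_arrays(xt)
)
print("same result:", same_transposed)

for fn in (transposed_with_slice, transposed_with_arrays):
    # Best of 3 repeats, 10 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(xt), number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```

Running this alongside the non-transposed version gives a rough way to see whether the relative cost of the two index forms depends on memory layout on a given machine.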

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Performance regression when selecting 276688437


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);