issue_comments

2 rows where author_association = "CONTRIBUTOR", issue = 236347050 and user = 12229877 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
308926818 https://github.com/pydata/xarray/pull/1457#issuecomment-308926818 https://api.github.com/repos/pydata/xarray/issues/1457 MDEyOklzc3VlQ29tbWVudDMwODkyNjgxOA== Zac-HD 12229877 2017-06-16T03:57:57Z 2017-06-16T03:57:57Z CONTRIBUTOR

The tests for Hypothesis take almost twice as long to run on Travis at certain times of day, so I certainly wouldn't use it for benchmarking anything!

I'm also concerned that a dedicated benchmarking machine may lead to software (accidentally!) optimized for a particular architecture or balance of machine resources without due consideration. Maybe @wesm could investigate fault injection to (e.g.) slow down disk access or add latency for some sets of benchmarks?
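One way to picture the fault-injection idea is a wrapper that adds artificial latency to every read, so the same benchmark can be timed against "fast" and "slow" storage on a single machine. This is only a minimal sketch and not something from the pull request: `SlowReader` and `time_read` are hypothetical names, and a fixed sleep per call is a crude stand-in for real disk or network behaviour.

```python
import io
import time


class SlowReader:
    """Wrap a binary file-like object and sleep before every read.

    Hypothetical helper for illustration; latency_s is the delay
    injected per read call, in seconds.
    """

    def __init__(self, raw, latency_s):
        self._raw = raw
        self._latency_s = latency_s

    def read(self, size=-1):
        time.sleep(self._latency_s)  # injected fault: extra latency
        return self._raw.read(size)


def time_read(reader, chunk_size):
    """Read the stream to exhaustion in chunks and return elapsed seconds."""
    start = time.perf_counter()
    while reader.read(chunk_size):
        pass
    return time.perf_counter() - start


payload = b"x" * 4096
fast = time_read(io.BytesIO(payload), 1024)
slow = time_read(SlowReader(io.BytesIO(payload), 0.01), 1024)
print(f"no injected latency: {fast:.4f}s, 10 ms per read: {slow:.4f}s")
```

Running one benchmark under several injected latencies would give a rough view of how sensitive a result is to IO speed, without needing a separate machine for each configuration.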

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/benchmark 236347050
308923548 https://github.com/pydata/xarray/pull/1457#issuecomment-308923548 https://api.github.com/repos/pydata/xarray/issues/1457 MDEyOklzc3VlQ29tbWVudDMwODkyMzU0OA== Zac-HD 12229877 2017-06-16T03:29:12Z 2017-06-16T03:29:12Z CONTRIBUTOR

I like the idea of benchmarks, but have some serious concerns. For Dask and IO-bound work in general, benchmark results will vary widely depending on the hardware and (if relevant) network properties. Results will be noncomparable between SSD and HDD, local and remote network access, and in general depend heavily on the specific IO patterns and storage/compute relationship of the computer.

This isn't a reason not to benchmark though, just a call for very cautious interpretation - it's clearly useful to catch some of the subtle-but-pathological performance problems that have cropped up. In short, I think benchmarks should have a very clear warnings section in the documentation, and no decision should be taken to change code without benchmarking on a variety of computers (SSD/HDD, PC/cluster, local/remote data...).

Also, JSON cannot include comments, and there are a number of entries that you need to update, but that's a passing concern.
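The point about comments can be checked with Python's standard `json` module, which rejects `//`-style comments. A minimal sketch, assuming a strict JSON parser; the `"version"` key is a hypothetical placeholder and is not taken from the pull request's configuration file:

```python
import json


def parses_as_json(text):
    """Return True if text is valid JSON for Python's strict parser."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical config fragments, for illustration only.
with_comment = """{
    // a comment like this is not valid JSON
    "version": 1
}"""
without_comment = """{
    "version": 1
}"""

print(parses_as_json(with_comment))
print(parses_as_json(without_comment))
```

Some tools read such files with a more lenient parser, so whether a comment actually breaks anything depends on the consumer of the file.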

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/benchmark 236347050


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
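The filter described at the top of the page (author_association = "CONTRIBUTOR", issue = 236347050, user = 12229877, sorted by updated_at descending) can be reproduced against this schema with Python's built-in `sqlite3` module. This is a sketch under assumptions: it uses an in-memory database, inserts only the two rows shown above with most columns left empty, and `contributor_comments` is a hypothetical helper name.

```python
import sqlite3

SCHEMA = """
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
"""


def contributor_comments(conn, issue_id, user_id):
    """Return (id, updated_at) pairs for one user's CONTRIBUTOR comments."""
    return conn.execute(
        "SELECT id, updated_at FROM issue_comments "
        "WHERE author_association = ? AND issue = ? AND [user] = ? "
        "ORDER BY updated_at DESC",
        ("CONTRIBUTOR", issue_id, user_id),
    ).fetchall()


conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless asked to, so the
# referenced users and issues tables are not needed for this sketch.
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO issue_comments "
    "(id, [user], created_at, updated_at, author_association, issue) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (308923548, 12229877, "2017-06-16T03:29:12Z",
         "2017-06-16T03:29:12Z", "CONTRIBUTOR", 236347050),
        (308926818, 12229877, "2017-06-16T03:57:57Z",
         "2017-06-16T03:57:57Z", "CONTRIBUTOR", 236347050),
    ],
)

for comment_id, updated_at in contributor_comments(conn, 236347050, 12229877):
    print(comment_id, updated_at)
```

Because the timestamps are stored as ISO 8601 text in a single format, sorting them as strings gives chronological order, which is why `ORDER BY updated_at DESC` works on a TEXT column here.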
Powered by Datasette · Queries took 13.019ms · About: xarray-datasette