home / github / issue_comments

Menu
  • GraphQL API
  • Search all tables

issue_comments: 308923548

This data as json

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/1457#issuecomment-308923548 https://api.github.com/repos/pydata/xarray/issues/1457 308923548 MDEyOklzc3VlQ29tbWVudDMwODkyMzU0OA== 12229877 2017-06-16T03:29:12Z 2017-06-16T03:29:12Z CONTRIBUTOR

I like the idea of benchmarks, but have some serious concerns. For Dask and IO-bound work in general, benchmark results will vary widely depending on the hardware and (if relevant) network properties. Results will be noncomparable between SSD and HDD, local and remote network access, and in general depend heavily on the specific IO patterns and storage/compute relationship of the computer.

This isn't a reason not to benchmark though, just a call for very cautious interpretation - it's clearly useful to catch some of the subtle-but-pathological performance problems that have cropped up. In short, I think benchmarks should have a very clear warnings section in the documentation, and no decision should be taken to change code without benchmarking on a variety of computers (SSD/HDD, PC/cluster, local/remote data...).

Also JSON cannot include comments, and there are a number of entries that you need to update, but that's a passing concern.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  236347050
Powered by Datasette · Queries took 0.617ms · About: xarray-datasette