issue_comments: 760532153


html_url: https://github.com/pydata/xarray/pull/4746#issuecomment-760532153
issue_url: https://api.github.com/repos/pydata/xarray/issues/4746
id: 760532153
node_id: MDEyOklzc3VlQ29tbWVudDc2MDUzMjE1Mw==
user: 5635139
created_at: 2021-01-14T23:05:16Z
updated_at: 2021-01-14T23:05:16Z
author_association: MEMBER

I double-checked the benchmarks and added a pandas comparison. That involved ensuring the missing value was handled correctly in both of them, and ensuring the setup wasn't included in the timed portion of the benchmark.
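For context, asv excludes anything done in a benchmark class's `setup` method from the measurement; only the body of each `time_*` method is timed. Here is a minimal sketch of that pattern for the pandas case — illustrative names only, not the actual benchmark from the xarray suite. The input is a MultiIndex series with one combination dropped, so unstacking has to fill a missing value:

```python
import numpy as np
import pandas as pd


class UnstackPandasSlow:
    """Sketch of an asv-style benchmark (hypothetical; not the real
    xarray benchmark class). asv calls ``setup`` before timing, so
    building the input is excluded from the measurement."""

    def setup(self):
        idx = pd.MultiIndex.from_product([range(100), range(100)])
        full = pd.Series(np.arange(len(idx), dtype=float), index=idx)
        # Drop the (0, 0) entry so unstack must insert a NaN,
        # exercising the missing-value path.
        self.series = full.iloc[1:]

    def time_unstack_pandas_slow(self):
        # Only this body is timed by asv.
        self.series.unstack()
```

Keeping the construction in `setup` matters here: building a 10,000-element MultiIndex series is itself non-trivial work, and including it would inflate both timings.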

I don't get the 100x speed-up that I thought I saw initially; it's now more like 8x. Still decent! I'm not sure whether that's because I misread the benchmark previously or because the benchmarks are slightly different; my guess is the former.

Pasting the results below so we have something concrete.

**Existing**

```
asv profile unstacking.Unstacking.time_unstack_slow master | head -n 20
···
unstacking.Unstacking.time_unstack_slow  861±20ms
```

**Proposed**

```
asv profile unstacking.Unstacking.time_unstack_slow HEAD | head -n 20
···
unstacking.Unstacking.time_unstack_slow  108±3ms
```

**Pandas**

```
asv profile unstacking.Unstacking.time_unstack_pandas_slow master | head -n 20
···
unstacking.Unstacking.time_unstack_pandas_slow  207±10ms
```
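For reference, the speed-ups implied by these timings work out as:

```python
# Median timings from the asv runs above, in milliseconds.
master_ms, head_ms, pandas_ms = 861, 108, 207

speedup_vs_master = master_ms / head_ms   # proposed vs existing xarray, ~8x
speedup_vs_pandas = pandas_ms / head_ms   # proposed vs pandas, ~1.9x
```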

Are we OK with the claim vs pandas? I think it's important that we make accurate comparisons (both good and bad), but I'm open-minded if it seems a bit aggressive. It's worth someone reviewing the benchmark code to make sure I haven't made a mistake.
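One quick sanity check a reviewer could run on the pandas path (a hypothetical snippet, not taken from the benchmark itself): confirm that the missing entry really becomes NaN after `unstack`, and that stacking back and dropping it recovers the original data.

```python
import numpy as np
import pandas as pd

# Same shape of input the benchmark times, at toy size: a MultiIndex
# series with the (0, 0) entry missing.
idx = pd.MultiIndex.from_product([range(3), range(3)], names=["x", "y"])
series = pd.Series(np.arange(9, dtype=float), index=idx).iloc[1:]

wide = series.unstack("y")        # the operation being benchmarked
assert np.isnan(wide.loc[0, 0])   # the missing cell was filled with NaN

# Round-trip: stack back and drop the filled NaN to recover the input.
round_trip = wide.stack().dropna()
pd.testing.assert_series_equal(round_trip, series, check_names=False)
```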
