issue_comments: 1044173017


Comment on pydata/xarray#5692 by user 4160723 (MEMBER), 2022-02-18T08:59:19Z
https://github.com/pydata/xarray/pull/5692#issuecomment-1044173017

The benchmark triggers an error if it is at least 1.5x slower than main. If I remember correctly, unstacking.UnstackingSparse.time_unstack_to_sparse_2d has been around 1.45x in runs that passed, so that one might be slightly slower now, but I wouldn't worry about it too much.

Ah, now I see that it is likely because this branch makes a copy of the pandas index (plus a second, unnecessary one that I'll fix). Although making a shallow copy of a pandas index is rather cheap, it significantly affects the benchmark results here because of the small size of the index. For larger index sizes it becomes irrelevant:
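To illustrate why the copy is cheap, here is a minimal sketch (not the actual xarray code path): a shallow copy of a pandas `Index` creates a new object without duplicating the underlying data buffer, so its cost is a small, roughly constant per-call overhead that only matters relative to very small workloads.

```python
import pandas as pd

idx_small = pd.Index(range(100))
idx_large = pd.Index(range(100_000))

# Shallow copies: new Index objects, but the element data is not duplicated,
# so both calls cost roughly the same regardless of index size.
copy_small = idx_small.copy(deep=False)
copy_large = idx_large.copy(deep=False)

# The copies are distinct objects yet compare equal element-wise.
```

Timing these two `copy` calls with `%timeit` would show near-identical durations, which is why the fixed overhead is visible in the size-100 benchmark but lost in the noise at size 100 000.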

This branch:

  • size = 100

```python
%timeit da_eye_2d.unstack(sparse=True)
680 µs ± 3.34 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

  • size = 100 000

```python
%timeit da_eye_2d.unstack(sparse=True)
49.2 ms ± 457 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```

Xarray stable 0.21.1 (with the same versions of numpy, pandas, and sparse):

  • size = 100

```python
%timeit da_eye_2d.unstack(sparse=True)
424 µs ± 13.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

  • size = 100 000

```python
%timeit da_eye_2d.unstack(sparse=True)
49.4 ms ± 797 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
```
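The exact definition of `da_eye_2d` is not shown in this comment; the following is a hypothetical reconstruction of such a benchmark array, assuming it is a 1-D array over a "diagonal" MultiIndex so that unstacking yields an eye-like size × size matrix that is mostly fill values:

```python
import numpy as np
import pandas as pd
import xarray as xr

size = 100
# A "diagonal" MultiIndex: entry i maps to position (i, i), so unstacking
# produces a size x size array with ones on the diagonal and fill values
# (NaN for dense, the fill_value for sparse) everywhere else.
mindex = pd.MultiIndex.from_arrays(
    [np.arange(size), np.arange(size)], names=("x", "y")
)
da_eye_2d = xr.DataArray(np.ones(size), dims="z", coords={"z": mindex})

# Dense unstack shown here; pass sparse=True (requires the `sparse`
# package) to get a sparse.COO-backed result as in the benchmarks above.
result = da_eye_2d.unstack("z")
```

With a highly sparse result like this, the sparse path avoids materialising the full size² array, which is the point of the `time_unstack_to_sparse_2d` benchmark.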
