issues: 276688437
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
276688437 | MDU6SXNzdWUyNzY2ODg0Mzc= | 1742 | Performance regression when selecting | 3621629 | closed | 0 | 1 | 2017-11-24T19:34:29Z | 2019-06-06T19:08:06Z | 2019-06-06T19:08:06Z | CONTRIBUTOR | Hello, I just noticed a performance drop in 0.10 after a
```python
import numpy as np
import pandas as pd
import xarray as xr

np.random.seed(1234)
ds = xr.Dataset({k: pd.DataFrame(np.random.randn(2500, 2000)) for k in range(20)})
mask = (np.random.randn(2000) > -0.2).astype(bool)

%timeit ds.sel(dim_0=slice(50, 1250), dim_1=mask)
%timeit ds[0].sel(dim_0=slice(50, 1250), dim_1=mask)

xarray 0.9.6 -> 120 ± 0.4 ms, 4.2 ± 0.02 ms
xarray 0.10 -> 190 ± 0.4 ms, 6.8 ± 0.03 ms
```
This was run in a Docker image. Strangely, I can't reproduce it natively on macOS (performance is the same as in 0.10 in Docker for both versions). On a Windows box, with a similar but "real" netCDF dataset, performance is halved. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1742/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |