issues: 276688437

id: 276688437
node_id: MDU6SXNzdWUyNzY2ODg0Mzc=
number: 1742
title: Performance regression when selecting
user: 3621629
state: closed
locked: 0
comments: 1
created_at: 2017-11-24T19:34:29Z
updated_at: 2019-06-06T19:08:06Z
closed_at: 2019-06-06T19:08:06Z
author_association: CONTRIBUTOR

Hello,

I just noticed a performance drop in xarray 0.10 after running `conda update xarray`:

```python
import numpy as np
import pandas as pd
import xarray as xr

np.random.seed(1234)
# 20 variables, each a 2500 x 2000 array with dims dim_0 and dim_1
ds = xr.Dataset({k: pd.DataFrame(np.random.randn(2500, 2000)) for k in range(20)})
# Boolean mask over dim_1
mask = (np.random.randn(2000) > -0.2).astype(bool)

# IPython magics: time selection on the whole Dataset, then on a single variable
%timeit ds.sel(dim_0=slice(50, 1250), dim_1=mask)
%timeit ds[0].sel(dim_0=slice(50, 1250), dim_1=mask)

# xarray 0.9.6 -> 120 ± 0.4 ms, 4.2 ± 0.02 ms
# xarray 0.10  -> 190 ± 0.4 ms, 6.8 ± 0.03 ms
```

This was run in a Docker image. Strangely, I can't reproduce it natively on macOS, where both versions perform the same as 0.10 does in Docker. On a Windows box, with a similar but "real" netCDF dataset, performance is halved.
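
Since the numbers differ between Docker, macOS and Windows, it may help to run the same benchmark as a plain script that also prints the platform and library versions. The following is only a sketch: `make_dataset`, `best_ms` and `report` are hypothetical helper names, not part of xarray, and it reports the best time per call from the standard library's `timeit` rather than the mean ± std that `%timeit` prints, so the figures are not directly comparable to the ones above.

```python
import platform
import timeit

import numpy as np
import pandas as pd
import xarray as xr


def make_dataset(n_vars=20, n_rows=2500, n_cols=2000, seed=1234):
    """Build the dataset and boolean column mask used in the report.

    The defaults match the sizes in the snippet above; smaller values
    are handy for a quick smoke test.
    """
    np.random.seed(seed)
    ds = xr.Dataset(
        {k: pd.DataFrame(np.random.randn(n_rows, n_cols)) for k in range(n_vars)}
    )
    mask = np.random.randn(n_cols) > -0.2
    return ds, mask


def best_ms(fn, repeat=5, number=3):
    """Return the best per-call time of fn() in milliseconds."""
    times = timeit.repeat(fn, repeat=repeat, number=number)
    return min(times) / number * 1000.0


def report():
    """Print environment details and timings for both selections."""
    ds, mask = make_dataset()
    rows = slice(50, 1250)
    print(platform.platform())
    print("xarray", xr.__version__, "| numpy", np.__version__,
          "| pandas", pd.__version__)
    print("Dataset.sel:   %.1f ms"
          % best_ms(lambda: ds.sel(dim_0=rows, dim_1=mask)))
    print("DataArray.sel: %.1f ms"
          % best_ms(lambda: ds[0].sel(dim_0=rows, dim_1=mask)))
```

Calling `report()` in each environment gives output that can be pasted side by side. Note that with the default sizes the dataset holds roughly 800 MB of float64 data.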

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1742/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
