issue_comments: 346912338
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | reactions | performed_via_github_app | issue
---|---|---|---|---|---|---|---|---|---|---
https://github.com/pydata/xarray/issues/1742#issuecomment-346912338 | https://api.github.com/repos/pydata/xarray/issues/1742 | 346912338 | MDEyOklzc3VlQ29tbWVudDM0NjkxMjMzOA== | 1217238 | 2017-11-25T01:49:12Z | 2017-11-25T01:49:12Z | MEMBER | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |  | 276688437

body:

I also notice a roughly 50% slow-down for this indexing behavior in 0.10 relative to 0.9.6 on my laptop. For the smaller loop:

```
xarray 0.9.6 -> 7.79 ms ± 145 µs
xarray 0.10  -> 11.6 ms ± 102 µs
```

The difference appears to be that in 0.9.6, we index a numpy array with two integer numpy arrays, whereas in 0.10 we index with a slice and an integer array, which NumPy handles more slowly:

```
In [8]: y = (np.random.randn(2000) > -0.2).astype(bool)

In [12]: %timeit x[50:1250, np.flatnonzero(y)]
10.6 ms ± 73.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [13]: %timeit x[np.arange(50, 1250)[:, np.newaxis], np.flatnonzero(y)[np.newaxis, :]]
7.17 ms ± 42.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

So it's potentially worth trying to get this optimization upstream into NumPy. We could switch xarray's internal implementation back, but I'm not sure that would be a clear win. For example, we see the opposite change in performance when indexing the transposed arrays:

```
xarray 0.9.6
In [3]: %timeit ds[0].T.sel(dim_0=slice(50, 1250), dim_1=mask)
16.1 ms ± 600 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

xarray 0.10
In [3]: %timeit ds[0].T.sel(dim_0=slice(50, 1250), dim_1=mask)
12.3 ms ± 459 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```
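To make the NumPy-level comparison reproducible outside of xarray, here is a minimal benchmark sketch. The `x` array is not defined in the quoted snippets, so a 2000 × 2000 float array is assumed; the `rows`/`cols` names and the `timeit` loop counts are illustrative, not part of the original comment:

```python
import numpy as np
from timeit import timeit

# Assumed setup: `x` is not shown in the comment above, so a
# 2000 x 2000 array matching the other shapes is used here.
x = np.random.randn(2000, 2000)
y = np.random.randn(2000) > -0.2
rows = np.arange(50, 1250)
cols = np.flatnonzero(y)

# 0.10-style path: a slice mixed with an integer array ("outer" indexing).
a = x[50:1250, cols]

# 0.9.6-style path: two integer arrays broadcast against each other;
# np.ix_(rows, cols) would build the same broadcastable pair.
b = x[rows[:, np.newaxis], cols[np.newaxis, :]]

# Both spellings select exactly the same elements; only NumPy's internal
# code path differs, which is where the timing gap comes from.
assert np.array_equal(a, b)

print("slice + array:   ", timeit(lambda: x[50:1250, cols], number=100))
print("broadcast arrays:", timeit(lambda: x[rows[:, None], cols[None, :]], number=100))

# The transposed case from the end of the comment, where the relative
# ordering of the two paths can flip with the memory layout.
xt = x.T
print("transposed, slice + array:   ", timeit(lambda: xt[50:1250, cols], number=100))
print("transposed, broadcast arrays:", timeit(lambda: xt[rows[:, None], cols[None, :]], number=100))
```

Since the two spellings are interchangeable in result, the choice between them inside xarray comes down to which NumPy code path is faster for a given memory layout, which is the trade-off the comment's transposed-array timings illustrate.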