id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
735199603,MDExOlB1bGxSZXF1ZXN0NTE0NjMzODkx,4560,Optimize slice_slice for faster isel of huge datasets,11994217,closed,0,,,5,2020-11-03T10:26:38Z,2020-11-05T19:45:44Z,2020-11-05T19:07:24Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4560,"I noticed that reading small slices of huge datasets (>1e8 rows) was very slow, even when they were properly chunked. I traced the issue back to `xarray.core.indexing.slice_slice`, which essentially calls `np.arange(ds_size)` to compute a slice. This is `O(ds_size)`, even if the actual slice to be read is tiny. You can see the issue in this gist: https://gist.github.com/dionhaefner/a3e97bae0a4e28f0d39294074419a683 I took the liberty of optimizing the function by computing the resulting slice arithmetically. With this in place, reading from disk is now the bottleneck, as it should be. I saw performance improve by about a factor of 10, but this obviously varies with dimension size, slice size, and chunk size. --- - [x] Passes `isort . && black . && mypy . && flake8` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4560/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
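
The PR body above describes replacing the `np.arange(ds_size)`-based slice composition with an arithmetic one. Below is a minimal sketch of that idea; the helper name `compose_slices` and its exact edge-case handling are illustrative, not the code merged into `xarray.core.indexing.slice_slice`:

```python
def compose_slices(old_slice: slice, applied_slice: slice, size: int) -> slice:
    """Slice equivalent to applying ``old_slice`` to an axis of length ``size``
    and then ``applied_slice`` to the result, without building np.arange(size)."""
    # Normalize the first slice against the full dimension length.
    old_start, old_stop, old_step = old_slice.indices(size)

    # Number of elements selected by the first slice, computed in O(1).
    n_old = len(range(old_start, old_stop, old_step))
    if n_old == 0:
        return slice(0)  # the first slice already selects nothing

    # Normalize the second slice against the length of the first selection.
    app_start, app_stop, app_step = applied_slice.indices(n_old)

    # Map the second slice back onto the original axis.
    start = old_start + app_start * old_step
    stop = old_start + app_stop * old_step
    step = old_step * app_step

    if start < 0:
        return slice(0)  # second slice falls entirely outside the first selection
    if stop < 0:
        stop = None  # a negative stop would wrap past index 0; None means "run to the end"

    return slice(start, stop, step)
```

For example, `compose_slices(slice(2, None, 3), slice(1, 4), 20)` returns `slice(5, 14, 3)`, selecting the same elements as `np.arange(20)[2::3][1:4]` (i.e. 5, 8, 11), but at a cost independent of the dimension size, which is why the tiny `isel` in the linked gist no longer scales with the dataset length.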