id,node_id,number,state,locked,title,user,body,created_at,updated_at,closed_at,merged_at,merge_commit_sha,assignee,milestone,draft,head,base,author_association,auto_merge,repo,url,merged_by
514633891,MDExOlB1bGxSZXF1ZXN0NTE0NjMzODkx,4560,closed,0,Optimize slice_slice for faster isel of huge datasets,11994217,"I noticed that reading small slices of huge datasets (>1e8 rows) was very slow, even when they were properly chunked. I traced the issue back to `xarray.core.indexing.slice_slice`, which essentially calls `np.arange(ds_size)` to compute a slice. This is obviously `O(ds_size)`, even if the actual slice to be read is tiny. You can see the issue in this gist: https://gist.github.com/dionhaefner/a3e97bae0a4e28f0d39294074419a683

I took the liberty of optimizing the function by computing the resulting slice arithmetically. With this in place, reading from disk is now the bottleneck, as it should be. I saw performance improvements of about a factor of 10, but this obviously varies with dimension size, slice size, and chunk size.

---

- [x] Passes `isort . && black . && mypy . && flake8`
",2020-11-03T10:26:38Z,2020-11-05T19:45:44Z,2020-11-05T19:07:24Z,2020-11-05T19:07:23Z,235b2e5bcec253ca6a85762323121d28c3b06038,,,0,86c56ca4b9e8d01136a7eed90160723e7535f0d2,83884a1c6dac4b5f6309dfea530414facc100bc8,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/4560,
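
For context, the following is a minimal standalone sketch of the arithmetic slice composition the PR body describes. The function name mirrors `xarray.core.indexing.slice_slice`, but the exact merged implementation may differ; the guards, comments, and the `compose`-style check below are assumptions for illustration, not the code from PR 4560.

```python
import numpy as np


def slice_slice(old_slice: slice, applied_slice: slice, size: int) -> slice:
    """Compose two slices arithmetically.

    Illustrative sketch (not the exact xarray code): equivalent to deriving a
    slice from np.arange(size)[old_slice][applied_slice], but O(1) instead of
    O(size), because no index array is ever materialized.
    """
    # Normalize the first slice against the full dimension size.
    old_start, old_stop, old_step = old_slice.indices(size)
    old_len = len(range(old_start, old_stop, old_step))
    if old_len == 0:
        # Nothing survives the first slice, so the result is empty.
        return slice(0)

    # Normalize the second slice against the length of the intermediate result.
    new_start, new_stop, new_step = applied_slice.indices(old_len)

    # Map the second slice back onto the original axis.
    start = old_start + new_start * old_step
    stop = old_start + new_stop * old_step
    step = old_step * new_step

    if start < 0:
        # The second slice starts past the end of a reversed first slice -> empty.
        return slice(0)
    if stop < 0:
        # Stepping backwards past index 0: an explicit negative stop would wrap around.
        stop = None

    return slice(start, stop, step)


# Quick check against the O(size) reference behaviour.
size = 100_000
idx = np.arange(size)
for a, b in [(slice(100, 90_000, 3), slice(5, 50, 2)), (slice(None, None, -1), slice(10, 100))]:
    assert np.array_equal(idx[a][b], idx[slice_slice(a, b, size)])
```

The key point is that `slice.indices(size)` normalizes both slices in constant time, so the composed slice can be computed without ever building an index array of length `ds_size`.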