html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/2799#issuecomment-1306327743,https://api.github.com/repos/pydata/xarray/issues/2799,1306327743,IC_kwDOAMm_X85N3Pq_,90008,2022-11-07T22:45:07Z,2022-11-07T22:45:07Z,CONTRIBUTOR,"As I've been recently going down this performance rabbit hole, I think the discussion around https://github.com/pydata/xarray/issues/7045 is relevant and provides some additional historical context as to ""why"" this performance penalty might be happening.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,416962458 https://github.com/pydata/xarray/issues/2799#issuecomment-786813358,https://api.github.com/repos/pydata/xarray/issues/2799,786813358,MDEyOklzc3VlQ29tbWVudDc4NjgxMzM1OA==,90008,2021-02-26T18:19:28Z,2021-02-26T18:19:28Z,CONTRIBUTOR,"I hope the following can help users who struggle with the speed of xarray: I've found that when doing numerical computation, I often use xarray to grab all the metadata relevant to my computation: scale, chromaticity, experimental information. Eventually, I create a function that acts as a barrier:
- Xarray input (high-level experimental data)
- Computation parameters output (low-level, implementation-relevant information)

The low-level implementation can then operate on fast numpy arrays. I've found this to be the struggle between high-level APIs that do things like sanitize inputs (xarray routines like `_validate_indexers` and `_broadcast_indexes`) and low-level APIs that are simply interested in moving and computing data.
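A minimal sketch of this barrier pattern (the function names and the `scale` attribute below are hypothetical, purely for illustration): the xarray object is consumed exactly once to pull out the raw array and the scalar parameters, and the numerical core only ever sees numpy.

```python
import numpy as np
import xarray as xr

def extract_compute_params(da: xr.DataArray):
    # Barrier: consume the high-level xarray object once, returning only
    # the raw numpy array plus the scalar parameters the kernel needs.
    scale = float(da.attrs.get('scale', 1.0))
    return da.values, scale

def fast_kernel(arr: np.ndarray, scale: float) -> np.ndarray:
    # Low-level core: plain numpy, no per-operation xarray overhead.
    return arr * scale

da = xr.DataArray(np.ones((4, 4)), dims=('y', 'x'), attrs={'scale': 2.0})
arr, scale = extract_compute_params(da)
result = fast_kernel(arr, scale)
```

The point of the barrier is that the expensive metadata handling happens once, outside any hot loop.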
For the example that @nbren12 brought up originally, it might be better to create xarray routines (if they don't exist already) that can create fast iterators over the underlying numpy arrays given a set of dimensions that the user cares about.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,416962458 https://github.com/pydata/xarray/issues/2799#issuecomment-552652019,https://api.github.com/repos/pydata/xarray/issues/2799,552652019,MDEyOklzc3VlQ29tbWVudDU1MjY1MjAxOQ==,90008,2019-11-11T22:47:47Z,2019-11-11T22:47:47Z,CONTRIBUTOR,"Sure, I just wanted to note that this operation **should** be more or less constant time, as opposed to dependent on the size of the array. Somebody had mentioned it should increase with the size of the array.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,416962458 https://github.com/pydata/xarray/issues/2799#issuecomment-552619589,https://api.github.com/repos/pydata/xarray/issues/2799,552619589,MDEyOklzc3VlQ29tbWVudDU1MjYxOTU4OQ==,90008,2019-11-11T21:16:36Z,2019-11-11T21:16:36Z,CONTRIBUTOR,"Hmm, slicing should basically be a no-op. The fact that xarray makes it about 100x slower is a real killer. It seems from this conversation that it might be hard to work around.
```python
import xarray as xr
import numpy as np

n = np.zeros(shape=(1024, 1024))
x = xr.DataArray(n, dims=('y', 'x'))
the_slice = np.s_[256:512, 256:512]

%timeit n[the_slice]
186 ns ± 0.778 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

%timeit x[the_slice]
70.3 µs ± 593 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
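One possible escape hatch (a sketch, not an official recipe): slice the underlying numpy array via `.data`, so the hot path skips xarray's indexing machinery entirely and only wraps back into a labeled object when needed.

```python
import numpy as np
import xarray as xr

x = xr.DataArray(np.zeros((1024, 1024)), dims=('y', 'x'))
the_slice = np.s_[256:512, 256:512]

# x.data exposes the underlying numpy array; slicing it is a plain
# numpy view with none of xarray's validation or label handling.
sub = x.data[the_slice]

# The values match what the slower labeled path would return.
assert np.array_equal(sub, x[the_slice].values)
```

This only helps when the labels and coordinates are not needed inside the loop, which is exactly the barrier situation described earlier in the thread.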
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,416962458