issue_comments
12 rows where author_association = "CONTRIBUTOR" and issue = 416962458 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1306327743 | https://github.com/pydata/xarray/issues/2799#issuecomment-1306327743 | https://api.github.com/repos/pydata/xarray/issues/2799 | IC_kwDOAMm_X85N3Pq_ | hmaarrfk 90008 | 2022-11-07T22:45:07Z | 2022-11-07T22:45:07Z | CONTRIBUTOR | As I've been recently going down this performance rabbit hole, I think the discussion around https://github.com/pydata/xarray/issues/7045 is relevant and provides some additional historical context as to "why" this performance penalty might be happening. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
786813358 | https://github.com/pydata/xarray/issues/2799#issuecomment-786813358 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDc4NjgxMzM1OA== | hmaarrfk 90008 | 2021-02-26T18:19:28Z | 2021-02-26T18:19:28Z | CONTRIBUTOR | I hope the following can help users who struggle with the speed of xarray. I've found that when doing numerical computation, I often use xarray to grab all the metadata relevant to my computation: scale, chromaticity, experimental information. Eventually, I create a function that acts as a barrier: xarray input (high-level experimental data), computation parameters output (the information relevant to the low-level implementation). The low-level implementation can then operate on the fast numpy arrays. I've found this to be the struggle with creating high-level APIs that do things like sanitize inputs (xarray routines like […]). For the example that @nbren12 brought up originally, it might be better to create xarray routines (if they don't exist already) that can create fast iterators over the underlying numpy arrays, given a set of dimensions that the user cares about. (A minimal sketch of this barrier pattern follows this entry.) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
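The "barrier" pattern described in the comment above can be sketched as follows. This is a hypothetical illustration, not code from the thread; the names `run_experiment` and `_kernel` and the `scale` attribute are invented for the example:

```python
import numpy as np
import xarray as xr

def run_experiment(da: xr.DataArray) -> xr.DataArray:
    """Barrier: unpack metadata and the raw array from xarray up front,
    run the hot loop on plain numpy, then re-wrap the result."""
    scale = da.attrs.get("scale", 1.0)  # high-level metadata stays on the xarray object
    values = da.values                  # plain numpy array: no xarray overhead below here
    result = _kernel(values, scale)
    return xr.DataArray(result, dims=da.dims, coords=da.coords, attrs=da.attrs)

def _kernel(arr: np.ndarray, scale: float) -> np.ndarray:
    # Low-level implementation: numpy-only, fast, metadata-free.
    return arr * scale
```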
786764651 | https://github.com/pydata/xarray/issues/2799#issuecomment-786764651 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDc4Njc2NDY1MQ== | nbren12 1386642 | 2021-02-26T16:51:50Z | 2021-02-26T16:51:50Z | CONTRIBUTOR | @jhamman Weren't you talking about an xarray lite (TM) package? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
553294966 | https://github.com/pydata/xarray/issues/2799#issuecomment-553294966 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDU1MzI5NDk2Ng== | nbren12 1386642 | 2019-11-13T08:32:05Z | 2019-11-13T08:32:16Z | CONTRIBUTOR | This […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
552652019 | https://github.com/pydata/xarray/issues/2799#issuecomment-552652019 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDU1MjY1MjAxOQ== | hmaarrfk 90008 | 2019-11-11T22:47:47Z | 2019-11-11T22:47:47Z | CONTRIBUTOR | Sure, I just wanted to note that this operation should be more or less constant time, as opposed to dependent on the size of the array. Somebody had mentioned it should increase with the size of the array. (A quick check is sketched after this entry.) |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
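A quick way to check the constant-time claim (a minimal sketch using `timeit`; the array sizes are arbitrary):

```python
import timeit
import numpy as np

# Basic slicing returns a view, so the cost should not depend on array size.
for n in (1_024, 8_192):
    arr = np.zeros((n, n))
    per_call = timeit.timeit(lambda: arr[256:512, 256:512], number=100_000) / 100_000
    print(f"{n}x{n}: {per_call * 1e9:.0f} ns per slice")
```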
552619589 | https://github.com/pydata/xarray/issues/2799#issuecomment-552619589 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDU1MjYxOTU4OQ== | hmaarrfk 90008 | 2019-11-11T21:16:36Z | 2019-11-11T21:16:36Z | CONTRIBUTOR | Hmm, slicing should basically be a no-op. The fact that xarray makes it roughly 400x slower here (70.3 µs vs. 186 ns, below) is a real killer. It seems from this conversation that it might be hard to work around. (A possible mitigation is sketched after this entry.)
```python
import xarray as xr
import numpy as np

n = np.zeros(shape=(1024, 1024))
x = xr.DataArray(n, dims=('y', 'x'))
the_slice = np.s_[256:512, 256:512]

%timeit n[the_slice]
# 186 ns ± 0.778 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
%timeit x[the_slice]
# 70.3 µs ± 593 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
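One possible mitigation (our sketch, not something proposed in the comment above) is to slice the underlying numpy array or the lower-level `Variable`, both of which skip the coordinate handling that dominates `DataArray` indexing; continuing from the benchmark above:

```python
# Continuing from the benchmark above. Both lines avoid most of the
# DataArray indexing overhead, at the cost of dropping coordinate handling.
%timeit x.data[the_slice]      # slice the wrapped numpy array directly
%timeit x.variable[the_slice]  # Variable indexing skips coordinates
```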
539352070 | https://github.com/pydata/xarray/issues/2799#issuecomment-539352070 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDUzOTM1MjA3MA== | ashwinvis 9155111 | 2019-10-08T06:08:27Z | 2019-10-08T06:08:48Z | CONTRIBUTOR | I suspect system jitter in the profiling, as the time for […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
538366978 | https://github.com/pydata/xarray/issues/2799#issuecomment-538366978 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDUzODM2Njk3OA== | ashwinvis 9155111 | 2019-10-04T11:57:10Z | 2019-10-04T11:57:10Z | CONTRIBUTOR | Not really. Pythran always releases the GIL and does a bunch of optimizations between transpilation and compilation. A good approach would be to try out different compilers and see what performance is obtained without losing readability (https://github.com/pydata/xarray/issues/2799#issuecomment-469444519). See scikit-image/scikit-image/issues/4199, where the package […] (A minimal Pythran sketch follows this entry.) |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
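For concreteness, a minimal Pythran sketch (the kernel is our own illustration, not taken from the linked issues). The `# pythran export` comment declares the compiled signature, and running `pythran laplace_mod.py` builds a native extension ahead of time:

```python
# laplace_mod.py -- compile with: pythran laplace_mod.py
import numpy as np

# pythran export laplace(float64[:, :])
def laplace(image):
    """Naive 5-point Laplace stencil. Pythran compiles the loops to native
    code and releases the GIL while they run."""
    out = np.zeros_like(image)
    for i in range(1, image.shape[0] - 1):
        for j in range(1, image.shape[1] - 1):
            out[i, j] = (image[i - 1, j] + image[i + 1, j]
                         + image[i, j - 1] + image[i, j + 1]
                         - 4.0 * image[i, j])
    return out
```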
469451210 | https://github.com/pydata/xarray/issues/2799#issuecomment-469451210 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDQ2OTQ1MTIxMA== | nbren12 1386642 | 2019-03-04T22:40:07Z | 2019-03-04T22:40:07Z | CONTRIBUTOR | Sure, I've been using that as a workaround as well. Unfortunately, that approach throws away all the nice info (e.g. metadata, coordinates) that xarray objects have and requires duplicating much of xarray's indexing logic. |
{ "total_count": 3, "+1": 3, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
469447632 | https://github.com/pydata/xarray/issues/2799#issuecomment-469447632 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDQ2OTQ0NzYzMg== | nbren12 1386642 | 2019-03-04T22:27:57Z | 2019-03-04T22:27:57Z | CONTRIBUTOR | @max-sixty I tend to agree this use case could be outside of the scope of xarray. It sounds like significant progress might require re-implementing core […] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
469443856 | https://github.com/pydata/xarray/issues/2799#issuecomment-469443856 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDQ2OTQ0Mzg1Ng== | nbren12 1386642 | 2019-03-04T22:15:49Z | 2019-03-04T22:15:49Z | CONTRIBUTOR | Thanks so much @shoyer. I didn't realize there was that much overhead for a single function call. OTOH, 2x slower than numpy would be way better than 1000x. After looking at the profiling info more, I tend to agree with your 10x maximum speed-up. A couple of particularly slow functions (e.g. […]). (A profiling sketch follows this entry.) |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 | |
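That kind of profiling information can be reproduced with the standard library (a sketch; the indexing expression is arbitrary):

```python
import cProfile
import pstats

import numpy as np
import xarray as xr

x = xr.DataArray(np.zeros((1024, 1024)), dims=("y", "x"))

# Profile repeated scalar indexing to surface the per-call overhead,
# then list the ten most expensive functions by cumulative time.
cProfile.run("for _ in range(10_000): x[0, 0]", "indexing.prof")
pstats.Stats("indexing.prof").sort_stats("cumulative").print_stats(10)
```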
469394020 | https://github.com/pydata/xarray/issues/2799#issuecomment-469394020 | https://api.github.com/repos/pydata/xarray/issues/2799 | MDEyOklzc3VlQ29tbWVudDQ2OTM5NDAyMA== | nbren12 1386642 | 2019-03-04T19:45:11Z | 2019-03-04T19:45:11Z | CONTRIBUTOR | cc @rabernat |
{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Performance: numpy indexes small amounts of data 1000 faster than xarray 416962458 |
Table schema:

```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```
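The filter described at the top of this page can be reproduced against a github-to-sqlite export with the standard library; the database filename `github.db` is an assumption:

```python
import sqlite3

conn = sqlite3.connect("github.db")  # hypothetical filename for the github-to-sqlite export
rows = conn.execute(
    """
    SELECT id, user, created_at, body
    FROM issue_comments
    WHERE author_association = 'CONTRIBUTOR' AND issue = ?
    ORDER BY updated_at DESC
    """,
    (416962458,),
).fetchall()
print(len(rows))  # 12 for this issue
```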