issue_comments
3 rows where author_association = "MEMBER" and issue = 365678022, sorted by updated_at descending

id: 426335414
html_url: https://github.com/pydata/xarray/issues/2452#issuecomment-426335414
issue_url: https://api.github.com/repos/pydata/xarray/issues/2452
node_id: MDEyOklzc3VlQ29tbWVudDQyNjMzNTQxNA==
user: max-sixty 5635139
created_at: 2018-10-02T16:15:00Z
updated_at: 2018-10-02T16:15:00Z
author_association: MEMBER
body: Thanks @mschrimpf. Hopefully we can get multi-dimensional groupbys, too.
reactions: { "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
issue: DataArray.sel extremely slow 365678022

id: 426106046
html_url: https://github.com/pydata/xarray/issues/2452#issuecomment-426106046
issue_url: https://api.github.com/repos/pydata/xarray/issues/2452
node_id: MDEyOklzc3VlQ29tbWVudDQyNjEwNjA0Ng==
user: max-sixty 5635139
created_at: 2018-10-02T00:21:17Z
updated_at: 2018-10-02T00:21:17Z
author_association: MEMBER
body: I can't think of anything immediately, and doubt there's an easy way given it doesn't exist yet (though that logic can be a trap!). There's some hacky pandas reshaping you may be able to do to solve this as a one-off. Otherwise it does likely require a concerted effort with numbagg. I occasionally hit this issue too, so I'm as keen as you are to find a solution. Thanks for giving it a try.
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
issue: DataArray.sel extremely slow 365678022
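
The "hacky pandas reshaping" mentioned above is not spelled out anywhere in this thread. Purely as an illustration of that kind of one-off workaround (the coordinate name `neuroid`, the sizes, and the keys below are invented, not taken from the issue), one option is to reshape into pandas once and then do every lookup in a single vectorized indexing call:

```
import numpy as np
import xarray as xr

# Invented example data; the original issue's arrays are not shown in this thread.
da = xr.DataArray(
    np.random.rand(50_000),
    dims=["neuroid"],
    coords={"neuroid": [f"n{i}" for i in range(50_000)]},
)
wanted = [f"n{i}" for i in range(0, 50_000, 13)]

# One-off reshape into pandas: a Series indexed by the coordinate...
series = da.to_series()

# ...so all lookups happen in one vectorized pandas indexing call instead of
# one DataArray.sel call per key in a Python loop.
subset = series.loc[wanted]

# Back to xarray if the result is needed as a DataArray again.
da_subset = xr.DataArray.from_series(subset)
```

Whether this actually beats a vectorized `.sel` depends on the data; the sketch is only meant to show the general shape of the workaround being hinted at.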

id: 426096521
html_url: https://github.com/pydata/xarray/issues/2452#issuecomment-426096521
issue_url: https://api.github.com/repos/pydata/xarray/issues/2452
node_id: MDEyOklzc3VlQ29tbWVudDQyNjA5NjUyMQ==
user: max-sixty 5635139
created_at: 2018-10-01T23:25:01Z
updated_at: 2018-10-01T23:25:01Z
author_association: MEMBER
body:
Thanks for the issue @mschrimpf
While there's an overhead, the time is fairly consistent regardless of the number of items it's selecting. For example:
So, as is often the case in the pandas / python ecosystem, if you can write code in a vectorized way, without using python in the tight loops, it's fast. If you need to run python in each loop, it's much slower. Does that resonate? While I think not the main point here, there might be some optimizations on
```
1077 function calls (1066 primitive calls) in 0.002 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
6 0.000 0.000 0.000 0.000 coordinates.py:169(<genexpr>)
13 0.000 0.000 0.000 0.000 collections.py:50(__init__)
14 0.000 0.000 0.000 0.000 _abcoll.py:548(update)
33 0.000 0.000 0.000 0.000 _weakrefset.py:70(__contains__)
2 0.000 0.000 0.001 0.000 dataset.py:881(_construct_dataarray)
144 0.000 0.000 0.000 0.000 {isinstance}
1 0.000 0.000 0.001 0.001 dataset.py:1496(isel)
18 0.000 0.000 0.000 0.000 {numpy.core.multiarray.array}
3 0.000 0.000 0.000 0.000 dataset.py:92(calculate_dimensions)
13 0.000 0.000 0.000 0.000 abc.py:128(__instancecheck__)
36 0.000 0.000 0.000 0.000 common.py:183(__setattr__)
2 0.000 0.000 0.000 0.000 coordinates.py:167(variables)
2 0.000 0.000 0.000 0.000 {method 'get_loc' of 'pandas._libs.index.IndexEngine' objects}
26 0.000 0.000 0.000 0.000 variable.py:271(shape)
65 0.000 0.000 0.000 0.000 collections.py:90(__iter__)
5 0.000 0.000 0.000 0.000 variable.py:136(as_compatible_data)
3 0.000 0.000 0.000 0.000 dataarray.py:165(__init__)
2 0.000 0.000 0.000 0.000 indexing.py:1255(__getitem__)
3 0.000 0.000 0.000 0.000 variable.py:880(isel)
14 0.000 0.000 0.000 0.000 collections.py:71(__setitem__)
1 0.000 0.000 0.000 0.000 dataset.py:1414(_validate_indexers)
6 0.000 0.000 0.000 0.000 coordinates.py:38(__iter__)
3 0.000 0.000 0.000 0.000 variable.py:433(_broadcast_indexes)
2 0.000 0.000 0.000 0.000 variable.py:1826(to_index)
3 0.000 0.000 0.000 0.000 dataset.py:636(_construct_direct)
2 0.000 0.000 0.000 0.000 indexing.py:122(convert_label_indexer)
15 0.000 0.000 0.000 0.000 utils.py:306(__init__)
3 0.000 0.000 0.000 0.000 indexing.py:17(expanded_indexer)
28 0.000 0.000 0.000 0.000 collections.py:138(iteritems)
1 0.000 0.000 0.001 0.001 indexing.py:226(remap_label_indexers)
15 0.000 0.000 0.000 0.000 numeric.py:424(asarray)
1 0.000 0.000 0.001 0.001 indexing.py:193(get_dim_indexers)
80/70 0.000 0.000 0.000 0.000 {len}
```
reactions: { "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
issue: DataArray.sel extremely slow 365678022
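
The "For example:" in the comment above refers to timing output that did not survive this export, and the trace shown is pstats-style profiler output for a single selection. As a minimal sketch of the comparison being made, with an invented array and a made-up dimension name `x` (nothing below comes from the original thread): one `.sel` per item in a Python loop pays the fixed indexing overhead on every iteration, while a single `.sel` with a list of labels pays it once.

```
import cProfile

import numpy as np
import xarray as xr

# Invented data purely for illustration.
da = xr.DataArray(
    np.random.rand(100_000),
    dims=["x"],
    coords={"x": np.arange(100_000)},
)
labels = list(np.random.choice(da["x"].values, size=1_000, replace=False))

# Per-item selection: the Python loop pays xarray's indexing overhead
# (roughly the ~1000 function calls in the profile above) on every element.
looped = [da.sel(x=label).item() for label in labels]

# Vectorized selection: one .sel call with a list of labels, so the overhead
# is paid once and the cost is nearly independent of len(labels).
vectorized = da.sel(x=labels).values

assert np.allclose(looped, vectorized)

# Something along these lines produces the kind of trace shown in the comment.
cProfile.runctx("da.sel(x=labels[0])", globals(), locals(), sort="tottime")
```

The absolute timings depend on machine and xarray version; the sketch only illustrates the two access patterns the comment contrasts.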
```
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```
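
For reference, the filter this page describes (author_association = "MEMBER", issue = 365678022, newest update first) maps onto a plain SQL query against that table. A short sketch using Python's sqlite3 module follows; the database file name `github.db` is an assumption, since the export does not name the underlying SQLite file.

```
import sqlite3

# Assumed database file name; the export does not say what the SQLite file is called.
conn = sqlite3.connect("github.db")

# Same filter as the page header: MEMBER comments on issue 365678022,
# sorted by updated_at descending.
rows = conn.execute(
    """
    SELECT id, user, created_at, updated_at, body
    FROM issue_comments
    WHERE author_association = ?
      AND issue = ?
    ORDER BY updated_at DESC
    """,
    ("MEMBER", 365678022),
).fetchall()

for comment_id, user_id, created_at, updated_at, body in rows:
    print(comment_id, user_id, updated_at, body[:72])

conn.close()
```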