issue_comments
2 rows where issue = 592312709 and user = 1217238, sorted by updated_at descending
id: 609472798
html_url: https://github.com/pydata/xarray/pull/3925#issuecomment-609472798
issue_url: https://api.github.com/repos/pydata/xarray/issues/3925
node_id: MDEyOklzc3VlQ29tbWVudDYwOTQ3Mjc5OA==
user: shoyer (1217238)
created_at: 2020-04-05T19:53:44Z
updated_at: 2020-04-05T19:53:44Z
author_association: MEMBER
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
performed_via_github_app:
issue: sel along 1D non-index coordinates (592312709)
body:

Related to my microbenchmark, it might also be worth considering pure NumPy versions of common indexing operations, to avoid the need to repeatedly create hash tables. But that could be quite a bit of work to do comprehensively.
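As a rough illustration of what such a pure-NumPy path could look like, here is a minimal sketch that answers repeated lookups with np.argsort plus np.searchsorted (binary search) instead of a hash table. lookup_sorted is a hypothetical name, not part of xarray or pandas, and the sketch assumes every needle is actually present in the haystack.

```
# A hedged sketch of a hash-table-free lookup: pay for one O(n log n)
# sort, then answer each query in O(log n) with binary search.
# `lookup_sorted` is hypothetical, not existing xarray/pandas API.
import numpy as np

def lookup_sorted(needle, haystack, order):
    # `order` is np.argsort(haystack), computed once and reused.
    # searchsorted finds the needle's rank among the sorted values;
    # `order` maps that rank back to the needle's position in the
    # original array. Assumes `needle` is present in `haystack`.
    return order[np.searchsorted(haystack, needle, sorter=order)]

haystack = np.random.permutation(np.arange(1_000_000))
order = np.argsort(haystack)  # one-time cost, shared across lookups
assert haystack[lookup_sorted(42, haystack, order)] == 42
```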
id: 609471635
html_url: https://github.com/pydata/xarray/pull/3925#issuecomment-609471635
issue_url: https://api.github.com/repos/pydata/xarray/issues/3925
node_id: MDEyOklzc3VlQ29tbWVudDYwOTQ3MTYzNQ==
user: shoyer (1217238)
created_at: 2020-04-05T19:45:07Z
updated_at: 2020-04-05T19:45:07Z
author_association: MEMBER
reactions: { "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
performed_via_github_app:
issue: sel along 1D non-index coordinates (592312709)
body:

I think this is generally a good idea!

In the future, creating an …

One minor concern I have here is about efficiency: building an index from scratch is far more expensive than reusing a pre-existing one. Here's a microbenchmark that hopefully illustrates the issue:

```
import pandas as pd
import numpy as np

def lookup_preindexed(needle, index):
    return index.get_loc(needle)

def lookup_newindex(needle, haystack):
    return lookup_preindexed(needle, pd.Index(haystack))

def lookup_numpy(needle, haystack):
    return (haystack == needle).argmax()

haystack = np.random.permutation(np.arange(1000000))
index = pd.Index(haystack)

%timeit lookup_newindex(0, haystack)    # 56.1 ms per loop
%timeit lookup_preindexed(0, index)     # 696 ns per loop
%timeit lookup_numpy(0, haystack)       # 517 µs per loop
```

pandas is 1000x faster than NumPy if the index is pre-existing, but 100x slower if the index is new. That's a 1e5-fold slow-down!

I think users will appreciate the flexibility, but if there's some way we can warn users that they really should set the index ahead of time when they are doing repeated indexing, that would also be welcome. Figuring out how to save the state for counting the number of times a new index is created could be pretty messy, though. I guess we could stuff it into …
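The "set the index ahead of time" advice above can also be mechanized. Here is a hedged sketch that memoizes the pd.Index built for an array, so only the first lookup pays the roughly 56 ms construction cost; _index_cache and cached_lookup are hypothetical names, not xarray API, and the id()-based cache key sidesteps exactly the messy invalidation state the comment worries about.

```
# A minimal memoization sketch, assuming the haystack array is not
# mutated or garbage-collected while cached. `_index_cache` and
# `cached_lookup` are hypothetical, not existing xarray/pandas API.
import numpy as np
import pandas as pd

_index_cache = {}  # id(array) -> pd.Index

def cached_lookup(needle, haystack):
    # Build the pd.Index once per array object; every later lookup
    # reuses the cached hash table via Index.get_loc.
    index = _index_cache.get(id(haystack))
    if index is None:
        index = pd.Index(haystack)
        _index_cache[id(haystack)] = index
    return index.get_loc(needle)

haystack = np.random.permutation(np.arange(1_000_000))
cached_lookup(0, haystack)  # slow: builds and caches the index
cached_lookup(1, haystack)  # fast: reuses the cached index
```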
```
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```
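For context, a minimal sketch of reproducing the query behind this page against the schema above, using Python's sqlite3 module. The database filename github.db is an assumption; the filter and sort come directly from the page description ("issue = 592312709 and user = 1217238, sorted by updated_at descending").

```
# Hedged reconstruction of this page's query; "github.db" is assumed
# to be a local copy of the SQLite database with the schema above.
import sqlite3

conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT [id], [created_at], [body]
    FROM [issue_comments]
    WHERE [issue] = 592312709 AND [user] = 1217238
    ORDER BY [updated_at] DESC
    """
).fetchall()
for comment_id, created_at, body in rows:
    print(comment_id, created_at, body[:60])
```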