issue_comments
6 rows where issue = 226549366 sorted by updated_at descending
Issue #1399: `decode_cf_datetime()` slow because `pd.to_timedelta()` is slow if floats are passed
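As a rough illustration of the reported slowdown, the snippet below (not from the issue; array size and timings are made up for illustration) compares integer and float inputs to `pd.to_timedelta()`:

```python
import numpy as np
import pandas as pd

num_dates = np.arange(1_000_000)         # e.g. integer "seconds since ..." offsets
as_floats = num_dates.astype("float64")  # decode_cf_datetime casts inputs to float

# In IPython (illustrative; run %timeit yourself to reproduce):
# %timeit pd.to_timedelta(num_dates, unit="s")  # integer input: fast path
# %timeit pd.to_timedelta(as_floats, unit="s")  # float input: the reported slow path
```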
Each comment below is listed as: id | user | author_association | created_at | html_url. For all six comments, updated_at equals created_at, every reaction count is zero, and performed_via_github_app is empty.
300072972 | cchwala 102827 | CONTRIBUTOR | 2017-05-09T06:26:36Z | https://github.com/pydata/xarray/issues/1399#issuecomment-300072972

Okay. I will try to come up with a PR within the next few days.
299916837 | shoyer 1217238 | MEMBER | 2017-05-08T16:24:50Z | https://github.com/pydata/xarray/issues/1399#issuecomment-299916837

@spencerkclark has been working on a patch to natively support other datetime precisions in xarray (see https://github.com/pydata/xarray/pull/1252).

For better or worse, NumPy's datetime64 ignores leap seconds.

This sounds pretty reasonable to me. The main challenge here will be guarding against integer overflow -- you might need to do the math twice, once with floats (to check for overflow) and then with integers (see the sketch below). You could also experiment with doing the conversion with NumPy instead of pandas, using `datetime64`/`timedelta64` directly.
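A minimal sketch of the "do the math twice" overflow guard suggested above; the helper name and signature are hypothetical, not xarray code:

```python
import numpy as np

def cast_to_int64_ns(num_dates, ns_per_unit):
    # First pass in floats: a cheap, wraparound-free way to detect
    # values that would overflow int64 arithmetic.
    approx = np.asarray(num_dates, dtype="float64") * ns_per_unit
    if np.any(np.abs(approx) > np.iinfo(np.int64).max):
        raise OverflowError("times do not fit into timedelta64[ns]")
    # Second pass in integers: exact nanosecond values.
    return np.asarray(num_dates, dtype="int64") * ns_per_unit
```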
299830748 | fmaussion 10050469 | MEMBER | 2017-05-08T10:26:26Z | https://github.com/pydata/xarray/issues/1399#issuecomment-299830748

Yes, you can ignore my comment!
299819380 | cchwala 102827 | CONTRIBUTOR | 2017-05-08T09:32:58Z | https://github.com/pydata/xarray/issues/1399#issuecomment-299819380

Hmm... the "nanosecond" issue seems to need a fix very much at the foundation; as long as pandas and xarray rely on `datetime64[ns]`, it cannot really be solved here.

An intermediate fix (@shoyer, do you actually want one?) that I could think of for the performance issue right now would be to do the conversion to nanoseconds using integer arithmetic (a sketch follows below).

The only thing that bothers me is that I am not sure whether the "number of nanoseconds" is always the same in every day or hour in the view of `datetime64`. @shoyer: does this sound reasonable, or did I forget to take into account any side effects?
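A sketch of that intermediate fix (the function name, unit table, and API below are assumptions, not the eventual PR): multiply integer time values by a fixed nanoseconds-per-unit factor instead of handing floats to `pd.to_timedelta()`:

```python
import numpy as np
import pandas as pd

# Fixed factors are safe only because NumPy's datetime64 ignores leap
# seconds (see the reply above); calendar units like months cannot go here.
NS_PER_UNIT = {
    "seconds": 10**9,
    "minutes": 60 * 10**9,
    "hours": 3_600 * 10**9,
    "days": 86_400 * 10**9,
}

def decode_integer_times(num_dates, units, reference_date):
    # Pure integer arithmetic: no float round-trip, no pd.to_timedelta.
    ns = np.asarray(num_dates, dtype="int64") * NS_PER_UNIT[units]
    return pd.Timestamp(reference_date).to_datetime64() + ns.astype("timedelta64[ns]")

# decode_integer_times([0, 1, 2], "hours", "2000-01-01")
# -> datetime64[ns] array: 2000-01-01T00:00, T01:00, T02:00
```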
299510444 | shoyer 1217238 | MEMBER | 2017-05-05T16:23:17Z | https://github.com/pydata/xarray/issues/1399#issuecomment-299510444

Good catch! We should definitely speed this up.

Yes, very much agreed. For units such as months or years, we already give the wrong result when we use pandas (illustrated below).

Yes, this might also work. I no longer recall why we cast all inputs to floats (maybe just for consistency), but I suspect that one of our time-conversion libraries (probably netCDF4/netcdftime) expects a float array. Certainly we will still need to support floating-point times saved in netCDF files, which are pretty common in my experience.
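The example originally attached to that comment was not preserved in this export; the following is a hedged reconstruction of the underlying problem, not the original snippet. A calendar month has no fixed length, so any constant "days per month" factor decodes `months since ...` units incorrectly:

```python
import pandas as pd

feb = pd.Timestamp("2000-02-01")
mar = pd.Timestamp("2000-03-01")
# "One month" spans a different number of days depending on the start date:
print((feb + pd.DateOffset(months=1)) - feb)  # 29 days (leap-year February)
print((mar + pd.DateOffset(months=1)) - mar)  # 31 days
```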
299483553 | fmaussion 10050469 | MEMBER | 2017-05-05T14:42:19Z | https://github.com/pydata/xarray/issues/1399#issuecomment-299483553

Hi Christian!

This sounds much less error-prone to me. In particular, I am getting a bit nervous when I hear "nanoseconds" ;-) (see https://github.com/pydata/xarray/issues/789)
Table schema:

```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```
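Given this schema, the view on this page can be reproduced with a query like the following (the database filename is an assumption):

```python
import sqlite3

conn = sqlite3.connect("github.db")  # filename assumed
rows = conn.execute(
    "SELECT id, [user], created_at, body FROM issue_comments "
    "WHERE issue = ? ORDER BY updated_at DESC",
    (226549366,),
).fetchall()
print(len(rows))  # 6
```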