issue_comments

2 rows where author_association = "CONTRIBUTOR" and issue = 226549366 sorted by updated_at descending

id: 300072972
html_url: https://github.com/pydata/xarray/issues/1399#issuecomment-300072972
issue_url: https://api.github.com/repos/pydata/xarray/issues/1399
node_id: MDEyOklzc3VlQ29tbWVudDMwMDA3Mjk3Mg==
user: cchwala (102827)
created_at: 2017-05-09T06:26:36Z
updated_at: 2017-05-09T06:26:36Z
author_association: CONTRIBUTOR
body:

Okay. I will try to come up with a PR within the next few days.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: `decode_cf_datetime()` slow because `pd.to_timedelta()` is slow if floats are passed (226549366)
id: 299819380
html_url: https://github.com/pydata/xarray/issues/1399#issuecomment-299819380
issue_url: https://api.github.com/repos/pydata/xarray/issues/1399
node_id: MDEyOklzc3VlQ29tbWVudDI5OTgxOTM4MA==
user: cchwala (102827)
created_at: 2017-05-08T09:32:58Z
updated_at: 2017-05-08T09:32:58Z
author_association: CONTRIBUTOR
body:

Hmm... The "nanosecond" issue seems to need a fix at the very foundation. As long as pandas and xarray rely on datetime64[ns], you cannot avoid nanoseconds, right? pd.to_datetime() forces the conversion to nanoseconds even if you pass integers with a time unit other than ns. This does not make me as nervous as it makes Fabien, since my data is always quite recent, but I can see that this is far from ideal for a tool aimed at climate scientists.
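(For illustration, a minimal snippet of that forced nanosecond resolution; this is just a sketch of pandas' default behaviour, not part of the original comment:)

import pandas as pd

# Integer inputs with a coarse unit still come back at nanosecond resolution.
idx = pd.to_datetime([1, 2, 3], unit="D")  # days since the Unix epoch
print(idx.dtype)  # datetime64[ns]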

An intermediate fix (@shoyer, do you actually want one?) that I can think of for the performance issue right now would be to make the conversion to datetime64[ns] depend on the time unit, e.g.

  • multiply the raw values (most likely floats) by the number of nanoseconds in the time unit for units smaller than days (or hours?) and pass these values as integers to pd.to_datetime() (see the sketch after this list)
  • otherwise, fall back to using netCDF4/netcdftime for months and years (as suggested by @shoyer), casting the raw values to floats
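A minimal sketch of the integer-nanosecond idea from the first bullet; the function name, signature, and unit table are illustrative assumptions, not xarray's actual code, and it uses pd.to_timedelta plus a reference timestamp rather than pd.to_datetime:

import numpy as np
import pandas as pd

# Assumed mapping from CF time units to nanoseconds (illustrative only).
NS_PER_UNIT = {
    "seconds": 1_000_000_000,
    "minutes": 60 * 1_000_000_000,
    "hours": 3_600 * 1_000_000_000,
    "days": 86_400 * 1_000_000_000,
}

def decode_via_integer_nanoseconds(raw_values, unit, reference_date):
    """Decode CF-style time offsets (possibly floats) into datetime64[ns]
    by scaling to integer nanoseconds before handing them to pandas."""
    # Scale floats to nanoseconds and cast to int64 so pandas takes the fast
    # integer path instead of the slow float path.
    # (float64 precision limits the representable offset range; fine for a sketch.)
    offsets_ns = (np.asarray(raw_values, dtype="float64") * NS_PER_UNIT[unit]).astype("int64")
    return pd.Timestamp(reference_date) + pd.to_timedelta(offsets_ns, unit="ns")

# Example: fractional hours since a reference date.
times = decode_via_integer_nanoseconds([0.5, 1.0, 1.5], "hours", "2017-05-08")
print(times)  # DatetimeIndex(['2017-05-08 00:30:00', '2017-05-08 01:00:00', '2017-05-08 01:30:00'], dtype='datetime64[ns]')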

The only thing that bothers me is that I am not sure whether the "number of nanoseconds" is always the same in every day or hour from datetime64's point of view, due to leap seconds or other particularities.

@shoyer: Does this sound reasonable or did I forget to take into account any side effects?

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: `decode_cf_datetime()` slow because `pd.to_timedelta()` is slow if floats are passed (226549366)

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
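As a usage sketch, the two rows shown above could be pulled straight out of this table with Python's sqlite3 module (the database filename below is an assumption):

import sqlite3

# "github.db" is a hypothetical name for the SQLite file behind this page.
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    select id, user, created_at, author_association
    from issue_comments
    where author_association = 'CONTRIBUTOR' and issue = ?
    order by updated_at desc
    """,
    (226549366,),
).fetchall()
for row in rows:
    print(row)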