issue_comments

2 rows where issue = 775875024 sorted by updated_at descending


dcherian (MEMBER) · 2020-12-29T22:19:29Z · comment 752261269
https://github.com/pydata/xarray/issues/4739#issuecomment-752261269

> But some time could be saved if we could convert them to dask arrays in xr.Dataset.interp before the variable loop starts.

Now implemented. Runtime has dropped from 5.3s to 2.3s (!)

Reactions: hooray × 2
dcherian (MEMBER) · 2020-12-29T19:13:27Z (edited 2020-12-29T19:33:07Z) · comment 752210306
https://github.com/pydata/xarray/issues/4739#issuecomment-752210306

We don't support lazy index variables yet (#1603), so you can't interpolate to a dask variable.

> But some time could be saved if we could convert them to dask arrays in xr.Dataset.interp before the variable loop starts.

This may be true. I think we could convert x and destination to dask (only once) if any of the variables to be interpolated are dask arrays, and pass that to interp_func here rather than passing IndexVariables through: https://github.com/pydata/xarray/blob/bf0fe2caca1d2ebc4f1298f019758baa12f68b94/xarray/core/missing.py#L641-L643
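The conversion described above could look something like this minimal sketch. It is not xarray's actual implementation: `prepare_destination` and its arguments are invented names for illustration.

```python
# Illustrative sketch only: convert destination coordinates to dask
# exactly once, before the per-variable loop, if any variable to be
# interpolated is dask-backed. Function and variable names are invented.
import dask.array as da
import numpy as np

def prepare_destination(destination, variables):
    # One up-front check and one conversion for the whole Dataset,
    # instead of re-handling index variables inside every loop iteration.
    if any(isinstance(v, da.Array) for v in variables):
        return tuple(da.from_array(np.asarray(d), chunks=-1) for d in destination)
    return tuple(np.asarray(d) for d in destination)

variables = [da.ones((10,), chunks=5), np.ones(10)]
destination = (np.linspace(0.0, 9.0, 50),)
(dest,) = prepare_destination(destination, variables)
print(type(dest))  # a dask array, converted once for all variables
```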

OTOH I found some easier optimizations; see #4740:

  1. Passing meta to blockwise saves 0.5 s in your example.
  2. Another thing we can do is call _localize at the Dataset level rather than within the variable loop. _localize is currently taking 1.65 s, most of which is spent in 4000 calls to get_loc. At the Dataset level this becomes just 2 calls to get_loc.
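The second optimization can be sketched with plain NumPy, using `searchsorted` as a stand-in for pandas `Index.get_loc`; all names here are illustrative, not xarray's internals.

```python
# Illustrative sketch: localize the source grid to the bounding box of
# the destination points with one pair of index lookups for the whole
# Dataset, instead of repeating the lookup for every variable.
import numpy as np

x = np.linspace(0.0, 100.0, 4001)      # shared source coordinate
new_x = np.linspace(40.0, 60.0, 500)   # destination points

# Two lookups total (cf. thousands of per-variable get_loc calls):
lo = max(int(np.searchsorted(x, new_x.min(), side="right")) - 1, 0)
hi = min(int(np.searchsorted(x, new_x.max(), side="left")) + 1, x.size)

x_local = x[lo:hi]                     # slice shared by every variable
print(x_local[0], x_local[-1])
```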
Reactions: none

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
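The page above ("2 rows where issue = 775875024 sorted by updated_at descending") corresponds to a simple query against this schema. A self-contained sketch using an in-memory copy, with only the columns used here populated and foreign keys omitted:

```python
# Self-contained sketch: rebuild a minimal issue_comments table in memory
# and run the query behind this page. The values come from the two rows
# shown above; ISO-8601 timestamps sort correctly as text.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE issue_comments (
           id INTEGER PRIMARY KEY,
           user INTEGER,
           created_at TEXT,
           updated_at TEXT,
           issue INTEGER)"""
)
conn.executemany(
    "INSERT INTO issue_comments (id, user, created_at, updated_at, issue) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (752261269, 2448579, "2020-12-29T22:19:29Z", "2020-12-29T22:19:29Z", 775875024),
        (752210306, 2448579, "2020-12-29T19:13:27Z", "2020-12-29T19:33:07Z", 775875024),
    ],
)
rows = conn.execute(
    "SELECT id FROM issue_comments WHERE issue = 775875024 "
    "ORDER BY updated_at DESC"
).fetchall()
print([r[0] for r in rows])  # → [752261269, 752210306]
```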
Powered by Datasette · Queries took 12.401ms · About: xarray-datasette