issue_comments

3 rows where author_association = "MEMBER", issue = 1247010680 and user = 1217238 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1137839614 https://github.com/pydata/xarray/issues/6633#issuecomment-1137839614 https://api.github.com/repos/pydata/xarray/issues/6633 IC_kwDOAMm_X85D0g3- shoyer 1217238 2022-05-25T20:55:14Z 2022-05-25T20:55:14Z MEMBER

This mur-sst dataset in particular stores time in chunks of size 5. That means fetching the 6443 time values requires about 1,289 separate HTTP requests -- no wonder it's so slow! If the time axis were instead stored in a single chunk of 51 KB, Xarray would only need 3 small HTTP requests to load the lat, lon and time indexes, which would probably complete in a fraction of a second.

That said, I agree that this would be nice to have in general.
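The request count and chunk size quoted above can be checked with a few lines of arithmetic. This is only a sketch: the 8 bytes per time value is an assumption (a 64-bit dtype), and compression is ignored.

```python
import math

n_time = 6443        # number of time values quoted in the comment
chunk_len = 5        # chunk length along time quoted in the comment
bytes_per_value = 8  # assumption: 64-bit time values, uncompressed

# One request per chunk; the last chunk is partial, hence ceiling division.
n_requests = math.ceil(n_time / chunk_len)

# Size of the whole time axis if it were stored as a single chunk.
single_chunk_kb = n_time * bytes_per_value / 1000

print(n_requests, single_chunk_kb)
```

Under these assumptions the count comes out at 1289 chunks and the single-chunk size at roughly 51.5 KB, in line with the figures in the comment.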

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Opening dataset without loading any indexes? 1247010680
1137754031 https://github.com/pydata/xarray/issues/6633#issuecomment-1137754031 https://api.github.com/repos/pydata/xarray/issues/6633 IC_kwDOAMm_X85D0L-v shoyer 1217238 2022-05-25T19:12:40Z 2022-05-25T19:12:40Z MEMBER

> but another option (post explicit index refactor) might be an option for opening a dataset without creating indexes for 1D coordinates along dimensions.

> It might indeed be worth considering this case too in #6392. Maybe indexes=None (default) to create default indexes for 1D coordinates and indexes={} (empty dictionary) to explicitly skip creating indexes?

+1 this syntax makes sense to me!
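To make the proposed distinction concrete, here is a toy sketch of how an argument could treat indexes=None and indexes={} differently. The helper `resolve_indexes` and its dictionary-based inputs are hypothetical illustrations, not xarray's API or the design that was eventually adopted.

```python
def resolve_indexes(coords, indexes=None):
    """Hypothetical helper, not part of xarray.

    coords maps a coordinate name to its tuple of dimension names.
    indexes is None (build defaults) or a dict of explicit indexes.
    """
    if indexes is None:
        # Default: index every 1D coordinate named after its own dimension.
        return {name: "default" for name, dims in coords.items() if dims == (name,)}
    # Explicit dict: use exactly what was passed, so {} creates no indexes.
    return dict(indexes)


coords = {
    "time": ("time",),
    "lat": ("lat",),
    "lon": ("lon",),
    "mask": ("lat", "lon"),  # 2D coordinate, never indexed by default
}

default_indexes = resolve_indexes(coords)
no_indexes = resolve_indexes(coords, indexes={})
print(sorted(default_indexes), no_indexes)
```

The check has to be `indexes is None` rather than `not indexes`, because an empty dictionary is falsy and would otherwise be indistinguishable from the default.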

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Opening dataset without loading any indexes? 1247010680
1137572812 https://github.com/pydata/xarray/issues/6633#issuecomment-1137572812 https://api.github.com/repos/pydata/xarray/issues/6633 IC_kwDOAMm_X85DzfvM shoyer 1217238 2022-05-25T17:10:04Z 2022-05-25T17:10:04Z MEMBER

Early versions of Xarray used to have lazy loading of data for indexes, but we removed this for the sake of simplicity. In principle we could restore lazy indexes, but another option (post explicit index refactor) might be an option for opening a dataset without creating indexes for 1D coordinates along dimensions.

Another way to solve this sort of challenge might be to load index data in parallel when using Dask. Right now I believe the data corresponding to indexes is always loaded eagerly, without using Dask.

All that said, do you have a specific example where this has been problematic? In my experience it has been pretty reasonable to use xarray.Dataset objects for schema-like templates, even with index data needing to be loaded eagerly. Possibly another Zarr chunking scheme for your index data could be more efficient?
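The idea of loading index data in parallel can be sketched without Dask by using a thread pool from the standard library. This swaps in `concurrent.futures` for Dask, and `fetch_chunk` is a hypothetical stand-in for one HTTP request; the sizes (6443 values in chunks of 5) are borrowed from the mur-sst example in this thread purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

N_TOTAL = 6443   # illustrative: number of index values
CHUNK_LEN = 5    # illustrative: values per stored chunk


def fetch_chunk(i):
    """Hypothetical stand-in for one HTTP request returning one chunk."""
    start = i * CHUNK_LEN
    return list(range(start, min(start + CHUNK_LEN, N_TOTAL)))


n_chunks = -(-N_TOTAL // CHUNK_LEN)  # ceiling division

# Issue the per-chunk requests concurrently; map() preserves chunk order.
with ThreadPoolExecutor(max_workers=32) as pool:
    chunks = list(pool.map(fetch_chunk, range(n_chunks)))

# Concatenate the chunks back into one index array.
time_index = [value for chunk in chunks for value in chunk]
print(len(time_index))
```

Concurrency hides per-request latency but does not reduce the number of requests, so a coarser chunking scheme for index data may still be the simpler fix.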

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Opening dataset without loading any indexes? 1247010680

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
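As a rough illustration of how the filter at the top of this page maps onto the schema, the following sketch builds an in-memory SQLite table and runs an equivalent query. The table is trimmed to the columns the filter touches (an assumption made for brevity), the rows use the ids and timestamps shown above, and the exact SQL that Datasette generates may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed copy of the schema: only the columns used by the filter and sort.
conn.execute(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [author_association] TEXT,
       [issue] INTEGER
    )
    """
)
conn.execute("CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue])")

# The three comments listed on this page, inserted oldest first.
rows = [
    (1137572812, 1217238, "2022-05-25T17:10:04Z", "MEMBER", 1247010680),
    (1137754031, 1217238, "2022-05-25T19:12:40Z", "MEMBER", 1247010680),
    (1137839614, 1217238, "2022-05-25T20:55:14Z", "MEMBER", 1247010680),
]
conn.executemany("INSERT INTO issue_comments VALUES (?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps sort correctly as text, so ORDER BY works on TEXT columns.
query = """
    SELECT id FROM issue_comments
    WHERE author_association = ? AND issue = ? AND user = ?
    ORDER BY updated_at DESC
"""
ids = [r[0] for r in conn.execute(query, ("MEMBER", 1247010680, 1217238))]
print(ids)
```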
Powered by Datasette · About: xarray-datasette