issue_comments

4 rows where author_association = "MEMBER" and issue = 187872991 sorted by updated_at descending

Issue: Convert xarray dataset to dask dataframe or delayed objects (187872991)
id 509773034 · dcherian (2448579) · MEMBER · created 2019-07-09T19:20:51Z · updated 2019-07-09T19:20:51Z
https://github.com/pydata/xarray/issues/1093#issuecomment-509773034
node_id MDEyOklzc3VlQ29tbWVudDUwOTc3MzAzNA== · issue_url https://api.github.com/repos/pydata/xarray/issues/1093

I think this was closed by mistake. Is there a way to split up Dataset chunks into dask delayed objects where each object is a Dataset?

reactions: none
id 259213382 · shoyer (1217238) · MEMBER · created 2016-11-08T18:09:11Z · updated 2016-11-08T18:09:34Z
https://github.com/pydata/xarray/issues/1093#issuecomment-259213382
node_id MDEyOklzc3VlQ29tbWVudDI1OTIxMzM4Mg== · issue_url https://api.github.com/repos/pydata/xarray/issues/1093

The other component that would help for this is some utility function inside xarray to split a Dataset (or DataArray) into sub-datasets for each chunk. Something like:

```python
import itertools

def split_by_chunks(dataset):
    chunk_slices = {}
    for dim, chunks in dataset.chunks.items():
        slices = []
        start = 0
        for chunk in chunks:
            stop = start + chunk
            slices.append(slice(start, stop))
            start = stop
        chunk_slices[dim] = slices
    for slices in itertools.product(*chunk_slices.values()):
        selection = dict(zip(chunk_slices.keys(), slices))
        yield (selection, dataset[selection])
```
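The chunk-slicing logic can be exercised without xarray installed by substituting a minimal stand-in object that exposes a Dataset-like `.chunks` mapping and echoes the selection on indexing (`FakeDataset` and its chunk sizes below are invented for the demo; `split_by_chunks` is repeated so the snippet runs on its own):

```python
import itertools

def split_by_chunks(dataset):
    # Repeated from the comment above so this demo is self-contained.
    chunk_slices = {}
    for dim, chunks in dataset.chunks.items():
        slices = []
        start = 0
        for chunk in chunks:
            stop = start + chunk
            slices.append(slice(start, stop))
            start = stop
        chunk_slices[dim] = slices
    for slices in itertools.product(*chunk_slices.values()):
        selection = dict(zip(chunk_slices.keys(), slices))
        yield (selection, dataset[selection])

class FakeDataset:
    """Invented stand-in: a Dataset-like `.chunks` mapping, and indexing
    that echoes the selection instead of slicing real data."""
    def __init__(self, chunks):
        self.chunks = chunks

    def __getitem__(self, selection):
        return selection

# Two chunks along x (sizes 2 and 2), one chunk along y (size 3).
ds = FakeDataset({"x": (2, 2), "y": (3,)})
pieces = list(split_by_chunks(ds))
print(len(pieces))   # 2
print(pieces[0][0])  # {'x': slice(0, 2, None), 'y': slice(0, 3, None)}
```

Each yielded pair carries the per-dimension slices covering one dask chunk, so downstream code can both locate and extract that chunk.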

reactions: eyes × 1 (total 1)
id 259207151 · shoyer (1217238) · MEMBER · created 2016-11-08T17:46:23Z · updated 2016-11-08T17:46:23Z
https://github.com/pydata/xarray/issues/1093#issuecomment-259207151
node_id MDEyOklzc3VlQ29tbWVudDI1OTIwNzE1MQ== · issue_url https://api.github.com/repos/pydata/xarray/issues/1093

> Can you explain why you think this could benefit from collection duck typing?

Then we could use xarray's normal indexing operations to create new sub-datasets, wrap them with dask.delayed, and start chaining on delayed method calls like `to_dataframe`. The duck typing is necessary so that dask.delayed knows how to pull the dask graph out from the input Dataset.
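The flow described, index out sub-datasets, wrap each one lazily, then chain a method call such as `to_dataframe`, can be sketched as follows. The tiny `Delayed` class is a pure-Python stand-in for `dask.delayed` so the sketch runs without dask, and `SubDataset` is a hypothetical placeholder for an xarray sub-dataset:

```python
class Delayed:
    """Stand-in for dask.delayed: defer a zero-argument computation and
    run it only on .compute(); attribute access defers method calls too."""
    def __init__(self, func):
        self._func = func

    def compute(self):
        return self._func()

    def __getattr__(self, name):
        # Chaining: Delayed(...).to_dataframe() returns another Delayed.
        def method(*args, **kwargs):
            return Delayed(lambda: getattr(self._func(), name)(*args, **kwargs))
        return method

class SubDataset:
    """Hypothetical placeholder for one chunk's worth of data."""
    def __init__(self, rows):
        self.rows = rows

    def to_dataframe(self):
        # A real xarray Dataset would return a pandas DataFrame here.
        return list(self.rows)

pieces = [SubDataset(range(0, 2)), SubDataset(range(2, 4))]
frames = [Delayed(lambda p=p: p).to_dataframe() for p in pieces]
results = [f.compute() for f in frames]
print(results)  # [[0, 1], [2, 3]]
```

Nothing runs until `.compute()` is called, which is the property that lets dask schedule the per-chunk conversions in parallel.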

reactions: none
id 259052436 · shoyer (1217238) · MEMBER · created 2016-11-08T05:55:19Z · updated 2016-11-08T05:55:19Z
https://github.com/pydata/xarray/issues/1093#issuecomment-259052436
node_id MDEyOklzc3VlQ29tbWVudDI1OTA1MjQzNg== · issue_url https://api.github.com/repos/pydata/xarray/issues/1093

CC @mrocklin @jcrist

This is a good use case for dask collection duck typing: https://github.com/dask/dask/pull/1068
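That pull request was an early step toward what dask later standardized as its collection interface: any object exposing a handful of `__dask_*__` hooks can be passed to dask functions, which pull the task graph out of it. A hedged toy sketch (the method names are the real protocol hooks; `ChunkedThing` and its one-task graph are invented for illustration):

```python
class ChunkedThing:
    """Toy object quacking like a dask collection."""
    def __init__(self, value):
        self._key = "chunkedthing-0"
        self._value = value

    def __dask_graph__(self):
        # The task graph dask extracts from any collection-like input:
        # a mapping of key -> task.
        return {self._key: self._value}

    def __dask_keys__(self):
        # The output keys to compute from the graph.
        return [self._key]

    def __dask_postcompute__(self):
        # (finalize_function, extra_args): how to rebuild a concrete
        # result from the computed per-key values.
        return (lambda results: results[0], ())

obj = ChunkedThing(42)
finalize, extra_args = obj.__dask_postcompute__()
print(finalize([42]))  # 42
```

This is why duck typing matters here: xarray objects that implement these hooks can flow into `dask.delayed` without dask knowing anything about xarray.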

reactions: none

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
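The filter this page applies (MEMBER comments on issue 187872991, newest first) maps directly onto that schema. A self-contained check using Python's built-in sqlite3, with one illustrative row taken from the first comment above:

```python
import sqlite3

# Recreate the schema in an in-memory database. SQLite leaves foreign
# keys unenforced by default, so the REFERENCES clauses are harmless
# even though the users/issues tables are absent here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
""")
conn.execute(
    "INSERT INTO issue_comments (id, user, updated_at, author_association, issue)"
    " VALUES (?, ?, ?, ?, ?)",
    (509773034, 2448579, "2019-07-09T19:20:51Z", "MEMBER", 187872991),
)
rows = conn.execute(
    "SELECT id, updated_at FROM issue_comments"
    " WHERE author_association = ? AND issue = ?"
    " ORDER BY updated_at DESC",
    ("MEMBER", 187872991),
).fetchall()
print(rows)  # [(509773034, '2019-07-09T19:20:51Z')]
```

Sorting by `updated_at` works lexicographically because the timestamps are stored as ISO 8601 text, which sorts in chronological order.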
Powered by Datasette · About: xarray-datasette