issue_comments

10 rows where author_association = "CONTRIBUTOR" and user = 6181563 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
373806224 https://github.com/pydata/xarray/issues/1981#issuecomment-373806224 https://api.github.com/repos/pydata/xarray/issues/1981 MDEyOklzc3VlQ29tbWVudDM3MzgwNjIyNA== jmunroe 6181563 2018-03-16T18:34:19Z 2018-03-16T18:34:19Z CONTRIBUTOR

distributed

{"total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  use dask to open datasets in parallel 304201107
373794415 https://github.com/pydata/xarray/issues/1981#issuecomment-373794415 https://api.github.com/repos/pydata/xarray/issues/1981 MDEyOklzc3VlQ29tbWVudDM3Mzc5NDQxNQ== jmunroe 6181563 2018-03-16T17:53:44Z 2018-03-16T17:53:44Z CONTRIBUTOR

For what it's worth, this is exactly the workflow I use (https://github.com/OceansAus/cosima-cookbook) when opening a large number of netCDF files:

    import dask.bag
    import xarray as xr

    # ncfiles, chunks, and variables are defined elsewhere in the cookbook
    bag = dask.bag.from_sequence(ncfiles)

    def load_variable(ncfile):
        return xr.open_dataset(ncfile,
                               chunks=chunks,
                               decode_times=False)[variables]

    bag = bag.map(load_variable)

    dataarrays = bag.compute()

and then

    dataarray = xr.concat(dataarrays,
                          dim='time', coords='all')

and it appears to work well.

Code snippets from cosima-cookbook/cosima_cookbook/netcdf_index.py

{"total_count": 3, "+1": 2, "-1": 0, "laugh": 0, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  use dask to open datasets in parallel 304201107
340127569 https://github.com/pydata/xarray/pull/1489#issuecomment-340127569 https://api.github.com/repos/pydata/xarray/issues/1489 MDEyOklzc3VlQ29tbWVudDM0MDEyNzU2OQ== jmunroe 6181563 2017-10-28T00:46:58Z 2017-10-28T00:46:58Z CONTRIBUTOR

@shoyer Sounds good. Thanks.

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  lazily load dask arrays to dask data frames by calling to_dask_dataframe  245624267
338118973 https://github.com/pydata/xarray/pull/1489#issuecomment-338118973 https://api.github.com/repos/pydata/xarray/issues/1489 MDEyOklzc3VlQ29tbWVudDMzODExODk3Mw== jmunroe 6181563 2017-10-20T06:36:43Z 2017-10-20T06:36:43Z CONTRIBUTOR

I don't understand how only one test (TestDataArrayAndDataset::test_to_dask_dataframe_2D) can pass on TravisCI yet fail on Appveyor.

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  lazily load dask arrays to dask data frames by calling to_dask_dataframe  245624267
335719811 https://github.com/pydata/xarray/pull/1489#issuecomment-335719811 https://api.github.com/repos/pydata/xarray/issues/1489 MDEyOklzc3VlQ29tbWVudDMzNTcxOTgxMQ== jmunroe 6181563 2017-10-11T07:55:44Z 2017-10-11T07:55:44Z CONTRIBUTOR

Hi @shoyer and @jhamman. Thanks for your patience. Please let me know if anything still needs to be done on this PR.

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  lazily load dask arrays to dask data frames by calling to_dask_dataframe  245624267
335316753 https://github.com/pydata/xarray/pull/1489#issuecomment-335316753 https://api.github.com/repos/pydata/xarray/issues/1489 MDEyOklzc3VlQ29tbWVudDMzNTMxNjc1Mw== jmunroe 6181563 2017-10-09T23:28:45Z 2017-10-09T23:28:45Z CONTRIBUTOR

Hi @jhamman. Thanks for the nudge. I'll look at this again today and either a) just get it done or b) ask for help where needed.

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  lazily load dask arrays to dask data frames by calling to_dask_dataframe  245624267
327073701 https://github.com/pydata/xarray/pull/1489#issuecomment-327073701 https://api.github.com/repos/pydata/xarray/issues/1489 MDEyOklzc3VlQ29tbWVudDMyNzA3MzcwMQ== jmunroe 6181563 2017-09-05T05:19:08Z 2017-09-05T05:19:08Z CONTRIBUTOR

Sorry for the delay. I think this task is now complete.

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  lazily load dask arrays to dask data frames by calling to_dask_dataframe  245624267
322022140 https://github.com/pydata/xarray/pull/1489#issuecomment-322022140 https://api.github.com/repos/pydata/xarray/issues/1489 MDEyOklzc3VlQ29tbWVudDMyMjAyMjE0MA== jmunroe 6181563 2017-08-13T05:04:11Z 2017-08-13T05:04:11Z CONTRIBUTOR

I agree that using dask.dataframe.from_array and dask.dataframe.concat should work. Sorry I haven't had a chance to get back to this recently. I'll try to make the change early next week.
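
For readers following along, here is a minimal, self-contained sketch of that approach; the arrays, sizes, and column names are illustrative assumptions, not code from this PR:

    import numpy as np
    import dask.dataframe as dd

    # two in-memory arrays standing in for a Dataset's variables
    a = np.arange(1_000_000)
    b = np.arange(1_000_000) ** 2

    # dask.dataframe.from_array slices each array into a lazy, named Series
    sa = dd.from_array(a, chunksize=100_000, columns='a')
    sb = dd.from_array(b, chunksize=100_000, columns='b')

    # dask.dataframe.concat along axis=1 stitches the columns into one
    # lazy DataFrame; nothing is computed until .head() or .compute()
    df = dd.concat([sa, sb], axis=1)
    print(df.head())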

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  lazily load dask arrays to dask data frames by calling to_dask_dataframe  245624267
318246117 https://github.com/pydata/xarray/pull/1489#issuecomment-318246117 https://api.github.com/repos/pydata/xarray/issues/1489 MDEyOklzc3VlQ29tbWVudDMxODI0NjExNw== jmunroe 6181563 2017-07-27T03:08:35Z 2017-07-27T03:08:35Z CONTRIBUTOR

After working on this for a little while, I agree that this really should be a to_dask_dataframe() method. I'll make that change.
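
For context, a minimal usage sketch of the to_dask_dataframe() method discussed here; the toy Dataset is an assumption for illustration:

    import numpy as np
    import xarray as xr

    # a small dask-backed Dataset; chunking keeps the variable lazy
    ds = xr.Dataset(
        {'temp': ('time', np.arange(10.0))},
        coords={'time': np.arange(10)},
    ).chunk({'time': 5})

    ddf = ds.to_dask_dataframe()  # a lazy dask DataFrame; nothing loaded yet
    print(ddf.compute())          # materializes to pandas only on demand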

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  lazily load dask arrays to dask data frames by calling to_dask_dataframe  245624267
317917228 https://github.com/pydata/xarray/issues/1462#issuecomment-317917228 https://api.github.com/repos/pydata/xarray/issues/1462 MDEyOklzc3VlQ29tbWVudDMxNzkxNzIyOA== jmunroe 6181563 2017-07-26T01:09:05Z 2017-07-26T01:09:05Z CONTRIBUTOR

Today, I find myself in need of exactly this functionality. Assuming no one else is working on it, I'll take a shot at fixing this.
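
For reference, a minimal sketch of the behaviour this issue reports; the toy Dataset and chunk sizes are assumptions:

    import numpy as np
    import xarray as xr

    # a dask-backed Dataset
    ds = xr.Dataset({'x': ('t', np.arange(1_000_000.0))}).chunk({'t': 100_000})

    # Dataset.to_dataframe() eagerly computes every dask array and returns
    # an in-memory pandas.DataFrame
    df = ds.to_dataframe()
    print(type(df))  # <class 'pandas.core.frame.DataFrame'>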

{"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
  Dataset.to_dataframe loads dask arrays into memory 237710101

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
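
Given the schema above, a sketch of reproducing this page's filter directly against the underlying SQLite database; the 'github.db' filename is an assumption:

    import sqlite3

    conn = sqlite3.connect('github.db')  # assumed database filename
    rows = conn.execute(
        """
        SELECT id, created_at, body
        FROM issue_comments
        WHERE author_association = ? AND [user] = ?
        ORDER BY updated_at DESC
        """,
        ('CONTRIBUTOR', 6181563),
    ).fetchall()
    print(len(rows))  # 10, per the row count reported above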