
issue_comments


1 row where author_association = "MEMBER", issue = 274797981 and user = 4160723, sorted by updated_at descending

id: 345394914
html_url: https://github.com/pydata/xarray/issues/1725#issuecomment-345394914
issue_url: https://api.github.com/repos/pydata/xarray/issues/1725
node_id: MDEyOklzc3VlQ29tbWVudDM0NTM5NDkxNA==
user: benbovy (4160723)
created_at: 2017-11-17T23:39:33Z
updated_at: 2017-11-17T23:39:33Z
author_association: MEMBER
body:

I'm more of a numpy-xarray user than a dask-xarray user (since most often my data fits in memory), but I wouldn't mind at all having to install dask as a requirement!

Potentially, chunks=False in open_dataset could indicate that you're OK with loading everything into memory with NumPy. We would then have to choose whether the default uses Dask or NumPy.
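
For illustration, a minimal sketch of how that might look; chunks=False is the hypothetical flag discussed here, not an existing xarray option, and the file name and chunk sizes are placeholders:

import xarray as xr

# Hypothetical: chunks=False would mean "load everything eagerly as NumPy".
ds_numpy = xr.open_dataset("data.nc", chunks=False)

# Existing behaviour: a chunks mapping wraps variables in dask arrays,
# here with 100-step chunks along the "time" dimension.
ds_dask = xr.open_dataset("data.nc", chunks={"time": 100})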

Maybe, like other users who are used to lazy loading, I'm a bit more concerned about this. I find it so handy to be able to load a medium-sized file instantly, quickly inspect its contents, and then work with only a small subset of the variables / data, all without worrying about chunks.
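
As a concrete example of that workflow (file and variable names are placeholders), the current lazy loading allows something like:

import xarray as xr

ds = xr.open_dataset("medium_sized_file.nc")  # returns almost instantly
print(ds)                                     # inspect dims, variables, attrs
subset = ds["sea_surface_temp"].isel(time=0)  # still lazy, nothing read yet
values = subset.values                        # data is read from disk only here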

Assuming that numpy-loading is the default, new xarray users coming from netcdf4-python who don't know much about dask might find xarray a very inefficient tool when first importing a medium-sized netCDF file.

If chunks=False were the default, I can also imagine often forgetting to set it to True when loading a file that is a bit too big to fit in memory... That would be annoying.

By saying "making the default use Dask", do you mean that data from a file would be "loaded" as dask arrays by default? If so, new xarray users, who are probably less familiar with dask than with numpy, will have to learn one or two dask concepts before using xarray. This might not be a big deal, though.
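
For context, the main new concept is deferred evaluation: a dask array builds a task graph and only computes when asked. A minimal sketch with illustrative shapes and chunk sizes:

import dask.array as da

arr = da.ones((1000, 1000), chunks=(250, 250))  # lazy, chunked array
mean = arr.mean()                               # builds a task graph only
result = mean.compute()                         # runs the graph; returns a NumPy scalar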

In summary, I'm also really not opposed to using dask to replace all of the current lazy-loading machinery, but ideally it should be as transparent as possible with respect to the current "user experience".

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: Switch our lazy array classes to use Dask instead? (274797981)


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
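
For reference, a minimal sketch of the filter behind the "1 row where ..." line above, run with Python's sqlite3 module; the database file name github.db is an assumption about a local copy:

import sqlite3

conn = sqlite3.connect("github.db")  # assumed local copy of this database
rows = conn.execute(
    """
    SELECT id, [user], created_at, updated_at, author_association, body
    FROM issue_comments
    WHERE author_association = 'MEMBER'
      AND issue = 274797981
      AND [user] = 4160723
    ORDER BY updated_at DESC
    """
).fetchall()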