
issue_comments


5 rows where author_association = "MEMBER" and issue = 377075253 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
435738466 https://github.com/pydata/xarray/pull/2538#issuecomment-435738466 https://api.github.com/repos/pydata/xarray/issues/2538 MDEyOklzc3VlQ29tbWVudDQzNTczODQ2Ng== jhamman 2443309 2018-11-05T02:39:50Z 2018-11-05T02:39:50Z MEMBER

@shoyer - I think I was tracking with you. I've gone ahead and deprecated the current load_dataset in favor of the open_dataset name. The switch is accompanied by a change in behavior as well.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stop loading tutorial data by default 377075253
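
The deprecation described in the comment above (keeping load_dataset as a thin wrapper while open_dataset becomes the preferred name) could look roughly like the following. This is only a sketch of the general pattern, not the code from the pull request: FakeDataset, its loaded flag, and the warning text are hypothetical stand-ins, and the real functions take more arguments.

```python
import warnings


class FakeDataset:
    """Hypothetical stand-in for a lazily opened dataset (not an xarray class)."""

    def __init__(self, name):
        self.name = name
        self.loaded = False

    def load(self):
        # Pull the data into memory and return self so calls can be chained.
        self.loaded = True
        return self


def open_dataset(name, **kwargs):
    """Preferred entry point in this sketch: returns a lazy dataset, nothing is read yet."""
    return FakeDataset(name)


def load_dataset(name, **kwargs):
    """Deprecated entry point: warns, then reproduces the old eager behavior."""
    warnings.warn(
        "load_dataset is deprecated; use open_dataset(...).load() instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    return open_dataset(name, **kwargs).load()
```

Callers of the old name keep working and see a DeprecationWarning, while callers of the new name get the lazy behavior and decide for themselves when to call load().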
435732988 https://github.com/pydata/xarray/pull/2538#issuecomment-435732988 https://api.github.com/repos/pydata/xarray/issues/2538 MDEyOklzc3VlQ29tbWVudDQzNTczMjk4OA== shoyer 1217238 2018-11-05T01:59:34Z 2018-11-05T01:59:34Z MEMBER

> The default behavior should cache the arrays loaded with NumPy anyways.

Sorry, to be clear, what I meant here is that by default arrays loaded with NumPy get cached after the first access/operation. Not that we need to preserve the existing behavior of load_dataset().

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stop loading tutorial data by default 377075253
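
The "cached after the first access" behavior mentioned in the comment above can be illustrated with a toy wrapper. This is a minimal sketch of the caching idea only, assuming a hypothetical LazyArray class and a disk_reads counter added for visibility; it is not how xarray implements its caching.

```python
class LazyArray:
    """Hypothetical wrapper: reads data on first access, then serves a cached copy."""

    def __init__(self, read_from_disk):
        self._read_from_disk = read_from_disk  # callable that does the expensive read
        self._cache = None
        self.disk_reads = 0  # counter kept only to make the caching visible

    @property
    def values(self):
        # Only the first access triggers a read; later accesses reuse the cache.
        if self._cache is None:
            self.disk_reads += 1
            self._cache = self._read_from_disk()
        return self._cache


arr = LazyArray(lambda: [1.0, 2.0, 3.0])
first = arr.values   # triggers the single read
second = arr.values  # served from the cache
```

With this pattern, opening lazily costs little for small datasets: the data ends up in memory after the first operation either way.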
435688958 https://github.com/pydata/xarray/pull/2538#issuecomment-435688958 https://api.github.com/repos/pydata/xarray/issues/2538 MDEyOklzc3VlQ29tbWVudDQzNTY4ODk1OA== shoyer 1217238 2018-11-04T17:29:11Z 2018-11-04T17:29:11Z MEMBER

OK, that seems reasonable. The default behavior should cache the arrays loaded with NumPy anyways. I would not be opposed to renaming this to open_dataset, either.

On Sun, Nov 4, 2018 at 9:19 AM Joe Hamman notifications@github.com wrote:

@shoyer - absolutely, we'll get better performance with numpy arrays in this case. So I'm trying to use our tutorial datasets for some examples with dask (dask/dask-examples#51 https://github.com/dask/dask-examples/pull/51). The docstring for the load_dataset function states that we can pass kwargs on to the open_dataset function, but if we pass chunks to the load_dataset call currently, we still get data back as numpy arrays. We have some other options here:

  1. if chunks is passed as a kwarg, return a dataset with data as persisted dask arrays
  2. provide a second function to handle returning datasets using the same logic as open_dataset (caching, dask arrays, lazy loading, etc.)
  3. tell people (like me) to rechunk the dataset after the fact

(3) won't require any changes but makes it a little harder to connect the typical use pattern of open_dataset with tutorial.load_dataset.


{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stop loading tutorial data by default 377075253
435688104 https://github.com/pydata/xarray/pull/2538#issuecomment-435688104 https://api.github.com/repos/pydata/xarray/issues/2538 MDEyOklzc3VlQ29tbWVudDQzNTY4ODEwNA== jhamman 2443309 2018-11-04T17:19:15Z 2018-11-04T17:19:15Z MEMBER

@shoyer - absolutely, we'll get better performance with numpy arrays in this case. So I'm trying to use our tutorial datasets for some examples with dask (dask/dask-examples#51). The docstring for the load_dataset function states that we can pass kwargs on to the open_dataset function, but if we pass chunks to the load_dataset call currently, we still get data back as numpy arrays. We have some other options here:

  1. if chunks is passed as a kwarg, return a dataset with data as persisted dask arrays
  2. provide a second function to handle returning datasets using the same logic as open_dataset (caching, dask arrays, lazy loading, etc.)
  3. tell people (like me) to rechunk the dataset after the fact

(3) won't require any changes but makes it a little harder to connect the typical use pattern of open_dataset with tutorial.load_dataset.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stop loading tutorial data by default 377075253
435621566 https://github.com/pydata/xarray/pull/2538#issuecomment-435621566 https://api.github.com/repos/pydata/xarray/issues/2538 MDEyOklzc3VlQ29tbWVudDQzNTYyMTU2Ng== shoyer 1217238 2018-11-03T21:17:02Z 2018-11-03T21:17:02Z MEMBER

Our current tutorial datasets are 8MB and 17MB, which is pretty small. You'll definitely get better performance loading datasets of this size into NumPy arrays.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Stop loading tutorial data by default 377075253


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
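
The filter described at the top of this page (author_association = "MEMBER", issue = 377075253, sorted by updated_at descending) can be reproduced against the schema above with Python's built-in sqlite3 module. The following is a sketch under assumptions: it uses an in-memory database with a reduced set of columns, the SELECT statement is a guess at an equivalent query rather than the exact SQL Datasette generates, and the rows are the five ids, user ids, and timestamps listed in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the issue_comments schema: only the columns the query needs.
conn.execute(
    """
    CREATE TABLE issue_comments (
        id INTEGER PRIMARY KEY,
        user INTEGER,
        updated_at TEXT,
        author_association TEXT,
        issue INTEGER
    )
    """
)
conn.execute("CREATE INDEX idx_issue_comments_issue ON issue_comments (issue)")

# The five rows shown on this page, inserted oldest first.
rows_in = [
    (435621566, 1217238, "2018-11-03T21:17:02Z", "MEMBER", 377075253),
    (435688104, 2443309, "2018-11-04T17:19:15Z", "MEMBER", 377075253),
    (435688958, 1217238, "2018-11-04T17:29:11Z", "MEMBER", 377075253),
    (435732988, 1217238, "2018-11-05T01:59:34Z", "MEMBER", 377075253),
    (435738466, 2443309, "2018-11-05T02:39:50Z", "MEMBER", 377075253),
]
conn.executemany("INSERT INTO issue_comments VALUES (?, ?, ?, ?, ?)", rows_in)

# ISO 8601 timestamps stored as TEXT sort chronologically as plain strings.
query = """
    SELECT id, user, updated_at
    FROM issue_comments
    WHERE author_association = ? AND issue = ?
    ORDER BY updated_at DESC
"""
rows = conn.execute(query, ("MEMBER", 377075253)).fetchall()
ids = [row[0] for row in rows]
```

Because updated_at is stored as TEXT in a fixed-width ISO 8601 format, ordering by the string gives the same result as ordering by time, so no date parsing is needed.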