home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

9 rows where author_association = "MEMBER" and issue = 588112617 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 4

  • shoyer 3
  • max-sixty 3
  • dcherian 2
  • keewis 1

issue 1

  • Add public API for Dataset._copy_listed · 9 ✖

author_association 1

  • MEMBER · 9 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
676669386 https://github.com/pydata/xarray/issues/3894#issuecomment-676669386 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NjY2OTM4Ng== shoyer 1217238 2020-08-19T20:32:24Z 2020-08-19T20:32:24Z MEMBER

At most, I would require using the new method if you want your code to type-check properly.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
676656104 https://github.com/pydata/xarray/issues/3894#issuecomment-676656104 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NjY1NjEwNA== keewis 14808389 2020-08-19T20:23:53Z 2020-08-19T20:23:53Z MEMBER

IIUC, what we're discussing here is adding a new method that treats all sequences the same (__getitem__ won't be touched, except maybe the type hints).

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
675125293 https://github.com/pydata/xarray/issues/3894#issuecomment-675125293 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NTEyNTI5Mw== max-sixty 5635139 2020-08-17T21:29:22Z 2020-08-17T21:29:22Z MEMBER

A level down, re the name — I thought .vars might be decent — but potentially it's too similar to .variables — which is a mapping to the actual variables (and can't take multiple selections)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
675124770 https://github.com/pydata/xarray/issues/3894#issuecomment-675124770 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NTEyNDc3MA== max-sixty 5635139 2020-08-17T21:27:59Z 2020-08-17T21:27:59Z MEMBER

I do think having a Dataset behave similarly to a dict / Mapping is generally good, and allows new users to bring existing understandings around those data structures to the xarray data model.

I recognize that a hashable iterable (e.g. ('a', 'b', 'c') is an annoying corner case, though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
675065140 https://github.com/pydata/xarray/issues/3894#issuecomment-675065140 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NTA2NTE0MA== dcherian 2448579 2020-08-17T19:20:24Z 2020-08-17T19:20:24Z MEMBER

I think avoiding sel is a good idea (but no strong thoughts here).

how about pick_vars or take_vars?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
675058654 https://github.com/pydata/xarray/issues/3894#issuecomment-675058654 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NTA1ODY1NA== shoyer 1217238 2020-08-17T19:05:47Z 2020-08-17T19:05:47Z MEMBER

It would be better to have an explicit method for subsetting Dataset variables.

I agree. sel_vars is more clear IMO since subset could apply to the coordinates too e.g. a spatial subset.

We did a similar splitting of functionality recently with drop, into drop_vars and drop_sel.

So this would leave us with:

  • sel/drop_sel for indices
  • sel_vars/drop_vars for variables

The naming doesn't have an obvious pattern here, which seems non-ideal. I can't think of anything much better at the moment, but perhaps it would help to avoid reusing sel. Maybe get_vars or subset_vars?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
674440470 https://github.com/pydata/xarray/issues/3894#issuecomment-674440470 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NDQ0MDQ3MA== shoyer 1217238 2020-08-15T19:48:55Z 2020-08-15T19:48:55Z MEMBER

I agree, this API is too overloaded. It would be better to have an explicit method for subsetting Dataset variables. Maybe subset or sel_vars?

In early versions of xarray (back when it was called xray), we actually had a select method but I was concerned it was too confusing with indexing: http://xarray.pydata.org/en/v0.1.1/generated/xray.Dataset.select.html#xray.Dataset.select

What's the reasoning for not returning a Dataset when __getitem__ is passed an Iterable like _copy_listed?

The current check uses hashability to determine whether to try to make a DataArray. In theory, you could put a variable with the name ('a', 'b', 'c') into a Dataset.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
674422560 https://github.com/pydata/xarray/issues/3894#issuecomment-674422560 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDY3NDQyMjU2MA== dcherian 2448579 2020-08-15T16:53:38Z 2020-08-15T16:53:38Z MEMBER

What's the reasoning for not returning a Dataset when __getitem__ is passed an Iterable like _copy_listed?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617
604506561 https://github.com/pydata/xarray/issues/3894#issuecomment-604506561 https://api.github.com/repos/pydata/xarray/issues/3894 MDEyOklzc3VlQ29tbWVudDYwNDUwNjU2MQ== max-sixty 5635139 2020-03-26T15:46:05Z 2020-03-26T15:46:05Z MEMBER

Would that be different from ensuring the input is a list?

Moreover, because Dataset__getitem__ is type unstable,

I very much empathize with the pain from methods being type unstable; indeed I think that's one of the biggest benefits of xarray over pandas. Here, it's stable over the same typed inputs. i.e. if supplied with a list, it returns with a dataset, otherwise it returns a DataArray. (or am I missing something?)

it makes it hard to detect this kind of error using mypy

Is there a way in mypy we could use something like overload to specify the above contract here, as an alternative to another method?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add public API for Dataset._copy_listed 588112617

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 240.916ms · About: xarray-datasette