issue_comments

5 rows where author_association = "MEMBER", issue = 98274024 and user = 1217238 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
126851008 https://github.com/pydata/xarray/pull/504#issuecomment-126851008 https://api.github.com/repos/pydata/xarray/issues/504 MDEyOklzc3VlQ29tbWVudDEyNjg1MTAwOA== shoyer 1217238 2015-08-01T02:22:59Z 2015-08-01T02:22:59Z MEMBER

I moved the docs around and added a note on multi-dimensional indexing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: where method for masking xray objects according to some criteria 98274024
126815158 https://github.com/pydata/xarray/pull/504#issuecomment-126815158 https://api.github.com/repos/pydata/xarray/issues/504 MDEyOklzc3VlQ29tbWVudDEyNjgxNTE1OA== shoyer 1217238 2015-07-31T21:22:36Z 2015-07-31T21:22:36Z MEMBER

Oh, wow -- I didn't even realize that worked in pandas! Combined with NA-skipping aggregation functions in pandas that makes expressions like a[a < 0].mean() work just like the same expression in NumPy.
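As a rough illustration of the pandas behavior described above, the sketch below uses a made-up Series and DataFrame (the names `s` and `df` and their values are assumptions, not taken from the thread). It shows that boolean indexing on a Series selects the matching labels, boolean indexing on a DataFrame masks with NaN, and `mean()` skips the NaN values either way.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
s = pd.Series([-2.0, 3.0, -4.0, 5.0], index=list("abcd"))

# Boolean indexing on a Series selects the matching labels.
negatives = s[s < 0]

# Series.where keeps the original length and masks the rest with NaN.
masked = s.where(s < 0)

# mean() skips NaN by default, so both give the same answer.
print(negatives.mean(), masked.mean())

# On a DataFrame, boolean indexing masks instead of selecting:
df = pd.DataFrame({"p": [-1.0, 2.0], "q": [3.0, -5.0]})
df_masked = df[df < 0]          # same shape, NaN where the condition is False
print(df_masked.mean())         # per-column means, NaN skipped
```

Note that the DataFrame case returns per-column means rather than a single scalar, so it is only loosely analogous to the NumPy expression.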

So instead of adding where, perhaps we should just support boolean indexing like pandas.

The main difference is that where can cleanly support broadcasting, whereas we currently don't do broadcasting in indexing. For example, suppose a is a 2-dimensional DataArray with dimensions (x, y). Now consider the following cases:

1. a[x > 0]
2. a[y > 0]
3. a[x > 0, y > 0]
4. a[(x > 0) & (y > 0)]

Currently, (1) and (3) work by selection. If we adopt the pandas behavior, (4) would also work, but by broadcasting and masking. This seems like a potential recipe for confusion, because once you have (4), case (2) seems like a natural variation. We could implement (2), but should it mask or select?

My sense is that we'll probably be happier if we have entirely distinct APIs for masking (.where) and selection ([] and .loc[]).
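The distinction between masking and selection can be sketched with a small example. This assumes a current xarray release (the successor to xray) and a made-up 2 x 3 array; the coordinate values are hypothetical, and `isel` with a boolean NumPy array stands in for positional selection here.

```python
import numpy as np
import xarray as xr

# Hypothetical 2-D array with labeled dimensions (x, y).
a = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=("x", "y"),
    coords={"x": [0, 1], "y": [-1, 0, 1]},
)

# Masking: the shape is preserved and excluded entries become NaN.
masked = a.where(a.x > 0)

# Selection: entries along x that fail the condition are dropped.
selected = a.isel(x=(a.x > 0).values)

# Masking with a broadcast condition over both dimensions.
both = a.where((a.x > 0) & (a.y > 0))

print(masked.shape, selected.shape, both.shape)
```

Masking always returns an object with the original shape, which is why a condition that broadcasts across dimensions poses no difficulty for it, while selection has to decide which labels survive along each dimension.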

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: where method for masking xray objects according to some criteria 98274024
126784623 https://github.com/pydata/xarray/pull/504#issuecomment-126784623 https://api.github.com/repos/pydata/xarray/issues/504 MDEyOklzc3VlQ29tbWVudDEyNjc4NDYyMw== shoyer 1217238 2015-07-31T18:58:16Z 2015-07-31T18:58:16Z MEMBER

Both R and pandas allow the user to do a[a < 0] and a[a < 0] = 0. So what I'm wondering is why not extend xray's indexing to also work on arrays that are the same shape and have the same labels as the original array?

The problem is that a[a < 0] in NumPy flattens arrays with more than one dimension. We can't do this in xray without doing something like pointwise indexing to flatten out the labels, too.
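The flattening mentioned above is easy to see with plain NumPy. The array below is invented for illustration; `np.nonzero` is shown only to make explicit which (row, column) positions the flattened values came from, which is the information a labeled array would have to carry along.

```python
import numpy as np

# Hypothetical 2-D array.
a = np.array([[1, -2, 3],
              [-4, 5, -6]])

# Boolean indexing with a same-shaped mask returns a 1-D array.
flat = a[a < 0]
print(flat.shape)

# The original positions are lost in `flat`; they can be recovered separately.
rows, cols = np.nonzero(a < 0)
print(list(zip(rows.tolist(), cols.tolist())))
```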

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: where method for masking xray objects according to some criteria 98274024
126772960 https://github.com/pydata/xarray/pull/504#issuecomment-126772960 https://api.github.com/repos/pydata/xarray/issues/504 MDEyOklzc3VlQ29tbWVudDEyNjc3Mjk2MA== shoyer 1217238 2015-07-31T18:05:22Z 2015-07-31T18:10:59Z MEMBER

Right now, you can do that by chaining two operations: a.where(a > 0).fillna(0). It would definitely be better to support it in one (a.where(a > 0, 0)).

I suppose we could also support a[a < 0] = 0, but it seems a little strange given that we don't support a[a < 0].
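For comparison, here is how the chained and one-step forms look in pandas, whose where method this API is said to follow. This is a sketch on a made-up Series, not the xray implementation, and the variable names are assumptions.

```python
import pandas as pd

# Hypothetical data with no pre-existing missing values.
a = pd.Series([-1.0, 2.0, -3.0, 4.0])

# Two chained operations: mask, then fill the masked entries.
two_step = a.where(a > 0).fillna(0)

# One operation: supply the replacement value directly.
one_step = a.where(a > 0, 0)

# Assignment through a boolean mask gives the same values in pandas.
assigned = a.copy()
assigned[assigned < 0] = 0

print(two_step.equals(one_step), two_step.equals(assigned))
```

The two forms differ when the data already contains NaN: the chained version also fills those entries, whereas the one-step version leaves them alone wherever the condition is False only because NaN > 0 is False and they get replaced too, so it is worth checking the behavior on real data before relying on either.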

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: where method for masking xray objects according to some criteria 98274024
126770381 https://github.com/pydata/xarray/pull/504#issuecomment-126770381 https://api.github.com/repos/pydata/xarray/issues/504 MDEyOklzc3VlQ29tbWVudDEyNjc3MDM4MQ== shoyer 1217238 2015-07-31T17:51:49Z 2015-07-31T17:51:49Z MEMBER

@clarkfitzg Yes, that's mostly right. The main differences:

1. The order of the arguments here is different, to match the pandas method (which has more of a SQL flavor to it).
2. I'm not exposing the third argument, because xray objects don't yet implement broadcasting operations with more than 2 arguments at once. This is something that needs refactoring -- the logic in _binary_op should be generalized to any number of arguments.
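The question being answered is not shown here; assuming it compared the new method with NumPy's three-argument `np.where`, the difference in argument order might look like the sketch below. It uses pandas for the method form and invented values.

```python
import numpy as np
import pandas as pd

# Hypothetical values.
values = np.array([-1.0, 2.0, -3.0])

# NumPy: the condition comes first, then the value if True, then the value if False.
np_result = np.where(values > 0, values, np.nan)

# pandas: a method on the object; the condition is the first argument and
# entries where it is False are replaced (with NaN when no replacement is given).
s = pd.Series(values)
pd_result = s.where(s > 0)

print(np.allclose(np_result, pd_result.to_numpy(), equal_nan=True))
```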

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: where method for masking xray objects according to some criteria 98274024

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);