
issue_comments: 126815158


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/504#issuecomment-126815158 https://api.github.com/repos/pydata/xarray/issues/504 126815158 MDEyOklzc3VlQ29tbWVudDEyNjgxNTE1OA== 1217238 2015-07-31T21:22:36Z 2015-07-31T21:22:36Z MEMBER

Oh, wow -- I didn't even realize that worked in pandas! Combined with the NA-skipping aggregation functions in pandas, that makes expressions like a[a < 0].mean() work just like the same expression in NumPy.
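For illustration, here is a minimal sketch of that pandas behavior. It assumes the case in question is a DataFrame indexed by a boolean DataFrame, which masks with NaN rather than dropping elements; the array values and the column names p and q are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column names "p" and "q" are placeholders.
values = np.array([[-1.0, 2.0], [3.0, -4.0]])
df = pd.DataFrame(values, columns=["p", "q"])

# Indexing a DataFrame with a boolean DataFrame masks rather than selects:
# the shape is preserved and non-matching entries become NaN.
masked = df[df < 0]

# pandas aggregations skip NaN by default, so each column mean only
# sees the entries that satisfied the condition.
col_means = masked.mean()

# NumPy boolean indexing drops the non-matching entries and flattens.
numpy_mean = values[values < 0].mean()

# Averaging every non-NaN entry of the masked frame gives the same number.
overall_mean = np.nanmean(masked.to_numpy())
```

Note that masked.mean() reduces per column, so matching the single NumPy value takes a reduction over all entries, as in the last line.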

So instead of adding where, perhaps we should just support boolean indexing like pandas.

The main difference is that where can cleanly support broadcasting, whereas we currently don't do broadcasting in indexing. For example, suppose a is a 2-dimensional DataArray with dimensions (x, y). Now consider the following cases:

1. a[x > 0]
2. a[y > 0]
3. a[x > 0, y > 0]
4. a[(x > 0) & (y > 0)]

Currently, (1) and (3) work by selection. If we adopt the pandas behavior, (4) would also work, but by broadcasting and masking. This seems like a potential recipe for confusion, because once you have (4), case (2) seems like a natural variation. We could implement (2), but should it mask or select?
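To make the select-versus-mask distinction concrete, here is a rough sketch in plain NumPy standing in for the DataArray. The coordinate arrays x and y and their values are assumptions for the example, and np.ix_ is used only to imitate per-dimension (orthogonal) selection; none of this is the xarray implementation.

```python
import numpy as np

# Stand-in for a 2-dimensional array with dimensions (x, y).
a = np.arange(6, dtype=float).reshape(2, 3)
x = np.array([-1, 1])      # hypothetical coordinate along the first dimension
y = np.array([-1, 0, 1])   # hypothetical coordinate along the second dimension

# Selection: each 1-D condition picks positions along its own dimension,
# so the result shrinks.
selected_x = a[x > 0]                    # like case (1)
selected_xy = a[np.ix_(x > 0, y > 0)]    # like case (3)

# Masking: the two 1-D conditions are broadcast against each other into a
# 2-D condition, and the result keeps the original shape with NaN elsewhere.
cond = (x > 0)[:, np.newaxis] & (y > 0)[np.newaxis, :]
masked = np.where(cond, a, np.nan)       # like case (4)
```

Both routes keep the same surviving value here, but selected_xy has shape (1, 1) while masked keeps shape (2, 3). That difference in shape is what makes a lone a[y > 0] ambiguous once both behaviors share the same syntax.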

My sense is that we'll probably be happier if we have entirely distinct APIs for masking (.where) and selection ([] and .loc[]).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  98274024