
issue_comments: 824329772


html_url: https://github.com/pydata/xarray/issues/1887#issuecomment-824329772
issue_url: https://api.github.com/repos/pydata/xarray/issues/1887
id: 824329772
node_id: MDEyOklzc3VlQ29tbWVudDgyNDMyOTc3Mg==
user: 1217238
created_at: 2021-04-21T20:16:10Z
updated_at: 2021-04-21T20:16:10Z
author_association: MEMBER

> I've been trying to conceptualize why I think the where equivalence (the original proposal) is better than the stack proposal (the latter).

Here are two reasons why I like the stack version:

  1. It's more NumPy-like: boolean indexing in NumPy returns a flat array in the same way.
  2. It doesn't need dtype promotion to handle possibly missing values, so it will have more predictable semantics.
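Both points can be seen in plain NumPy. This is only a rough sketch: `np.where` with a NaN fill stands in for the where-style result here, and the array and mask are made up for illustration.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # integer array, illustrative data only
mask = a > 2

# Boolean indexing: flat 1-D result, original integer dtype preserved.
flat = a[mask]

# where-style masking: the 2-D shape is kept, but the integers are
# promoted to float so that NaN can mark the dropped positions.
padded = np.where(mask, a, np.nan)

print(flat, flat.dtype)
print(padded, padded.dtype)
```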

As a side note: one nice feature of using isel() for stacking is that it does not create a MultiIndex, which can be expensive. But there's no reason why we necessarily need to do that for stack(). I'll open a new issue to discuss adding an optional parameter.
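A minimal sketch of the isel() route, assuming the mask is converted to integer positions with `np.nonzero` first. The dimension name `z` and the sample data are hypothetical, and this is not necessarily how the feature would be implemented in xarray.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(6).reshape(2, 3), dims=("x", "y"))
mask = da > 2

# Integer positions of the True entries along each dimension.
ix, iy = np.nonzero(mask.values)

# Pointwise (vectorized) indexing: both indexers share the new dim "z",
# so the result is 1-D along "z" and no MultiIndex is built.
picked = da.isel(x=xr.DataArray(ix, dims="z"), y=xr.DataArray(iy, dims="z"))

# For comparison, stack() builds an index for "z" by default.
stacked = da.stack(z=("x", "y"))

print(picked)
print(stacked.indexes["z"])
```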

> • I'm not sure how the setitem would work; da[key] = value?

To match the semantics of NumPy, value would need to have matching dims/coords to those of da[key]. In other words, it would also need to be stacked.
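For reference, this is how the NumPy behaviour being matched looks. It is a sketch with made-up values, showing that the assigned value must line up with the flat selection `a[mask]`.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
mask = a > 2                      # three True entries

# The value has the same flat shape as a[mask], so assignment works.
value = np.array([30, 40, 50])
a[mask] = value

# A value whose length does not match the selection is rejected.
error_raised = False
try:
    a[mask] = np.array([1, 2])
except ValueError:
    error_raised = True

print(a)
print(error_raised)
```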

> • If someone wants the stack result, it's less work to do original -> where result -> stack result relative to original -> stack result -> where result; which suggests they're more composable?

I'm not quite sure this is true -- it's the difference between needing to call stack() vs unstack().

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 294241734