html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1887#issuecomment-825176507,https://api.github.com/repos/pydata/xarray/issues/1887,825176507,MDEyOklzc3VlQ29tbWVudDgyNTE3NjUwNw==,5635139,2021-04-22T20:50:29Z,2021-04-22T21:06:47Z,MEMBER,"> `stack(new_dim=[""a"", ""b""], dropna=True)`

This could be useful (potentially we can open a different issue). While someone can call `.dropna`, that coerces to floats (or some type that supports missing) and can allocate more than is needed. Potentially this can be considered along with issues around sparse, e.g. https://github.com/pydata/xarray/issues/3245, https://github.com/pydata/xarray/issues/4143","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,294241734
https://github.com/pydata/xarray/issues/1887#issuecomment-824503658,https://api.github.com/repos/pydata/xarray/issues/1887,824503658,MDEyOklzc3VlQ29tbWVudDgyNDUwMzY1OA==,5635139,2021-04-22T03:04:41Z,2021-04-22T03:04:51Z,MEMBER,"I'm still working through this. Using this to jot down my notes, no need to respond.

One property that seems to be lacking is that if `key` changes from `n-1` to `n` dimensions, the behavior changes (also outlined [here](url)):

```python
In [171]: a
Out[171]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [172]: mask
Out[172]: array([ True, False,  True])

In [173]: a[mask]
Out[173]:
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11]])
```

...as expected, but now let's make a 2D mask...

```python
In [174]: full_mask = np.broadcast_to(mask[:, np.newaxis], (3,4))

In [175]: full_mask
Out[175]:
array([[ True,  True,  True,  True],
       [False, False, False, False],
       [ True,  True,  True,  True]])

In [176]: a[full_mask]
Out[176]: array([ 0,  1,  2,  3,  8,  9, 10, 11])    # flattened!
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,294241734
https://github.com/pydata/xarray/issues/1887#issuecomment-824461333,https://api.github.com/repos/pydata/xarray/issues/1887,824461333,MDEyOklzc3VlQ29tbWVudDgyNDQ2MTMzMw==,1217238,2021-04-22T01:02:32Z,2021-04-22T01:02:32Z,MEMBER,"> Current proposal (""`stack`""), of `da[key]` and with a dimension of `key`'s name (and probably no multiindex):
> 
> ```python
> In [86]: da.values[key.values]
> Out[86]: array([0, 3, 6, 9])   # but returned as the xarray equivalent
> ```

The part about this new proposal that is most annoying is that the `key` needs a `name`, which we can use to name the new dimension. That's not too hard to do, but it is a little annoying -- in practice you would have to write something like `da[key.rename('key_name')]` much of the time to make this work.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,294241734
https://github.com/pydata/xarray/issues/1887#issuecomment-824460304,https://api.github.com/repos/pydata/xarray/issues/1887,824460304,MDEyOklzc3VlQ29tbWVudDgyNDQ2MDMwNA==,1217238,2021-04-22T00:59:25Z,2021-04-22T00:59:25Z,MEMBER,"> OK great. To confirm, this is what it would look like:

Yes, this looks right to me.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,294241734
https://github.com/pydata/xarray/issues/1887#issuecomment-824454992,https://api.github.com/repos/pydata/xarray/issues/1887,824454992,MDEyOklzc3VlQ29tbWVudDgyNDQ1NDk5Mg==,5635139,2021-04-22T00:40:49Z,2021-04-22T00:40:49Z,MEMBER,"> I'm not quite sure this is true -- it's the difference between needing to call `stack()` vs `unstack()`.

This was a tiny point so it's fine to discard. I had meant that producing the `where` result via the `stack` result requires a `stack` and `unstack`. But producing the `stack` result via a `where` result requires only one `stack` — the `where` result is very cheap. 
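
For reference, a rough sketch of the `where` -> `stack` direction using the example array from this thread. The dimension name `points` is made up for illustration. Note that, as written here, a `dropna` is also needed to discard the masked-out cells, and the values come out as floats because `where` has already promoted the dtype:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(12).reshape(3, 4), dims=['a', 'b'])
key = da % 3 == 0

# where-style result: same shape as da, NaN wherever key is False
where_result = da.where(key)

# stack-style result derived from it: flatten, then drop the NaN cells
# ('points' is a hypothetical name for the new dimension)
stack_result = where_result.stack(points=('a', 'b')).dropna('points')

print(stack_result.values)
```

This goes through a MultiIndex on `points`, which the actual proposal would probably avoid.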
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,294241734
https://github.com/pydata/xarray/issues/1887#issuecomment-824452843,https://api.github.com/repos/pydata/xarray/issues/1887,824452843,MDEyOklzc3VlQ29tbWVudDgyNDQ1Mjg0Mw==,5635139,2021-04-22T00:33:29Z,2021-04-22T00:35:28Z,MEMBER,"OK great. To confirm, this is what it would look like:


Context:

```python
In [81]: da = xr.DataArray(np.arange(12).reshape(3,4), dims=list('ab'))

In [82]: da
Out[82]:
<xarray.DataArray (a: 3, b: 4)>
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Dimensions without coordinates: a, b

In [84]: key = da % 3 == 0

In [83]: key
Out[83]:
<xarray.DataArray (a: 3, b: 4)>
array([[ True, False, False,  True],
       [False, False,  True, False],
       [False,  True, False, False]])
Dimensions without coordinates: a, b
```

Currently:
```python
In [85]: da[key]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-85-7fd83c907cb6> in <module>
----> 1 da[key]
...
~/.asdf/installs/python/3.8.8/lib/python3.8/site-packages/xarray/core/variable.py in _validate_indexers(self, key)
    697                         )
    698                     if k.ndim > 1:
--> 699                         raise IndexError(
    700                             ""{}-dimensional boolean indexing is ""
    701                             ""not supported. "".format(k.ndim)

IndexError: 2-dimensional boolean indexing is not supported.
```

Current proposal (""`stack`""), of `da[key]` and with a dimension of `key`'s name (and probably no multiindex):
```python
In [86]: da.values[key.values]
Out[86]: array([0, 3, 6, 9])   # but returned as the xarray equivalent
```

Previous suggestion (""`where`""), for the result of `da[key]`:
```python
In [87]: da.where(key)
Out[87]:
<xarray.DataArray (a: 3, b: 4)>
array([[ 0., nan, nan,  3.],
       [nan, nan,  6., nan],
       [nan,  9., nan, nan]])
Dimensions without coordinates: a, b
```

(small follow up I'll put in another message, for clarity)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,294241734
https://github.com/pydata/xarray/issues/1887#issuecomment-824329772,https://api.github.com/repos/pydata/xarray/issues/1887,824329772,MDEyOklzc3VlQ29tbWVudDgyNDMyOTc3Mg==,1217238,2021-04-21T20:16:10Z,2021-04-21T20:16:10Z,MEMBER,"> I've been trying to conceptualize why I think the `where` equivalence (the original proposal) is better than the `stack` proposal (the latter).

Here are two reasons why I like the `stack` version:

1. It's more NumPy-like -- boolean indexing in NumPy returns a flat array in the same way
2. It doesn't need dtype promotion to handle possibly missing values, so it will have more predictable semantics.
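
Both points can be seen with plain NumPy. This is only an illustrative sketch on the example array from this thread, not how xarray would implement either option; `flat` and `filled` are names chosen here:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
mask = a % 3 == 0

# stack-like: boolean indexing returns a flat array and keeps the integer dtype
flat = a[mask]

# where-like: keeping the original shape needs a fill value for the
# masked-out cells, and NaN forces promotion from integer to float
filled = np.where(mask, a, np.nan)

print(flat, flat.dtype)
print(filled.dtype)
```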

As a side note: one nice feature of using `isel()` for stacking is that it _does not_ create a MultiIndex, which can be expensive. But there's no reason why we necessarily need to do that for `stack()`. I'll open a new issue to discuss adding an optional parameter.

> * I'm not sure how the setitem would work; `da[key] = value`?

To match the semantics of NumPy, `value` would need to have matching dims/coords to those of `da[key]`. In other words, it would also need to be stacked.
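
For comparison, this is how the NumPy assignment behaves (a small sketch on the example array; the replacement values are arbitrary). The right-hand side has to line up with the flattened selection `a[mask]`:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
mask = a % 3 == 0

# One value per True entry of the mask (a scalar would also broadcast)
a[mask] = np.array([-1, -2, -3, -4])

# A value whose length does not match the number of True entries is rejected
try:
    a[mask] = np.array([10, 20])
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(a)
print(mismatch_rejected)
```

An xarray version would presumably add a check on dims/coords on top of this shape requirement.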

> * If someone wants the `stack` result, it's less work to do original -> `where` result -> `stack` result relative to original -> `stack` result -> `where` result; which suggests they're more composable?

I'm not quite sure this is true -- it's the difference between needing to call `stack()` vs `unstack()`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,294241734
https://github.com/pydata/xarray/issues/1887#issuecomment-824299104,https://api.github.com/repos/pydata/xarray/issues/1887,824299104,MDEyOklzc3VlQ29tbWVudDgyNDI5OTEwNA==,5635139,2021-04-21T19:21:46Z,2021-04-21T19:21:46Z,MEMBER,"I've been trying to conceptualize why I think the `where` equivalence (the original proposal) is better than the `stack` proposal (the latter). I think it's mostly:
- It's simpler
- I'm not sure how the setitem would work; `da[key] = value`?
- If someone wants the `stack` result, it's less work to do original -> `where` result -> `stack` result relative to original -> `stack` result -> `where` result; which suggests they're more composable?

But I don't do much pointwise indexing, so maybe we do want to prioritize that.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,294241734
https://github.com/pydata/xarray/issues/1887#issuecomment-823673654,https://api.github.com/repos/pydata/xarray/issues/1887,823673654,MDEyOklzc3VlQ29tbWVudDgyMzY3MzY1NA==,1217238,2021-04-20T23:50:34Z,2021-04-20T23:50:34Z,MEMBER,"It's worth noting that there is at least one other way boolean indexing could work:

- `ds[key]` could work like `ds.stack({key.name: key.dims}).isel({key.name: np.flatnonzero(key.data)})`, except without creating a MultiIndex. Arguably this might be more useful and also more consistent with NumPy itself. It's also more similar to the operation @Hoeze wants in https://github.com/pydata/xarray/issues/5179.
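
A minimal sketch of that expression on a small `DataArray`, using only existing API. The name `points` is made up for illustration, since the key needs some name to label the new dimension. Unlike the behaviour suggested above, this version does create a MultiIndex:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(12).reshape(3, 4), dims=['a', 'b'])

# The boolean key must be named; 'points' is a hypothetical choice
key = (da % 3 == 0).rename('points')

# Flatten the key's dims into one dimension, then keep the True positions
stacked = da.stack({key.name: key.dims})
result = stacked.isel({key.name: np.flatnonzero(key.data)})

print(result.values)
```

The values match `da.values[key.values]`, and the dtype stays integer because no fill value is ever needed.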

We can't support both with the same syntax, so we have to make a choice here :).

See also the discussion about what `drop_duplicates`/`unique` should do over in https://github.com/pydata/xarray/pull/5089.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,294241734
https://github.com/pydata/xarray/issues/1887#issuecomment-803491524,https://api.github.com/repos/pydata/xarray/issues/1887,803491524,MDEyOklzc3VlQ29tbWVudDgwMzQ5MTUyNA==,5635139,2021-03-21T00:38:23Z,2021-03-21T00:38:23Z,MEMBER,"I've added the ""good first issue"" label — at least the first two bullets of the proposal would be relatively simple to implement, given they're mostly syntactic sugar.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,294241734