home / github / issues

Menu
  • Search all tables
  • GraphQL API

issues: 1199122647

This data as json

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1199122647 I_kwDOAMm_X85HeSjX 6466 drop and other shouldn't be mutually exclusive in DataWithCoords.where 3698640 closed 0     0 2022-04-10T17:42:07Z 2022-04-12T15:33:05Z 2022-04-12T15:33:05Z CONTRIBUTOR      

Is your feature request related to a problem?

xr.Dataset.where and xr.DataArray.where currently do not allow providing both other and drop=True, stating that they are mutually exclusive. Conceptually, this doesn't strike me as true. Drop does not drop all points where cond is False, only those where an entire coordinate label evaluates to False. This is most important when trying to avoid type promotion, such as in the below example, where an integer FillValue might be preferred over dtypes.NA.

```python In [2]: da = xr.DataArray(np.arange(16).reshape(4, 4), dims=['x', 'y'])

In [3]: da Out[3]: <xarray.DataArray (x: 4, y: 4)> array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) Dimensions without coordinates: x, y If *`other`* is not provided, the array is promoted to `float64` to accommodate the default,`dtypes.NA`:python In [4]: da.where(da > 6, drop=True) Out[4]: <xarray.DataArray (x: 3, y: 4)> array([[nan, nan, nan, 7.], [ 8., 9., 10., 11.], [12., 13., 14., 15.]]) Dimensions without coordinates: x, y However, the combination of *`other`* and *`drop=True`* is not allowed:python In [5]: da.where(da > 6, -1, drop=True)


ValueError Traceback (most recent call last) Input In [5], in <module> ----> 1 da.where(da > 6, -1, drop=True)

File ~/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/xarray/core/common.py:1268, in DataWithCoords.where(self, cond, other, drop) 1266 if drop: 1267 if other is not dtypes.NA: -> 1268 raise ValueError("cannot set other if drop=True") 1270 if not isinstance(cond, (Dataset, DataArray)): 1271 raise TypeError( 1272 f"cond argument is {cond!r} but must be a {Dataset!r} or {DataArray!r}" 1273 )

ValueError: cannot set other if drop=True ```

Describe the solution you'd like

Current implementation

The current behavior is enforced within the block handling the drop argument (currently https://github.com/pydata/xarray/blob/main/xarray/core/common.py#L1266-L1268):

python if drop: if other is not dtypes.NA: raise ValueError("cannot set `other` if drop=True")

Proposed fix

I just removed the above if statement on a fork, and the example now works!

```python

import xarray as xr, numpy as np da = xr.DataArray(np.arange(16).reshape(4, 4), dims=['x', 'y']) da <xarray.DataArray (x: 4, y: 4)> array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) Dimensions without coordinates: x, y da.where(da > 6, drop=True) <xarray.DataArray (x: 3, y: 4)> array([[nan, nan, nan, 7.], [ 8., 9., 10., 11.], [12., 13., 14., 15.]]) Dimensions without coordinates: x, y da.where(da > 6, -1, drop=True) <xarray.DataArray (x: 3, y: 4)> array([[-1, -1, -1, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]]) Dimensions without coordinates: x, y ``` Making this change fails a single test in https://github.com/pydata/xarray/blob/main/xarray/tests/test_dataset.py#L4548-L4549, which explicitly checks for this behavior:

python with pytest.raises(ValueError, match=r"cannot set"): ds.where(ds > 1, other=0, drop=True) This could easily be reworked to check for valid handling of both arguments.

Describe alternatives you've considered

No response

Additional context

I haven't yet investigated what would happen with chunked, sparse, or other complex arrays, or if it's compatible with trees and other things on the roadmap. It's possible this breaks things I'm not imagining. Currently, where(cond, other) and where(cond, drop=True) are well-tested, flexible operations, and I don't see why allowing their union would break anything, but I'll wait to hear from the experts on that front!

I'm definitely open to creating a pull request (and have the simple implementation I've outlined here ready to go).

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6466/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue

Links from other tables

  • 1 row from issues_id in issues_labels
  • 0 rows from issue in issue_comments
Powered by Datasette · Queries took 0.666ms · About: xarray-datasette