id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 1173497454,I_kwDOAMm_X85F8iZu,6377,[FEATURE]: Add a replace method,13662783,open,0,,,8,2022-03-18T11:46:37Z,2023-06-25T07:52:46Z,,CONTRIBUTOR,,,,"### Is your feature request related to a problem? If I have a DataArray of values: ```python da = xr.DataArray([0, 1, 2, 3, 4, 5]) ``` And I'd like to replace `to_replace=[1, 3, 5]` by `value=[10, 30, 50]`, there's no method `da.replace(to_replace, value)` to do this. There's no easy way like pandas (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html) to do this. (Apologies if I've missed related issues, searching for ""replace"" gives many hits as the word is obviously used quite often.) ### Describe the solution you'd like ```python da = xr.DataArray([0, 1, 2, 3, 4, 5]) replaced = da.replace([1, 3, 5], [10, 30, 50]) print(replaced) ``` ``` array([ 0, 10, 2, 30, 4, 50]) Dimensions without coordinates: dim_0 ``` I've had a try at a relatively efficient implementation below. I'm wondering whether it's a worthwhile addition to xarray? ### Describe alternatives you've considered Ignoring issues such as dealing with NaNs, chunks, etc., a simple dict lookup: ```python def dict_replace(da, to_replace, value): d = {k: v for k, v in zip(to_replace, value)} out = np.vectorize(lambda x: d.get(x, x))(da.values) return da.copy(data=out) ``` Alternatively, leveraging pandas: ```python def pandas_replace(da, to_replace, value): df = pd.DataFrame() df[""values""] = da.values.ravel() df[""values""].replace(to_replace, value, inplace=True) return da.copy(data=df[""values""].values.reshape(da.shape)) ``` But I also tried my hand at a custom implementation, letting `np.unique` do the heavy lifting: ```python def custom_replace(da, to_replace, value): # Use np.unique to create an inverse index flat = da.values.ravel() uniques, index = np.unique(flat, return_inverse=True) replaceable = np.isin(flat, to_replace) # Create a replacement array in which there is a 1:1 relation between # uniques and the replacement values, so that we can use the inverse index # to select replacement values. valid = np.isin(to_replace, uniques, assume_unique=True) # Remove to_replace values that are not present in da. If no overlap # exists between to_replace and the values in da, just return a copy. if not valid.any(): return da.copy() to_replace = to_replace[valid] value = value[valid] replacement = np.zeros_like(uniques) replacement[np.searchsorted(uniques, to_replace)] = value out = flat.copy() out[replaceable] = replacement[index[replaceable]] return da.copy(data=out.reshape(da.shape)) ``` Such an approach seems like it's consistently the fastest: ```python da = xr.DataArray(np.random.randint(0, 100, 100_000)) to_replace = np.random.choice(np.arange(100), 10, replace=False) value = to_replace * 200 test1 = custom_replace(da, to_replace, value) test2 = pandas_replace(da, to_replace, value) test3 = dict_replace(da, to_replace, value) assert test1.equals(test2) assert test1.equals(test3) # 6.93 ms ± 295 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) %timeit custom_replace(da, to_replace, value) # 9.37 ms ± 212 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) %timeit pandas_replace(da, to_replace, value) # 26.8 ms ± 1.59 ms per loop (mean ± std. dev. 
of 7 runs, 10 loops each) %timeit dict_replace(da, to_replace, value) ``` With the advantage growing with the number of values involved: ```python da = xr.DataArray(np.random.randint(0, 10_000, 100_000)) to_replace = np.random.choice(np.arange(10_000), 10_000, replace=False) value = to_replace * 200 test1 = custom_replace(da, to_replace, value) test2 = pandas_replace(da, to_replace, value) test3 = dict_replace(da, to_replace, value) assert test1.equals(test2) assert test1.equals(test3) # 21.6 ms ± 990 µs per loop (mean ± std. dev. of 7 runs, 10 loops each) %timeit custom_replace(da, to_replace, value) # 3.12 s ± 574 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) %timeit pandas_replace(da, to_replace, value) # 42.7 ms ± 1.47 ms per loop (mean ± std. dev. of 7 runs, 10 loops each) %timeit dict_replace(da, to_replace, value) ``` In my real-life example, with a DataArray of approx 110 000 elements, with 60 000 values to replace, the custom one takes 33 ms, the dict one takes 135 ms, while pandas takes 26 s (!). ### Additional context In all cases, we need to deal with NaNs, check the input, etc.: ```python def replace(da: xr.DataArray, to_replace: Any, value: Any): from xarray.core.utils import is_scalar if is_scalar(to_replace): if not is_scalar(value): raise TypeError(""if to_replace is scalar, then value must be a scalar"") if np.isnan(to_replace): return da.fillna(value) else: return da.where(da != to_replace, other=value) else: to_replace = np.asarray(to_replace) if to_replace.ndim != 1: raise ValueError(""to_replace must be 1D or scalar"") if is_scalar(value): value = np.full_like(to_replace, value) else: value = np.asarray(value) if to_replace.shape != value.shape: raise ValueError( f""Replacement arrays must match in shape. "" f""Expecting {to_replace.shape} got {value.shape} "" ) _, counts = np.unique(to_replace, return_counts=True) if (counts > 1).any(): raise ValueError(""to_replace contains duplicates"") # Replace NaN values separately, as they will show up as separate values # from numpy.unique. isnan = np.isnan(to_replace) if isnan.any(): i = np.nonzero(isnan)[0] da = da.fillna(value[i]) # Use np.unique to create an inverse index flat = da.values.ravel() uniques, index = np.unique(flat, return_inverse=True) replaceable = np.isin(flat, to_replace) # Create a replacement array in which there is a 1:1 relation between # uniques and the replacement values, so that we can use the inverse index # to select replacement values. valid = np.isin(to_replace, uniques, assume_unique=True) # Remove to_replace values that are not present in da. If no overlap # exists between to_replace and the values in da, just return a copy. if not valid.any(): return da.copy() to_replace = to_replace[valid] value = value[valid] replacement = np.zeros_like(uniques) replacement[np.searchsorted(uniques, to_replace)] = value out = flat.copy() out[replaceable] = replacement[index[replaceable]] return da.copy(data=out.reshape(da.shape)) ``` I think it should be easy to reuse, e.g. by letting it operate on the numpy arrays so that apply_ufunc will work. The primary issue is whether the values can be sorted; if they cannot, the dict lookup might be an okay fallback? I've had a peek at the pandas implementation, but didn't become much wiser. Anyway, for your consideration!
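For the non-sortable case, a rough sketch of what a fallback could look like (not a proposed API, just dispatching between the `custom_replace` and `dict_replace` helpers above; `np.unique` raises a `TypeError` when the values cannot be ordered):

```python
import numpy as np

def replace_with_fallback(da, to_replace, value):
    # Sketch only: prefer the sort-based implementation, which relies on
    # np.unique and therefore on the values being orderable; fall back to
    # the slower dict lookup when sorting fails (e.g. mixed object dtypes).
    try:
        return custom_replace(da, np.asarray(to_replace), np.asarray(value))
    except TypeError:
        return dict_replace(da, to_replace, value)
```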
I'd be happy to submit a PR.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6377/reactions"", ""total_count"": 9, ""+1"": 9, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 445745470,MDExOlB1bGxSZXF1ZXN0MjgwMTIwNzIz,2972,ENH: Preserve monotonic descending index order when merging,13662783,open,0,,,4,2019-05-18T19:12:11Z,2022-06-09T14:50:17Z,,CONTRIBUTOR,,0,pydata/xarray/pulls/2972,"* Addresses GH2947 * When indexes were joined in a dataset merge, they would always get sorted in ascending order. This is awkward for geospatial grids, which are nearly always descending in the ""y"" coordinate. * This also caused an inconsistency: when a merge is called on datasets with identical descending indexes, the resulting index is descending. When a merge is called with non-identical descending indexes, the resulting index is ascending. * When indexes are mixed ascending and descending, or non-monotonic, the resulting index is still sorted in ascending order. - [x] Closes #2947 - [x] Tests added - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API ## Comments I was doing some work and I kept running into the issue described in #2947, so I had a try at a fix. It was somewhat of a hassle to understand the issue because I kept running into seeming inconsistencies. This is caused by the fact that the joiner doesn't sort with a single index: ```python def _get_joiner(join): if join == 'outer': return functools.partial(functools.reduce, operator.or_) ``` That makes sense, since I'm guessing `pandas.Index.union` isn't getting called at all. (I still find the workings of `functools` a little hard to infer.) I also noticed that an outer join gets called with e.g. an `.isel` operation, even though there's only one index (so there's not really anything to join). However, skipping the join completely in that case makes several tests fail since dimension labels end up missing (I guess the `joiner` call takes care of it). It's just checking for the specific case now, but it feels like a very specific issue anyway... The merge behavior is slightly different now, which is reflected in the updated test outcomes in `test_dataset.py`. These tests were turning monotonic decreasing indexes into an increasing index; now the decreasing order is maintained.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2972/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 620468256,MDU6SXNzdWU2MjA0NjgyNTY=,4076,Zarr ZipStore versus DirectoryStore: ZipStore requires .close(),13662783,open,0,,,4,2020-05-18T19:58:21Z,2022-04-28T22:37:48Z,,CONTRIBUTOR,,,," I was saving my dataset into a ZipStore -- apparently successfully -- but then I couldn't reopen it. The issue appears to be that a regular DirectoryStore behaves a little differently: it doesn't need to be closed, while a ZipStore does. (I'm not sure how this relates to #2586, the remarks there don't appear to be applicable anymore.)
#### MCVE Code Sample This errors: ```python import xarray as xr import zarr # works as expected ds = xr.Dataset({'foo': [2,3,4], 'bar': ('x', [1, 2]), 'baz': 3.14}) ds.to_zarr(zarr.DirectoryStore(""test.zarr"")) print(xr.open_zarr(zarr.DirectoryStore(""test.zarr""))) # error with ValueError ""group not found at path '' ds.to_zarr(zarr.ZipStore(""test.zip"")) print(xr.open_zarr(zarr.ZipStore(""test.zip""))) ``` Calling close, or using `with` does the trick: ```python store = zarr.ZipStore(""test2.zip"") ds.to_zarr(store) store.close() print(xr.open_zarr(zarr.ZipStore(""test2.zip""))) with zarr.ZipStore(""test3.zip"") as store: ds.to_zarr(store) print(xr.open_zarr(zarr.ZipStore(""test3.zip""))) ``` #### Expected Output I think it would be preferable to close the ZipStore in this case. But I might be missing something? #### Problem Description Because `to_zarr` works in this situation with a DirectoryStore, it's easy to assume a ZipStore will work similarly. However, I couldn't get it to read my data back in this case. #### Versions
Output of xr.show_versions() INSTALLED VERSIONS ------------------ commit: None python: 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 21:48:41) [MSC v.1916 64 bit (AMD64)] python-bits: 64 OS: Windows OS-release: 10 machine: AMD64 processor: Intel64 Family 6 Model 158 Stepping 9, GenuineIntel byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: None.None libhdf5: 1.10.5 libnetcdf: 4.7.3 xarray: 0.15.2.dev41+g8415eefa.d20200419 pandas: 0.25.3 numpy: 1.17.5 scipy: 1.3.1 netCDF4: 1.5.3 pydap: None h5netcdf: None h5py: 2.10.0 Nio: None zarr: 2.4.0 cftime: 1.0.4.2 nc_time_axis: None PseudoNetCDF: None rasterio: 1.1.2 cfgrib: None iris: None bottleneck: 1.3.2 dask: 2.14.0+23.gbea4c9a2 distributed: 2.14.0 matplotlib: 3.1.2 cartopy: None seaborn: 0.10.0 numbagg: None pint: None setuptools: 46.1.3.post20200325 pip: 20.0.2 conda: None pytest: 5.3.4 IPython: 7.13.0
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4076/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 386596872,MDU6SXNzdWUzODY1OTY4NzI=,2587,"DataArray constructor still coerces to np.datetime64[ns], not cftime in 0.11.0",13662783,open,0,,,3,2018-12-02T20:34:36Z,2022-04-18T16:06:12Z,,CONTRIBUTOR,,,,"#### Code Sample ```python import xarray as xr import numpy as np from datetime import datetime time = [np.datetime64(datetime.strptime(""10000101"", ""%Y%m%d""))] print(time[0]) print(np.dtype(time[0])) da = xr.DataArray(time, (""time"",), {""time"":time}) print(da) ``` Results in: ``` 1000-01-01T00:00:00.000000 datetime64[us] array(['2169-02-08T23:09:07.419103232'], dtype='datetime64[ns]') Coordinates: * time (time) datetime64[ns] 2169-02-08T23:09:07.419103232 ``` #### Problem description I was happy to see `cftime` as default in the release notes for 0.11.0: > Xarray will now always use `cftime.datetime` objects, rather than by default trying to coerce them into `np.datetime64[ns]` objects. A `CFTimeIndex` will be used for indexing along time coordinates in these cases. However, it seems that the DataArray constructor does not use `cftime` (yet?), and coerces to `np.datetime64[ns]` here: https://github.com/pydata/xarray/blob/0d6056e8816e3d367a64f36c7f1a5c4e1ce4ed4e/xarray/core/variable.py#L183-L189 #### Expected Output I think I'd expect `cftime.datetime` in this case as well. Some coercion happens anyway as pandas timestamps are turned into `np.datetime64[ns]`. (But perhaps this was already on your radar, and am I just a little too eager!) #### Output of ``xr.show_versions()``
``` INSTALLED VERSIONS ------------------ commit: None python: 3.6.5.final.0 python-bits: 64 OS: Windows OS-release: 7 machine: AMD64 processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel byteorder: little LC_ALL: None LANG: en LOCALE: None.None xarray: 0.11.0 pandas: 0.23.3 numpy: 1.15.3 scipy: 1.1.0 netCDF4: 1.3.1 h5netcdf: 0.6.1 h5py: 2.8.0 Nio: None zarr: None cftime: 1.0.0 PseudonetCDF: None rasterio: 1.0.0 iris: None bottleneck: 1.2.1 cyordereddict: None dask: 0.19.2 distributed: 1.23.2 matplotlib: 2.2.2 cartopy: 0.16.0 seaborn: 0.9.0 setuptools: 40.5.0 pip: 18.1 conda: None pytest: 3.6.3 IPython: 6.4.0 sphinx: 1.7.5 ```
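As an aside, a possible workaround sketch (not the constructor fix asked for above, and untested against 0.11 specifically): building the time coordinate with `xr.cftime_range` gives a `CFTimeIndex` that is not coerced to `np.datetime64[ns]`:

```python
import xarray as xr

# Workaround sketch, not the requested constructor behaviour: create the
# times as cftime objects up front, so there is no datetime64[ns] coercion
# (and thus no out-of-range overflow for year 1000).
time = xr.cftime_range('1000-01-01', periods=1, calendar='proleptic_gregorian')
da = xr.DataArray(range(len(time)), coords={'time': time}, dims=('time',))
print(da.indexes['time'])  # a CFTimeIndex holding cftime datetime objects
```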
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2587/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 441341340,MDU6SXNzdWU0NDEzNDEzNDA=,2947,xr.merge always sorts indexes ascending,13662783,open,0,,,2,2019-05-07T17:06:06Z,2019-05-07T21:07:26Z,,CONTRIBUTOR,,,,"#### Code Sample, a copy-pastable example if possible ```python import xarray as xr import numpy as np nrow, ncol = (4, 5) dx, dy = (1.0, -1.0) xmins = (0.0, 3.0, 3.0, 0.0) xmaxs = (5.0, 8.0, 8.0, 5.0) ymins = (0.0, 2.0, 0.0, 2.0) ymaxs = (4.0, 6.0, 4.0, 6.0) data = np.ones((nrow, ncol), dtype=np.float64) das = [] for xmin, xmax, ymin, ymax in zip(xmins, xmaxs, ymins, ymaxs): kwargs = dict( name=""example"", dims=(""y"", ""x""), coords={""y"": np.arange(ymax, ymin, dy), ""x"": np.arange(xmin, xmax, dx)}, ) das.append(xr.DataArray(data, **kwargs)) xr.merge(das) # This won't flip the coordinate: xr.merge([das[0])) ``` #### Problem description Let's say I have a number of geospatial grids that I'd like to merge (for example, loaded with `xr.open_rasterio`). To quote [https://www.perrygeo.com/python-affine-transforms.html](url) > The typical geospatial coordinate reference system is defined on a cartesian plane with the 0,0 origin in the bottom left and X and Y increasing as you go up and to the right. But raster data, coming from its image processing origins, uses a different referencing system to access pixels. We refer to rows and columns with the 0,0 origin in the upper left and rows increase and you move down while the columns increase as you go right. Still a cartesian plane but not the same one. `xr.merge` will alway return the result with ascending coordinates, which creates some issues / confusion later on if you try to write it back to a GDAL format, for example (I've been scratching my head for some time looking at upside-down .tifs). #### Expected Output I think the expected output for these geospatial grids is that; if you provide only DataArrays with positive dx, negative dy; that the merged result comes out with a positive dx and a negative dy as well. When the DataArrays to merge are mixed in coordinate direction (some with ascending, some with descending coordinate values), defaulting to an ascending sort seems sensible. #### A suggestion I saw that the sort is occurring [here, in pandas](https://github.com/pandas-dev/pandas/blob/2bbc0c2c198374546408cb15fff447c1e306f99f/pandas/core/indexes/base.py#L2260-L2265); and that there's a `is_monotonic_decreasing` property in [pandas.core.indexes.base.Index](https://github.com/pandas-dev/pandas/blob/2bbc0c2c198374546408cb15fff447c1e306f99f/pandas/core/indexes/base.py#L1601) I think this could work (it solves my issue at least), in [xarray.core.alignment](https://github.com/pydata/xarray/blob/5aaa6547cd14a713f89dfc7c22643d86fce87916/xarray/core/alignment.py#L125) ```python index = joiner(matching_indexes) if all( (matching_index.is_monotonic_decreasing for matching_index in matching_indexes) ): index = index[::-1] joined_indexes[dim] = index ``` But I lack the knowledge to say whether this plays nice in all cases. And does `index[::-1]` return a view or a copy? (And does it matter?) 
For reference, this is what it currently looks like: ```python if (any(not matching_indexes[0].equals(other) for other in matching_indexes[1:]) or dim in unlabeled_dim_sizes): if join == 'exact': raise ValueError( 'indexes along dimension {!r} are not equal' .format(dim)) index = joiner(matching_indexes) joined_indexes[dim] = index else: index = matching_indexes[0] ``` It's also worth highlighting that the `else` branch causes, arguably, some inconsistency: if the indexes are equal, the joiner is never called, so a descending index passes through unsorted. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2947/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue