id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
2276408691,I_kwDOAMm_X86Hrz1z,8995,Why does xr.apply_ufunc support numpy/dask.arrays?,35968931,open,0,,,0,2024-05-02T20:18:41Z,2024-05-03T22:03:43Z,,MEMBER,,,,"### What is your issue?

@keewis pointed out that it's weird that [`xarray.apply_ufunc`](https://docs.xarray.dev/en/stable/generated/xarray.apply_ufunc.html) supports passing numpy/dask arrays directly, and I'm inclined to agree. I don't understand why we do, and think we should consider removing that feature.

Two arguments in favour of removing it:

1) **It exposes users to transposition errors**

Consider this example:

```python
In [1]: import xarray as xr

In [2]: import numpy as np

In [3]: arr = np.arange(12).reshape(3, 4)

In [4]: def mean(obj, dim):
   ...:     # note: apply always moves core dimensions to the end
   ...:     return xr.apply_ufunc(
   ...:         np.mean, obj, input_core_dims=[[dim]], kwargs={""axis"": -1}
   ...:     )
   ...: 

In [5]: mean(arr, dim='time')
Out[5]: array([1.5, 5.5, 9.5])

In [6]: mean(arr.T, dim='time')
Out[6]: array([4., 5., 6., 7.])
```

Transposing the input leads to a different result, with the value of the `dim` kwarg effectively ignored. This kind of error is what xarray code is supposed to prevent by design.

2) **There is an alternative input pattern that doesn't require accepting bare arrays**

Instead, any numpy/dask array can just be wrapped up into an xarray `Variable`/`NamedArray` before passing it to `apply_ufunc`.

```python
In [7]: from xarray.core.variable import Variable

In [8]: var = Variable(data=arr, dims=['time', 'space'])

In [9]: mean(var, dim='time')
Out[9]: 
<xarray.Variable (space: 4)> Size: 32B
array([4., 5., 6., 7.])

In [10]: mean(var.T, dim='time')
Out[10]: 
<xarray.Variable (space: 4)> Size: 32B
array([4., 5., 6., 7.])
```

This now guards against the transposition error, and puts the onus on the user to be clear about which axes of their array correspond to which dimension.

With `Variable`/`NamedArray` as public API, this latter pattern can handle every case that passing bare arrays in could.

I suggest we deprecate accepting bare arrays in favour of having users wrap them in `Variable`/`NamedArray`/`DataArray` objects instead.

(Note 1: We also accept raw scalars, but this doesn't expose anyone to transposition errors.)

(Note 2: In a quick scan of the `apply_ufunc` docstring, the docs on it in `computation.rst`, and the extensive guide that @dcherian wrote in the xarray tutorial repository, I can't see any examples that actually pass bare arrays to `apply_ufunc`.)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8995/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2276352251,I_kwDOAMm_X86HrmD7,8994,Improving performance of open_datatree,35968931,open,0,,,4,2024-05-02T19:43:17Z,2024-05-03T15:25:33Z,,MEMBER,,,,"### What is your issue?

The implementation of `open_datatree` works, but is inefficient, because it calls `open_dataset` once for every group in the file. We should refactor this to improve the performance, which would fix issues like https://github.com/xarray-contrib/datatree/issues/330.

We discussed this in the [datatree meeting](https://github.com/pydata/xarray/issues/8747), and my understanding is that concretely we need to:

- [ ] Create an asv benchmark for `open_datatree`, probably involving first writing then benchmarking the opening of a special netCDF file that has no data but lots of groups.
- [ ] Refactor the [`NetCDFDatastore`](https://github.com/pydata/xarray/blob/748bb3a328a65416022ec44ced8d461f143081b5/xarray/backends/netCDF4_.py#L319) class to only create one `CachingFileManager` object per file, not one per group, see https://github.com/pydata/xarray/blob/748bb3a328a65416022ec44ced8d461f143081b5/xarray/backends/netCDF4_.py#L406.
- [ ] Refactor `NetCDF4BackendEntrypoint.open_datatree` to use an implementation that goes through `NetCDFDatastore` without calling the top-level `xr.open_dataset` again.
- [ ] Check the performance of calling `xr.open_datatree` on a netCDF file has actually improved.

It would be great to get this done soon as part of the datatree integration project. @kmuehlbauer I know you were interested - are you willing / do you have time to take this task on?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8994/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2054280736,I_kwDOAMm_X856cdYg,8572,Track merging datatree into xarray,35968931,open,0,,,27,2023-12-22T17:37:20Z,2024-05-02T19:44:29Z,,MEMBER,,,,"### What is your issue?

Master issue to track progress of merging [xarray-datatree](https://github.com/xarray-contrib/datatree) into xarray `main`. Would close https://github.com/pydata/xarray/issues/4118 (and many similar issues), as well as one of the goals of our [development roadmap](https://docs.xarray.dev/en/stable/roadmap.html#tree-like-data-structure).

Also see the [project board for DataTree integration](https://github.com/pydata/xarray/projects/9).

---

On calls in the last few [dev meetings](https://github.com/pydata/xarray/issues/4001), we decided to forget about a temporary cross-repo `from xarray import datatree` (so this issue supercedes #7418), and just begin merging datatree into xarray main directly.

## Weekly meeting

See https://github.com/pydata/xarray/issues/8747

## Task list:

To happen in order:

- [x] **`open_datatree` in xarray.** This doesn't need to be performant initially, and ~~it would initially return a `datatree.DataTree` object.~~ EDIT: We decided it should return an `xarray.DataTree` object, or even `xarray.core.datatree.DataTree` object. So we can start by just copying the basic version in `datatree/io.py` right now which just calls `open_dataset` many times. #8697
- [x] **Triage and fix issues**: figure out which of the issues on xarray-contrib/datatree need to be fixed *before* the merge (if any).
- [ ] **Merge in code for `DataTree` class.** I suggest we do this by making one PR for each module, and ideally discussing and merging each before opening a PR for the next module. (Open to other workflow suggestions though.) The main aim here being lowering the bus factor on the code, confirming high-level design decisions, and improving details of the implementation as it goes in.
    
    Suggested order of modules to merge:
    - [x] `datatree/treenode.py` - defines the tree structure, without any dimensions/data attached, #8757
    - [x] `datatree/datatree.py` - adds data to the tree structure, #8789
    - [x] `datatree/iterators.py` - iterates over a single tree in various ways, currently copied from [anytree](https://github.com/c0fec0de/anytree), #8879
    - [x] `datatree/mapping.py` - implements `map_over_subtree` by iterating over N trees at once https://github.com/pydata/xarray/pull/8948,
    - [ ] `datatree/ops.py` - uses `map_over_subtree` to map methods like `.mean` over whole trees (https://github.com/pydata/xarray/pull/8976),
    - [x] `datatree/formatting_html.py` - HTML repr, works but could do with some [optimization](https://github.com/xarray-contrib/datatree/issues/206) https://github.com/pydata/xarray/pull/8930,
    - [x] `datatree/{extensions/common}.py` - miscellaneous other features e.g. attribute-like access (#8967).

- [ ] **Expose datatree API publicly.** Actually expose `open_datatree` and `DataTree` in xarray's public API as top-level imports. The full list of things to expose is:
  - [ ] `open_datatree`
  - [ ] `DataTree`
  - [ ] `map_over_subtree`
  - [ ] `assert_isomorphic`
  - [ ] `register_datatree_accessor`

- [ ] **Refactor class inheritance** - `Dataset`/`DataArray` share some mixin classes (e.g. `DataWithCoords`), and we could probably refactor `DataTree` to use these too. This is low-priority but would reduce code duplication.

Can happen basically at any time or maybe in parallel with other efforts:

- [ ] **Generalize backends to support groups.** Once a basic version of `xr.open_datatree` exists, we can start refactoring xarray's backend classes to support a general `Backend.open_datatree` method for any backend that can open multiple groups. Then we can make sure this is more performant than the naive implementation, i.e. only opening the file once. See also #8994.
- [ ] **Support backends other than netCDF and Zarr.** - e.g. grib, see https://github.com/pydata/xarray/pull/7437,
- [ ] **Support dask properly** - Issue https://github.com/xarray-contrib/datatree/pull/97 and the (stale) PR https://github.com/xarray-contrib/datatree/pull/196 are about dask parallelization over separate nodes in the tree.
- [ ] **Add other new high-level API methods** - Things like [`.reorder_nodes`](https://github.com/xarray-contrib/datatree/pull/271) and ideas we've only discussed like https://github.com/xarray-contrib/datatree/issues/79 and https://github.com/xarray-contrib/datatree/issues/254 (cc @dcherian who has had useful ideas here)
- [ ] **Copy xarray-contrib/datatree issues over to xarray's main repository.** I think this is quite important and worth doing as a record of why decisions were made. (@jhamman and @TomNicholas)
- [ ] Copy over any recent bug fixes from original `datatree` repository
- [x] **Look into merging commit history of xarray-contrib/datatree.** I think this would be cool but is less important than keeping the issues. (@jhamman suggested we could do this using some git wizardry that I hadn't heard of before)
- [ ] **`xarray.tutorial.open_datatree`** - I've been meaning to make a tutorial datatree object for ages. There's an [issue about it](https://github.com/xarray-contrib/datatree/issues/100), but actually now I think something close to the CMIP6 ensemble data that @jbusecke and I used in our [pangeo blog post](https://medium.com/pangeo/easy-ipcc-part-1-multi-model-datatree-469b87cf9114) would already be pretty good. Once we have this it becomes much easier to write docs about some advanced features.
- [ ] **Merge Docs** - I've tried to write these pages so that they should slot neatly into xarray's existing docs structure. Careful reading, additions and improvements would be great though. Summary of what docs exist on this issue https://github.com/xarray-contrib/datatree/issues/61
- [ ] Write a blog post on the [xarray blog](https://xarray.dev/blog) highlighting xarray's new functionality, and explicitly thanking the NASA team for their work. Doesn't have to be long, it can just point to the documentation.

---

Anyone is welcome to help with any of this, including but not limited to  @owenlittlejohns , @eni-awowale, @flamingbear (@etienneschalk maybe?).

cc also @shoyer @keewis for any thoughts as to the process.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8572/reactions"", ""total_count"": 7, ""+1"": 6, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 1}",,,13221727,issue
2019566184,I_kwDOAMm_X854YCJo,8494,Filter expected warnings in the test suite,35968931,closed,0,,,1,2023-11-30T21:50:15Z,2024-04-29T16:57:07Z,2024-04-29T16:56:16Z,MEMBER,,,,"FWIW one thing I'd be keen for to do generally — though maybe this isn't the place to start it — is handle warnings in the test suite when we add a new warning — i.e. filter them out where we expect them.

In this case, that would be the loading the netCDF files that have duplicate dims. 

Otherwise warnings become a huge block of text without much salience. I mostly see the 350 lines of them and think ""meh mostly units & cftime"", but then something breaks on a new upstream release that was buried in there, or we have a supported code path that is raising warnings internally.

(I'm not sure whether it's possible to generally enforce that — maybe we could raise on any warnings coming from within xarray? Would be a non-trivial project to get us there though...)

_Originally posted by @max-sixty in https://github.com/pydata/xarray/issues/8491#issuecomment-1834615826_
            ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8494/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
2253567622,I_kwDOAMm_X86GUraG,8959,Dataset constructor always coerces 1D data variables with same name as dim to coordinates,35968931,open,0,,,10,2024-04-19T17:54:28Z,2024-04-28T19:57:31Z,,MEMBER,,,,"### What is your issue?

Whilst xarray's data model appears to allow 1D data variables that have the same name as their dimension, it seems to be impossible to actually create this using the `Dataset` constructor, as they will always be converted to coordinate variables instead.

We can create a 1D data variable with the same name as it's dimension like this:
```python
In [9]: ds = xr.Dataset({'x': 0})

In [10]: ds
Out[10]: 
<xarray.Dataset> Size: 8B
Dimensions:  ()
Data variables:
    x        int64 8B 0

In [11]: ds.expand_dims('x')
Out[11]: 
<xarray.Dataset> Size: 8B
Dimensions:  (x: 1)
Dimensions without coordinates: x
Data variables:
    x        (x) int64 8B 0
```
so it seems to be a valid part of the data model.

But I can't get to that situation from the `Dataset` constructor. This should create the same dataset:

```python
In [15]: ds = xr.Dataset(data_vars={'x': ('x', [0])})

In [16]: ds
Out[16]: 
<xarray.Dataset> Size: 8B
Dimensions:  (x: 1)
Coordinates:
  * x        (x) int64 8B 0
Data variables:
    *empty*
```
But actually it makes `x` a coordinate variable (and implicitly creates a pandas Index for it). This means that in this case there is no difference between using the `data_vars` and `coords` kwargs to the constructor:

```python
ds = xr.Dataset(coords={'x': ('x', [0])})

In [18]: ds
Out[18]: 
<xarray.Dataset> Size: 8B
Dimensions:  (x: 1)
Coordinates:
  * x        (x) int64 8B 0
Data variables:
    *empty*
```

This all seems weird to me. I would have thought that if a 1D data variable is allowed, we shouldn't coerce to making it a coordinate variable in the constructor. If anything that's actively misleading.

Note that whilst this came up in the context of trying to avoid auto-creation of 1D indexes for coordinate variables, this issue is actually separate. (xref https://github.com/pydata/xarray/pull/8872#issuecomment-2027571714)

cc @benbovy who probably has thoughts","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8959/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2224036575,I_kwDOAMm_X86EkBrf,8905,Variable doesn't have an .expand_dims method,35968931,closed,0,,,4,2024-04-03T22:19:10Z,2024-04-28T19:54:08Z,2024-04-28T19:54:08Z,MEMBER,,,,"### Is your feature request related to a problem?

`DataArray` and `Dataset` have an `.expand_dims` method, but it looks like `Variable` doesn't. 

### Describe the solution you'd like

Variable should also have this method, the only difference being that it wouldn't create any coordinates or indexes.

### Describe alternatives you've considered

_No response_

### Additional context

_No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8905/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
2204768593,I_kwDOAMm_X86DahlR,8871,Concatenation automatically creates indexes where none existed,35968931,open,0,,,1,2024-03-25T02:43:31Z,2024-04-27T16:50:56Z,,MEMBER,,,,"### What happened?

Currently concatenation will automatically create indexes for any dimension coordinates in the output, even if there were no indexes on the input.

### What did you expect to happen?

Indexes not to be created for variables which did not already have them.

### Minimal Complete Verifiable Example

```Python
# TODO once passing indexes={} directly to DataArray constructor is allowed then no need to create coords object separately first
coords = Coordinates(
    {""x"": np.array([1, 2, 3])}, indexes={}
)
arrays = [
    DataArray(
        np.zeros((3, 3)),
        dims=[""x"", ""y""],
        coords=coords,
    )
    for _ in range(2)
]

combined = concat(arrays, dim=""x"")
assert combined.shape == (6, 3)
assert combined.dims == (""x"", ""y"")

# should not have auto-created any indexes
assert combined.indexes == {}  # this fails

combined = concat(arrays, dim=""z"")
assert combined.shape == (2, 3, 3)
assert combined.dims == (""z"", ""x"", ""y"")

# should not have auto-created any indexes
assert combined.indexes == {}  # this also fails
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

```Python
# nor have auto-created any indexes
>       assert combined.indexes == {}
E       AssertionError: assert Indexes:\n    x        Index([1, 2, 3, 1, 2, 3], dtype='int64', name='x') == {}
E         Full diff:
E         - {
E         -  ,
E         - }
E         + Indexes:
E         +     x        Index([1, 2, 3, 1, 2, 3], dtype='int64', name='x',
E         + )
```


### Anything else we need to know?

The culprit is [the call](https://github.com/pydata/xarray/blob/6af547cdd9beac3b18420ccb204f801603e11519/xarray/core/merge.py#L362) to `core.indexes.create_default_index_implicit` inside `merge.py`. If I comment out this call my concat test passes, but basic tests in `test_merge.py` start failing. 

I would like know to how to avoid the internal call to `create_default_index_implicit`. I tried passing `compat='override'` but that made no difference, so I think we would have to change `merge.collect_variables_and_indexes` somehow.

Conceptually, I would have thought we should be examining what indexes exist on the objects to be concatenated, and not creating new indexes for any variable that doesn't already have one. Presumably we should therefore be making use of the `indexes` argument to `merge.collect_variables_and_indexes`, but currently that just seems to be empty.

### Environment

I've been experimenting running this test on a branch that includes both #8711 and #8714, but actually this example will fail in the same way on `main`.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8871/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2259850888,I_kwDOAMm_X86GspaI,8966,HTML repr for chunked variables with high dimensionality,35968931,open,0,,,1,2024-04-23T22:00:40Z,2024-04-24T13:27:05Z,,MEMBER,,,,"### What is your issue?

The graphical representation of dask arrays with many dimensions can end up off the page in the HTML repr.

<img width=""839"" alt=""Screenshot 2024-04-23 at 6 00 04 PM"" src=""https://github.com/pydata/xarray/assets/35968931/81e86c80-1ab6-4136-bf3f-dce4475abf06"">

Ideally dask would worry about this for us, and we just use their `_inline_repr`, as mentioned here https://github.com/pydata/xarray/issues/4376#issuecomment-680296332","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8966/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1692904446,I_kwDOAMm_X85k56v-,7810,Generalize dask.delayed calls to go through ChunkManager,35968931,open,0,,,0,2023-05-02T18:30:32Z,2024-04-23T17:38:58Z,,MEMBER,,,,"> [Deepak: Should we add `chunked_array_type` and `from_array_kwargs` to `open_mfdataset`?

I actually don't think we need to - `from_array_kwargs` is only going to get directly passed down to `open_dataset`, and hence could be considered part of `**kwargs`.

This should actually just work, except in the case of `parallel=True`. For that we could add `delayed` to the `ChunkManager` ABC, so that if cubed does implement `cubed.delayed` it could be added, else a `NotImplementedError` would be raised. I think all of this wouldn't be necessary if we had lazy concatenation in xarray though (xref https://github.com/pydata/xarray/issues/4628). That suggestion would mean we should also replace other instances of `dask.delayed` in other parts of the codebase though... I think I will split this into a separate issue in the interests of getting this one merged.

_Originally posted by @TomNicholas in https://github.com/pydata/xarray/pull/7019#discussion_r1182904134_
            ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7810/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2134951079,I_kwDOAMm_X85_QMSn,8747,Datatree design discussions - weekly meeting,35968931,open,0,,,10,2024-02-14T18:39:16Z,2024-04-18T22:09:16Z,,MEMBER,,,,"### What is your issue?

In the [bi-weekly dev meeting](https://github.com/pydata/xarray/issues/4001) today we agreed that deliberate higher-level discussions of datatree's design would be useful. (i.e. we're not worried about our ability to write high-quality code, so let's focus review time more explicitly on the high-level design questions.)

This could take the form of me just talking through what I did in a certain part of the code and why, or a targeted discussion on specific design questions that I was never quite sure about. Some examples of the latter, as food for thought:
- [ ] Inheritance of dimension coordinates from parent nodes? https://github.com/xarray-contrib/datatree/issues/297
- [x] ~~Symbolic links? https://github.com/xarray-contrib/datatree/issues/5~~ (we decided this was overkill)
- [ ] Is `dt.ds` ugly? See also the difference between `dt.ds` and `dt.to_dataset()` https://github.com/xarray-contrib/datatree/issues/303#issuecomment-1917798769
- [ ] Which methods should map over the subtree and which shouldn't? (can't find the issue for this one)
- [ ] Ignore missing dims when mapping over subtree? https://github.com/xarray-contrib/datatree/issues/67
- [ ] API for sub-tree selection https://github.com/xarray-contrib/datatree/issues/254
- [ ] API for merging leaves https://github.com/xarray-contrib/datatree/issues/192
- [ ] Dict-like interface ambiguities https://github.com/xarray-contrib/datatree/issues/240
- [ ] The tree broadcasting rabbit hole https://github.com/xarray-contrib/datatree/issues/199
- [ ] Relationship between datatree and catalogs https://github.com/xarray-contrib/datatree/issues/134
- [ ] Should `xr.concat`/`xr.merge` accept `DataTree` objects? (and map over them by default?) Would help with https://github.com/TomNicholas/VirtualiZarr/issues/84#issuecomment-2065410549

There was also this [design doc](https://docs.google.com/document/d/19jVW5lL2jwhS0dgj9XqPBrcvIa13cpWrnDsnVLqZkfc/edit?usp=sharing) I wrote at one point

@flamingbear are you free at 11:30am EST on Tuesday each week? @shoyer, @keewis and I are all free then. Others also welcome (e.g. @owenlittlejohns , @eni-awowale, @etienneschalk), but not required :)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8747/reactions"", ""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2247043809,I_kwDOAMm_X86F7yrh,8949,Mapping DataTree methods over nodes with variables for which the args are invalid,35968931,open,0,,,0,2024-04-16T23:45:26Z,2024-04-17T14:58:14Z,,MEMBER,,,,"### What is your issue?

In the datatree call today we narrowed down an issue with how datatree maps methods over many variables in many nodes. This issue is essentially https://github.com/xarray-contrib/datatree/issues/67, but I'll attempt to discuss the problem and solution in more general terms.

### Context in xarray

`xarray.Dataset` is essentially a mapping of variable names to `Variable` objects, and most `Dataset` methods implicitly map a method defined on Variable over all these variables (e.g. `.mean()`). Sometimes the mapped method can be naively applied to every variable in the dataset, but sometimes it doesn't make sense to apply it to some of the variables. For example `.mean(dim='time')` only makes sense for the variables in the dataset that actually have a `time` dimension.

`xarray.Dataset` handles this for the user by either working out what version of the method does make sense for that variable (e.g. only trying to take the mean along the reduction dimensions actually present on that variable), or just passing the variable through unaltered. There are some weird subtleties lurking here, e.g. with statistical reductions like `std` and `var`.

https://github.com/pydata/xarray/blob/239309f881ba0d7e02280147bc443e6e286e6a63/xarray/core/dataset.py#L6853

There is therefore a difference between

`ds.map(Variable.{REDUCTION}, dim='time')` and `ds.{REDUCTION}(dim='time')`

For example:

```python
In [13]: ds = xr.Dataset({'a': ('x', [1, 2]), 'b': 0})

In [14]: ds.isel(x=0)
Out[14]: 
<xarray.Dataset> Size: 16B
Dimensions:  ()
Data variables:
    a        int64 8B 1
    b        int64 8B 0

In [15]: ds.map(Variable.isel, x=0)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[15], line 1
----> 1 ds.map(Variable.isel, x=0)

...

ValueError: Dimensions {'x'} do not exist. Expected one or more of ()
```

(Aside: It would be nice for `Dataset.map` to include information about which variable it raised an exception on in the error message.)

Clearly `Dataset.isel` does more than just applying `Variable.isel` using `Dataset.map`. 

### Issue in DataTree

In datatree we have to map methods over different variables in the same node, but also over different variables in different nodes. Currently the implementation of a method naively maps the `Dataset` method over every node using `map_over_subtree`, but if there is a node containing a variable for which the method args are invalid, it will raise an exception.

This causes problems for users, for example in https://github.com/xarray-contrib/datatree/issues/67. A minimal example of this problem would be

```python
In [18]: ds1 = xr.Dataset({'a': ('x', [1, 2])})

In [19]: ds2 = xr.Dataset({'b': 0})

In [20]: dt = DataTree.from_dict({'node1': ds1, 'node2': ds2})

In [21]: dt
Out[21]: 
DataTree('None', parent=None)
├── DataTree('node1')
│       Dimensions:  (x: 2)
│       Dimensions without coordinates: x
│       Data variables:
│           a        (x) int64 16B 1 2
└── DataTree('node2')
        Dimensions:  ()
        Data variables:
            b        int64 8B 0

In [22]: dt.isel(x=0)
```
```
ValueError: Dimensions {'x'} do not exist. Expected one or more of FrozenMappingWarningOnValuesAccess({})
Raised whilst mapping function over node with path /node2
```

(The slightly weird error message here is related to the deprecation cycle in #8500)

We would have preferred that variable `b` in `node2` survived unchanged, like it does in the pure `Dataset` example.

### Desired behaviour

We can _kind_ of think of the desired behaviour like a hypothesis property we want (xref https://github.com/pydata/xarray/issues/1846), but not quite. It would be something like

```python
dt.{REDUCTION}().flatten_into_dataset() == dt.flatten_into_dataset().{REDUCTION}()
```

except that `.flatten_into_dataset()` can't really exist for all cases otherwise we wouldn't need datatree.

### Proposed Solution

There are two ways I can imagine implementing this. 
1) Use `map_over_subtree` the apply the method as-is and try to catch known possible `KeyErrors` for missing dimensions. This would be fragile.
2) Do some kind of pre-checking of the data in the tree, potentially adjust the method before applying it using `map_over_subtree`.

I think @shoyer and I concluded that we should make (2), in the form of some kind of new primitive, i.e. `DataTree.reduce`. (Actually `DataTree.reduce` already exists, but should be changed to not just `map_over_subtree` `Dataset.reduce`). Taking after `Dataset.reduce`, it would look something like this:

```python
class DataTree:
    def reduce(self, reduce_func: Callable, dim: Dims = None, *, **kwargs) -> DataTree:
        all_dims_in_tree = set(node.dims for node in self.subtree)

        missing_dims = tuple(d for d in dims if d not in all_dims_in_tree)
        if missing_dims:
            raise ValueError()

        # TODO this could probably be refactored to call `map_over_subtree`
        for node in self.subtree:
            # using only the reduction dims that are actually present here would fix datatree GH issue #67
            reduce_dims = [d for d in node.dims if d in dims]
            result = node.ds.reduce(func, dims=reduce_dims, **kwargs)

        # TODO build the result and return it
```

Then every method that has this pattern of acting over one or more dims should be mapped over the tree using `DataTree.reduce`, not `map_over_subtree`.

cc @shoyer, @flamingbear, @owenlittlejohns ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8949/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2198196326,I_kwDOAMm_X86DBdBm,8860,Ugly error in constructor when no data passed,35968931,closed,0,,,2,2024-03-20T17:55:52Z,2024-04-10T22:46:55Z,2024-04-10T22:46:54Z,MEMBER,,,,"### What happened?

Passing no data to the `Dataset` constructor can result in a very unhelpful ""tuple index out of range"" error when this is a clear case of malformed input that we should be able to catch.

### What did you expect to happen?

An error more like ""tuple must be of form (dims, data[, attrs])""

### Minimal Complete Verifiable Example

```Python
xr.Dataset({""t"": ()})
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

```Python
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[2], line 1
----> 1 xr.Dataset({""t"": ()})

File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:693, in Dataset.__init__(self, data_vars, coords, attrs)
    690 if isinstance(coords, Dataset):
    691     coords = coords._variables
--> 693 variables, coord_names, dims, indexes, _ = merge_data_and_coords(
    694     data_vars, coords
    695 )
    697 self._attrs = dict(attrs) if attrs else None
    698 self._close = None

File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:422, in merge_data_and_coords(data_vars, coords)
    418     coords = create_coords_with_default_indexes(coords, data_vars)
    420 # exclude coords from alignment (all variables in a Coordinates object should
    421 # already be aligned together) and use coordinates' indexes to align data_vars
--> 422 return merge_core(
    423     [data_vars, coords],
    424     compat=""broadcast_equals"",
    425     join=""outer"",
    426     explicit_coords=tuple(coords),
    427     indexes=coords.xindexes,
    428     priority_arg=1,
    429     skip_align_args=[1],
    430 )

File ~/Documents/Work/Code/xarray/xarray/core/merge.py:718, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value, skip_align_args)
    715 for pos, obj in skip_align_objs:
    716     aligned.insert(pos, obj)
--> 718 collected = collect_variables_and_indexes(aligned, indexes=indexes)
    719 prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)
    720 variables, out_indexes = merge_collected(
    721     collected, prioritized, compat=compat, combine_attrs=combine_attrs
    722 )

File ~/Documents/Work/Code/xarray/xarray/core/merge.py:358, in collect_variables_and_indexes(list_of_mappings, indexes)
    355     indexes_.pop(name, None)
    356     append_all(coords_, indexes_)
--> 358 variable = as_variable(variable, name=name, auto_convert=False)
    359 if name in indexes:
    360     append(name, variable, indexes[name])

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:126, in as_variable(obj, name, auto_convert)
    124     obj = obj.copy(deep=False)
    125 elif isinstance(obj, tuple):
--> 126     if isinstance(obj[1], DataArray):
    127         raise TypeError(
    128             f""Variable {name!r}: Using a DataArray object to construct a variable is""
    129             "" ambiguous, please extract the data using the .data property.""
    130         )
    131     try:

IndexError: tuple index out of range
```


### Anything else we need to know?

_No response_

### Environment

Xarray `main`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8860/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
2212186122,I_kwDOAMm_X86D20gK,8883,Coordinates object permits invalid state,35968931,closed,0,,,2,2024-03-28T01:49:21Z,2024-03-28T16:28:11Z,2024-03-28T16:28:11Z,MEMBER,,,,"### What happened?

It is currently possible to create a `Coordinates` object where a variable shares a name with a dimension, but the variable is not 1D. This is explicitly forbidden by the xarray data model.


### What did you expect to happen?

If you try to pass the resulting object into the `Dataset` constructor you get the expected error telling you that this is forbidden, but that error should have been raised by `Coordinates.__init__`.

### Minimal Complete Verifiable Example

```Python
In [1]: from xarray.core.coordinates import Coordinates

In [2]: from xarray.core.variable import Variable

In [4]: import numpy as np

In [5]: var = Variable(data=np.arange(6).reshape(2, 3), dims=['x', 'y'])

In [6]: var
Out[6]: 
<xarray.Variable (x: 2, y: 3)> Size: 48B
array([[0, 1, 2],
       [3, 4, 5]])

In [7]: coords = Coordinates(coords={'x': var}, indexes={})

In [8]: coords
Out[8]: 
Coordinates:
    x        (x, y) int64 48B 0 1 2 3 4 5

In [10]: import xarray as xr

In [11]: ds = xr.Dataset(coords=coords)
---------------------------------------------------------------------------
MergeError                                Traceback (most recent call last)
Cell In[11], line 1
----> 1 ds = xr.Dataset(coords=coords)

File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:693, in Dataset.__init__(self, data_vars, coords, attrs)
    690 if isinstance(coords, Dataset):
    691     coords = coords._variables
--> 693 variables, coord_names, dims, indexes, _ = merge_data_and_coords(
    694     data_vars, coords
    695 )
    697 self._attrs = dict(attrs) if attrs else None
    698 self._close = None

File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:422, in merge_data_and_coords(data_vars, coords)
    418     coords = create_coords_with_default_indexes(coords, data_vars)
    420 # exclude coords from alignment (all variables in a Coordinates object should
    421 # already be aligned together) and use coordinates' indexes to align data_vars
--> 422 return merge_core(
    423     [data_vars, coords],
    424     compat=""broadcast_equals"",
    425     join=""outer"",
    426     explicit_coords=tuple(coords),
    427     indexes=coords.xindexes,
    428     priority_arg=1,
    429     skip_align_args=[1],
    430 )

File ~/Documents/Work/Code/xarray/xarray/core/merge.py:731, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value, skip_align_args)
    729     coord_names.intersection_update(variables)
    730 if explicit_coords is not None:
--> 731     assert_valid_explicit_coords(variables, dims, explicit_coords)
    732     coord_names.update(explicit_coords)
    733 for dim, size in dims.items():

File ~/Documents/Work/Code/xarray/xarray/core/merge.py:577, in assert_valid_explicit_coords(variables, dims, explicit_coords)
    575 for coord_name in explicit_coords:
    576     if coord_name in dims and variables[coord_name].dims != (coord_name,):
--> 577         raise MergeError(
    578             f""coordinate {coord_name} shares a name with a dataset dimension, but is ""
    579             ""not a 1D variable along that dimension. This is disallowed ""
    580             ""by the xarray data model.""
    581         )

MergeError: coordinate x shares a name with a dataset dimension, but is not a 1D variable along that dimension. This is disallowed by the xarray data model.
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [x] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

_No response_

### Anything else we need to know?

I noticed this whilst working on #8872

### Environment

`main`","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8883/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
2117248281,I_kwDOAMm_X85-MqUZ,8704,Currently no way to create a Coordinates object without indexes for 1D variables,35968931,closed,0,,,4,2024-02-04T18:30:18Z,2024-03-26T13:50:16Z,2024-03-26T13:50:15Z,MEMBER,,,,"### What happened?

The workaround described in https://github.com/pydata/xarray/pull/8107#discussion_r1311214263 does not seem to work on `main`, meaning that I think there is currently no way to create an `xr.Coordinates` object without 1D variables being coerced to indexes. This means there is no way to create a `Dataset` object without 1D variables becoming `IndexVariables` being coerced to indexes.

### What did you expect to happen?

I expected to at least be able to use the workaround described in https://github.com/pydata/xarray/pull/8107#discussion_r1311214263, i.e.

```python
xr.Coordinates({'x': ('x', uarr)}, indexes={})
```
where `uarr` is an un-indexable array-like.

### Minimal Complete Verifiable Example

```Python
class UnindexableArrayAPI:
    ...


class UnindexableArray:
    """"""
    Presents like an N-dimensional array but doesn't support changes of any kind, 
    nor can it be coerced into a np.ndarray or pd.Index.
    """"""
    
    _shape: tuple[int, ...]
    _dtype: np.dtype
    
    def __init__(self, shape: tuple[int, ...], dtype: np.dtype) -> None:
        self._shape = shape
        self._dtype = dtype
        self.__array_namespace__ = UnindexableArrayAPI

    @property
    def dtype(self) -> np.dtype:
        return self._dtype
    
    @property
    def shape(self) -> tuple[int, ...]:
        return self._shape
    
    @property
    def ndim(self) -> int:
        return len(self.shape)

    @property
    def size(self) -> int:
        return np.prod(self.shape)

    @property
    def T(self) -> Self:
        raise NotImplementedError()

    def __repr__(self) -> str:
        return f""UnindexableArray(shape={self.shape}, dtype={self.dtype})""

    def _repr_inline_(self, max_width):
        """"""
        Format to a single line with at most max_width characters. Used by xarray.
        """"""
        return self.__repr__()

    def __getitem__(self, key, /) -> Self:
        """"""
        Only supports extremely limited indexing.
        
        I only added this method because xarray will apparently attempt to index into its lazy indexing classes even if the operation would be a no-op anyway.
        """"""
        from xarray.core.indexing import BasicIndexer
        
        if isinstance(key, BasicIndexer) and key.tuple == ((slice(None),) * self.ndim):
            # no-op
            return self
        else:
            raise NotImplementedError()

    def __array__(self) -> np.ndarray:
        raise NotImplementedError(""UnindexableArrays can't be converted into numpy arrays or pandas Index objects"")
```

```python
uarr = UnindexableArray(shape=(3,), dtype=np.dtype('int32'))

xr.Variable(data=uarr, dims=['x'])  # works fine

xr.Coordinates({'x': ('x', uarr)}, indexes={})  # works in xarray v2023.08.0
```
but in versions after that it triggers the NotImplementedError in `__array__`:
```python
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[59], line 1
----> 1 xr.Coordinates({'x': ('x', uarr)}, indexes={})

File ~/Documents/Work/Code/xarray/xarray/core/coordinates.py:301, in Coordinates.__init__(self, coords, indexes)
    299 variables = {}
    300 for name, data in coords.items():
--> 301     var = as_variable(data, name=name)
    302     if var.dims == (name,) and indexes is None:
    303         index, index_vars = create_default_index_implicit(var, list(coords))

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:159, in as_variable(obj, name)
    152     raise TypeError(
    153         f""Variable {name!r}: unable to convert object into a variable without an ""
    154         f""explicit list of dimensions: {obj!r}""
    155     )
    157 if name is not None and name in obj.dims and obj.ndim == 1:
    158     # automatically convert the Variable into an Index
--> 159     obj = obj.to_index_variable()
    161 return obj

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:572, in Variable.to_index_variable(self)
    570 def to_index_variable(self) -> IndexVariable:
    571     """"""Return this variable as an xarray.IndexVariable""""""
--> 572     return IndexVariable(
    573         self._dims, self._data, self._attrs, encoding=self._encoding, fastpath=True
    574     )

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:2642, in IndexVariable.__init__(self, dims, data, attrs, encoding, fastpath)
   2640 # Unlike in Variable, always eagerly load values into memory
   2641 if not isinstance(self._data, PandasIndexingAdapter):
-> 2642     self._data = PandasIndexingAdapter(self._data)

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:1481, in PandasIndexingAdapter.__init__(self, array, dtype)
   1478 def __init__(self, array: pd.Index, dtype: DTypeLike = None):
   1479     from xarray.core.indexes import safe_cast_to_index
-> 1481     self.array = safe_cast_to_index(array)
   1483     if dtype is None:
   1484         self._dtype = get_valid_numpy_dtype(array)

File ~/Documents/Work/Code/xarray/xarray/core/indexes.py:469, in safe_cast_to_index(array)
    459             emit_user_level_warning(
    460                 (
    461                     ""`pandas.Index` does not support the `float16` dtype.""
   (...)
    465                 category=DeprecationWarning,
    466             )
    467             kwargs[""dtype""] = ""float64""
--> 469     index = pd.Index(np.asarray(array), **kwargs)
    471 return _maybe_cast_to_cftimeindex(index)

Cell In[55], line 63, in UnindexableArray.__array__(self)
     62 def __array__(self) -> np.ndarray:
---> 63     raise NotImplementedError(""UnindexableArrays can't be converted into numpy arrays or pandas Index objects"")

NotImplementedError: UnindexableArrays can't be converted into numpy arrays or pandas Index objects
```


### MVCE confirmation

- [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [x] Complete example — the example is self-contained, including all data and the text of any traceback.
- [x] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [x] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [x] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

_No response_

### Anything else we need to know?

Context is #8699

### Environment

Versions described above
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8704/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1247010680,I_kwDOAMm_X85KU994,6633,Opening dataset without loading any indexes?,35968931,open,0,,,10,2022-05-24T19:06:09Z,2024-02-23T05:36:53Z,,MEMBER,,,,"### Is your feature request related to a problem?

Within pangeo-forge's internals we would like to call `open_dataset`, then `to_dict()`, and end up with a schema-like representation of the contents of the dataset. This works, but it also has the side-effect of loading all indexes into memory, even if we are loading the data values ""lazily"". 

### Describe the solution you'd like

@benbovy do you think it would be possible to (perhaps optionally) also avoid loading indexes upon opening a dataset, so that we actually don't load anything? The end result would act a bit like `ncdump` does.

### Describe alternatives you've considered

Otherwise we might have to try using xarray-schema or something but the suggestion here would be much neater and more flexible.

xref: https://github.com/pangeo-forge/pangeo-forge-recipes/issues/256

cc @rabernat @jhamman @cisaacstern ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6633/reactions"", ""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1912094632,I_kwDOAMm_X85x-D-o,8231,xr.concat concatenates along dimensions that it wasn't asked to,35968931,open,0,,,4,2023-09-25T18:50:29Z,2024-02-14T20:30:26Z,,MEMBER,,,,"### What happened?

Here are two toy datasets designed to represent sections of a dataset that has variables living on a staggered grid. This type of dataset is common in fluid modelling (it's why xGCM exists).

```python
import xarray as xr

ds1 = xr.Dataset(
    coords={
        'x_center': ('x_center', [1, 2, 3]),
        'x_outer':  ('x_outer',  [0.5, 1.5, 2.5, 3.5]),  
    },
)

ds2 = xr.Dataset(
    coords={
        'x_center': ('x_center', [4, 5, 6]),
        'x_outer':  ('x_outer',  [4.5, 5.5, 6.5]),  
    },
)
```

Calling `xr.concat` on these with `dim='x_center'` happily concatenates them
```python
xr.concat([ds1, ds2], dim='x_center')
```
```
<xarray.Dataset>
Dimensions:   (x_outer: 7, x_center: 6)
Coordinates:
  * x_outer   (x_outer) float64 0.5 1.5 2.5 3.5 4.5 5.5 6.5
  * x_center  (x_center) int64 1 2 3 4 5 6
Data variables:
    *empty*
```
but notice that the returned result has been concatenated along *both* `x_center` and `x_outer`.

### What did you expect to happen?

I did not expect this to work. I definitely didn't expect the datasets to be concatenated along a dimension I didn't ask them to be concatenated along (i.e. `x_outer`).

What I expected to happen was that (as by default `coords='different'`) both variables would be attempted to be concatenated along the `x_center` dimension, which would have succeeded for the `x_center` variable but failed for the `x_outer` variable. Indeed, if I name the variables differently so that they are no longer coordinate variables then that is what happens:

```python
import xarray as xr

ds1 = xr.Dataset(
    data_vars={
        'a': ('x_center', [1, 2, 3]),
        'b':  ('x_outer',  [0.5, 1.5, 2.5, 3.5]),  
    },
)

ds2 = xr.Dataset(
    data_vars={
        'a': ('x_center', [4, 5, 6]),
        'b':  ('x_outer',  [4.5, 5.5, 6.5]),  
    },
)
```
```python
xr.concat([ds1, ds2], dim='x_center', data_vars='different') 
```
```
ValueError: cannot reindex or align along dimension 'x_outer' because of conflicting dimension sizes: {3, 4}
```

### Minimal Complete Verifiable Example

_No response_

### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

_No response_

### Anything else we need to know?

I was trying to create an example for which you would need the automatic combined concat/merge that happens within `xr.combine_by_coords`.

### Environment

xarray `2023.8.0`","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8231/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2098882374,I_kwDOAMm_X859GmdG,8660,dtype encoding ignored during IO?,35968931,closed,0,,,3,2024-01-24T18:50:47Z,2024-02-05T17:35:03Z,2024-02-05T17:35:02Z,MEMBER,,,,"### What happened?

When I set the `.encoding['dtype']` attribute before saving a to disk, the actual on-disk representation appears to store a record of the dtype encoding, but when opening it back up in xarray I get the same dtype I had before, not the one specified in the encoding. Is that what's supposed to happen? How does this work? (This happens with both zarr and netCDF.)

### What did you expect to happen?

I expected that setting `.encoding['dtype']` would mean that once I open the data back up, it would be in the new dtype that I set in the encoding.

### Minimal Complete Verifiable Example

```Python
air = xr.tutorial.open_dataset('air_temperature')

air['air'].dtype  # returns dtype('float32')

air['air'].encoding['dtype']  # returns dtype('int16'), which already seems weird

air.to_zarr('air.zarr')  # I would assume here that the encoding actually does something during IO

# now if I check the zarr `.zarray` metadata for the `air` variable it says 
`""dtype"": `""<i2""`

air2 = xr.open_dataset('air.zarr', engine='zarr')  # open it back up

air2['air'].dtype  # returns dtype('float32'), but I expected dtype('int16')


(the same thing happens also with saving to netCDF instead of Zarr)
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

_No response_

### Anything else we need to know?

I know I didn't explicitly cast with `.asdtype`, but I'm still confused as to what the relation between the dtype encoding is supposed to be here.

I am probably just misunderstanding how this is supposed to work, but then this is arguably a docs issue, because [here it says]( https://xarray.pydata.org/en/v2024.01.1/user-guide/io.html#scaling-and-type-conversions) ""[the encoding dtype field] controls the type of the data written on disk"", which I would have thought also affects the data you get back when you open it up again?

### Environment

`main` branch of xarray","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8660/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
2116695961,I_kwDOAMm_X85-KjeZ,8699,Wrapping a `kerchunk.Array` object directly with xarray,35968931,open,0,,,3,2024-02-03T22:15:07Z,2024-02-04T21:15:14Z,,MEMBER,,,,"### What is your issue?

In https://github.com/fsspec/kerchunk/issues/377 the idea came up of using the xarray API to concatenate arrays which represent parts of a zarr store - i.e. using xarray to kerchunk a large set of netCDF files instead of using `kerchunk.combine.MultiZarrToZarr`.

The [idea](https://github.com/fsspec/kerchunk/issues/377#issuecomment-1922688615) is to make something like this work for kerchunking sets of netCDF files into zarr stores

```python
ds = xr.open_mfdataset(
    '/my/files*.nc'
    engine='kerchunk',  # kerchunk registers an xarray IO backend that returns zarr.Array objects
    combine='nested',  # 'by_coords' would require actually reading coordinate data
    parallel=True,  # would use dask.delayed to generate reference dicts for each file in parallel
)

ds  # now wraps a bunch of zarr.Array / kerchunk.Array objects, no need for dask arrays

ds.kerchunk.to_zarr(store='out.zarr')  # kerchunk defines an xarray accessor that extracts the zarr arrays and serializes them (which could also be done in parallel if writing to parquet)
```

I had a go at doing this [in this notebook](https://gist.github.com/TomNicholas/d9eb8ac81d3fd214a23b5e921dbd72b7), and in doing so discovered a few potential issues with xarray's internals.

For this to work xarray has to:
- Wrap a `kerchunk.Array` object which barely defines any array API methods, including basically not supporting indexing at all,
- Store all the information present in a kerchunked Zarr store but without ever loading any data,
- Not create any indexes by default during dataset construction or during `xr.concat`,
- Not try to do anything else that can't be defined for a `kerchunk.Array`.
- Possibly we need the Lazy Indexing classes to support concatenation https://github.com/pydata/xarray/issues/4628

It's an interesting exercise in using xarray as an abstraction, with no access to real numerical values at all.

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8699/reactions"", ""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 1}",,,13221727,issue
2099530269,I_kwDOAMm_X859JEod,8665,Error when broadcasting array API compliant class,35968931,closed,0,,,1,2024-01-25T04:11:14Z,2024-01-26T16:41:31Z,2024-01-26T16:41:31Z,MEMBER,,,,"### What happened?

Broadcasting fails for array types that strictly follow the array API standard.

### What did you expect to happen?

With a normal numpy array this obviously works fine.

### Minimal Complete Verifiable Example

```Python
import numpy.array_api as nxp

arr = nxp.asarray([[1, 2, 3], [4, 5, 6]], dtype=np.dtype('float32'))

var = xr.Variable(data=arr, dims=['x', 'y'])

var.isel(x=0)  # this is fine

var * var.isel(x=0)  # this is not

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[31], line 1
----> 1 var * var.isel(x=0)

File ~/Documents/Work/Code/xarray/xarray/core/_typed_ops.py:487, in VariableOpsMixin.__mul__(self, other)
    486 def __mul__(self, other: VarCompatible) -> Self | T_DataArray:
--> 487     return self._binary_op(other, operator.mul)

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:2406, in Variable._binary_op(self, other, f, reflexive)
   2404     other_data, self_data, dims = _broadcast_compat_data(other, self)
   2405 else:
-> 2406     self_data, other_data, dims = _broadcast_compat_data(self, other)
   2407 keep_attrs = _get_keep_attrs(default=False)
   2408 attrs = self._attrs if keep_attrs else None

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:2922, in _broadcast_compat_data(self, other)
   2919 def _broadcast_compat_data(self, other):
   2920     if all(hasattr(other, attr) for attr in [""dims"", ""data"", ""shape"", ""encoding""]):
   2921         # `other` satisfies the necessary Variable API for broadcast_variables
-> 2922         new_self, new_other = _broadcast_compat_variables(self, other)
   2923         self_data = new_self.data
   2924         other_data = new_other.data

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:2899, in _broadcast_compat_variables(*variables)
   2893 """"""Create broadcast compatible variables, with the same dimensions.
   2894 
   2895 Unlike the result of broadcast_variables(), some variables may have
   2896 dimensions of size 1 instead of the size of the broadcast dimension.
   2897 """"""
   2898 dims = tuple(_unified_dims(variables))
-> 2899 return tuple(var.set_dims(dims) if var.dims != dims else var for var in variables)

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:2899, in <genexpr>(.0)
   2893 """"""Create broadcast compatible variables, with the same dimensions.
   2894 
   2895 Unlike the result of broadcast_variables(), some variables may have
   2896 dimensions of size 1 instead of the size of the broadcast dimension.
   2897 """"""
   2898 dims = tuple(_unified_dims(variables))
-> 2899 return tuple(var.set_dims(dims) if var.dims != dims else var for var in variables)

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:1479, in Variable.set_dims(self, dims, shape)
   1477     expanded_data = duck_array_ops.broadcast_to(self.data, tmp_shape)
   1478 else:
-> 1479     expanded_data = self.data[(None,) * (len(expanded_dims) - self.ndim)]
   1481 expanded_var = Variable(
   1482     expanded_dims, expanded_data, self._attrs, self._encoding, fastpath=True
   1483 )
   1484 return expanded_var.transpose(*dims)

File ~/miniconda3/envs/dev3.11/lib/python3.12/site-packages/numpy/array_api/_array_object.py:555, in Array.__getitem__(self, key)
    550 """"""
    551 Performs the operation __getitem__.
    552 """"""
    553 # Note: Only indices required by the spec are allowed. See the
    554 # docstring of _validate_index
--> 555 self._validate_index(key)
    556 if isinstance(key, Array):
    557     # Indexing self._array with array_api arrays can be erroneous
    558     key = key._array

File ~/miniconda3/envs/dev3.11/lib/python3.12/site-packages/numpy/array_api/_array_object.py:348, in Array._validate_index(self, key)
    344 elif n_ellipsis == 0:
    345     # Note boolean masks must be the sole index, which we check for
    346     # later on.
    347     if not key_has_mask and n_single_axes < self.ndim:
--> 348         raise IndexError(
    349             f""{self.ndim=}, but the multi-axes index only specifies ""
    350             f""{n_single_axes} dimensions. If this was intentional, ""
    351             ""add a trailing ellipsis (...) which expands into as many ""
    352             ""slices (:) as necessary - this is what np.ndarray arrays ""
    353             ""implicitly do, but such flat indexing behaviour is not ""
    354             ""specified in the Array API.""
    355         )
    357 if n_ellipsis == 0:
    358     indexed_shape = self.shape

IndexError: self.ndim=1, but the multi-axes index only specifies 0 dimensions. If this was intentional, add a trailing ellipsis (...) which expands into as many slices (:) as necessary - this is what np.ndarray arrays implicitly do, but such flat indexing behaviour is not specified in the Array API.
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

_No response_

### Anything else we need to know?

_No response_

### Environment

main branch of xarray, numpy 1.26.0","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8665/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
2099550299,I_kwDOAMm_X859JJhb,8666,Error unstacking array API compliant class,35968931,closed,0,,,0,2024-01-25T04:35:09Z,2024-01-26T16:06:02Z,2024-01-26T16:06:02Z,MEMBER,,,,"### What happened?

Unstacking fails for array types that strictly follow the array API standard.

### What did you expect to happen?

This obviously works fine with a normal numpy array.

### Minimal Complete Verifiable Example

```Python
import numpy.array_api as nxp

arr = nxp.asarray([[1, 2, 3], [4, 5, 6]], dtype=np.dtype('float32'))

da = xr.DataArray(
    arr,
    coords=[(""x"", [""a"", ""b""]), (""y"", [0, 1, 2])],
)
da
stacked = da.stack(z=(""x"", ""y""))
stacked.indexes[""z""]
stacked.unstack()

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[65], line 8
      6 stacked = da.stack(z=(""x"", ""y""))
      7 stacked.indexes[""z""]
----> 8 roundtripped = stacked.unstack()
      9 arr.identical(roundtripped)

File ~/Documents/Work/Code/xarray/xarray/util/deprecation_helpers.py:115, in _deprecate_positional_args.<locals>._decorator.<locals>.inner(*args, **kwargs)
    111     kwargs.update({name: arg for name, arg in zip_args})
    113     return func(*args[:-n_extra_args], **kwargs)
--> 115 return func(*args, **kwargs)

File ~/Documents/Work/Code/xarray/xarray/core/dataarray.py:2913, in DataArray.unstack(self, dim, fill_value, sparse)
   2851 @_deprecate_positional_args(""v2023.10.0"")
   2852 def unstack(
   2853     self,
   (...)
   2857     sparse: bool = False,
   2858 ) -> Self:
   2859     """"""
   2860     Unstack existing dimensions corresponding to MultiIndexes into
   2861     multiple new dimensions.
   (...)
   2911     DataArray.stack
   2912     """"""
-> 2913     ds = self._to_temp_dataset().unstack(dim, fill_value=fill_value, sparse=sparse)
   2914     return self._from_temp_dataset(ds)

File ~/Documents/Work/Code/xarray/xarray/util/deprecation_helpers.py:115, in _deprecate_positional_args.<locals>._decorator.<locals>.inner(*args, **kwargs)
    111     kwargs.update({name: arg for name, arg in zip_args})
    113     return func(*args[:-n_extra_args], **kwargs)
--> 115 return func(*args, **kwargs)

File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:5581, in Dataset.unstack(self, dim, fill_value, sparse)
   5579 for d in dims:
   5580     if needs_full_reindex:
-> 5581         result = result._unstack_full_reindex(
   5582             d, stacked_indexes[d], fill_value, sparse
   5583         )
   5584     else:
   5585         result = result._unstack_once(d, stacked_indexes[d], fill_value, sparse)

File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:5474, in Dataset._unstack_full_reindex(self, dim, index_and_vars, fill_value, sparse)
   5472 if name not in index_vars:
   5473     if dim in var.dims:
-> 5474         variables[name] = var.unstack({dim: new_dim_sizes})
   5475     else:
   5476         variables[name] = var

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:1684, in Variable.unstack(self, dimensions, **dimensions_kwargs)
   1682 result = self
   1683 for old_dim, dims in dimensions.items():
-> 1684     result = result._unstack_once_full(dims, old_dim)
   1685 return result

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:1574, in Variable._unstack_once_full(self, dim, old_dim)
   1571 reordered = self.transpose(*dim_order)
   1573 new_shape = reordered.shape[: len(other_dims)] + new_dim_sizes
-> 1574 new_data = reordered.data.reshape(new_shape)
   1575 new_dims = reordered.dims[: len(other_dims)] + new_dim_names
   1577 return type(self)(
   1578     new_dims, new_data, self._attrs, self._encoding, fastpath=True
   1579 )

AttributeError: 'Array' object has no attribute 'reshape'
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

_No response_

### Anything else we need to know?

It fails on the `arr.reshape` call, because the array API standard has reshape be a function, not a method. 

We do in fact have an array API-compatible version of `reshape` defined in `duck_array_ops.py`, it just apparently isn't yet used everywhere we call reshape.

https://github.com/pydata/xarray/blob/037a39e249e5387bc15de447c57bfd559fd5a574/xarray/core/duck_array_ops.py#L363

### Environment

main branch of xarray, numpy 1.26.0","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8666/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
2099591300,I_kwDOAMm_X859JTiE,8667,Error using vectorized indexing with array API compliant class,35968931,open,0,,,0,2024-01-25T05:20:31Z,2024-01-25T16:07:12Z,,MEMBER,,,,"### What happened?

Vectorized indexing can fail for array types that strictly follow the array API standard.

### What did you expect to happen?

Vectorized indexing to all work.

### Minimal Complete Verifiable Example

```Python
import numpy.array_api as nxp

da = xr.DataArray(
    nxp.reshape(nxp.arange(12), (3, 4)),
    dims=[""x"", ""y""],
    coords={""x"": [0, 1, 2], ""y"": [""a"", ""b"", ""c"", ""d""]},
)

da[[0, 2, 2], [1, 3]]  # works

ind_x = xr.DataArray([0, 1], dims=[""x""])
ind_y = xr.DataArray([0, 1], dims=[""y""])

da[ind_x, ind_y]  # works

da[[0, 1], ind_x]  # doesn't work

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[157], line 1
----> 1 da[[0, 1], ind_x]

File ~/Documents/Work/Code/xarray/xarray/core/dataarray.py:859, in DataArray.__getitem__(self, key)
    856     return self._getitem_coord(key)
    857 else:
    858     # xarray-style array indexing
--> 859     return self.isel(indexers=self._item_key_to_dict(key))

File ~/Documents/Work/Code/xarray/xarray/core/dataarray.py:1472, in DataArray.isel(self, indexers, drop, missing_dims, **indexers_kwargs)
   1469 indexers = either_dict_or_kwargs(indexers, indexers_kwargs, ""isel"")
   1471 if any(is_fancy_indexer(idx) for idx in indexers.values()):
-> 1472     ds = self._to_temp_dataset()._isel_fancy(
   1473         indexers, drop=drop, missing_dims=missing_dims
   1474     )
   1475     return self._from_temp_dataset(ds)
   1477 # Much faster algorithm for when all indexers are ints, slices, one-dimensional
   1478 # lists, or zero or one-dimensional np.ndarray's

File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:3001, in Dataset._isel_fancy(self, indexers, drop, missing_dims)
   2997 var_indexers = {
   2998     k: v for k, v in valid_indexers.items() if k in var.dims
   2999 }
   3000 if var_indexers:
-> 3001     new_var = var.isel(indexers=var_indexers)
   3002     # drop scalar coordinates
   3003     # https://github.com/pydata/xarray/issues/6554
   3004     if name in self.coords and drop and new_var.ndim == 0:

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:1130, in Variable.isel(self, indexers, missing_dims, **indexers_kwargs)
   1127 indexers = drop_dims_from_indexers(indexers, self.dims, missing_dims)
   1129 key = tuple(indexers.get(dim, slice(None)) for dim in self.dims)
-> 1130 return self[key]

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:812, in Variable.__getitem__(self, key)
    799 """"""Return a new Variable object whose contents are consistent with
    800 getting the provided key from the underlying data.
    801 
   (...)
    809 array `x.values` directly.
    810 """"""
    811 dims, indexer, new_order = self._broadcast_indexes(key)
--> 812 data = as_indexable(self._data)[indexer]
    813 if new_order:
    814     data = np.moveaxis(data, range(len(new_order)), new_order)

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:1390, in ArrayApiIndexingAdapter.__getitem__(self, key)
   1388 else:
   1389     if isinstance(key, VectorizedIndexer):
-> 1390         raise TypeError(""Vectorized indexing is not supported"")
   1391     else:
   1392         raise TypeError(f""Unrecognized indexer: {key}"")

TypeError: Vectorized indexing is not supported
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

_No response_

### Anything else we need to know?

I don't really understand why the first two examples work but the last one doesn't...

### Environment

main branch of xarray, numpy 1.26.0","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8667/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1332231863,I_kwDOAMm_X85PaD63,6894,Public testing framework for duck array integration,35968931,open,0,,,8,2022-08-08T18:23:49Z,2024-01-25T04:04:11Z,,MEMBER,,,,"### What is your issue?

In #4972 @keewis started writing a public framework for testing the integration of any duck array class in xarray, inspired by the [testing framework pandas has for `ExtensionArrays`](https://pandas.pydata.org/docs/development/extending.html#testing-extension-arrays). This is a meta-issue for what our version of that framework for wrapping numpy-like duck arrays should look like.

(Feel free to edit / add to this)

### What behaviour should we test?

We have a lot of xarray methods to test with any type of duck array. Each of these bullets should correspond to one or more testing base classes which the duck array library author would inherit from. In rough order of increasing complexity:

- [x] **Constructors** - Including for `Variable` #6903 
- [x] **Properties** - checking that `.shape`, `.dtype` etc. exist on the wrapped array, see #4285 for example  #6903
- [x] **Reductions** - #4972 also uses parameters to automatically test many methods, and hypothesis to test each method for many different array instances.
- [ ] **Unary ops**
- [ ] **Binary ops**
- [ ] **Selection**
- [ ] **Computation**
- [ ] **Combining**
- [ ] **Groupby**
- [ ] **Rolling**
- [ ] **Coarsen**
- [ ] **Weighted**

We don't need to test that the array class obeys everything else in the [Array API Standard](https://data-apis.org/array-api/latest/API_specification/index.html). (For instance [`.device`](https://data-apis.org/array-api/latest/API_specification/generated/signatures.array_object.array.device.html) is probably never going to be used by xarray directly.) We instead assume that if the array class doesn't implement something in the API standard but all the generated tests pass, then all is well.

### How extensible does our testing framework need to be?

To be able to test any type of wrapped array our testing framework needs to itself be quite flexible.

- **User-defined checking** - For some arrays `np.testing.assert_equal` is not enough to guarantee correctness, so the user creating tests needs to specify additional checks. #4972 shows how to do this for checking the units of resulting pint arrays.
- **User-created data?** - Some array libraries might need to test array data that is invalid for numpy arrays. I'm thinking specifically of testing wrapping ragged arrays. #4285
- **Parallel computing frameworks?** - Related to the last point is chunked arrays. Here the strategy requires an extra `chunks` argument when the array is created, and any results need to first call `.compute()`. Testing parallel-executed arrays might also require pretty complicated `SetUps` and `TearDowns` in fixtures too. (see also #6807)

### What documentation / examples do we need?

All of this content should really go on a [dedicated page](https://docs.xarray.dev/en/stable/user-guide/duckarrays.html) in the docs, perhaps grouped alongside other ways of extending xarray.

- [ ] Motivation
- [ ] What subset of the Array API standard we expect duck array classes to define (could point to a typing protocol?)
- [ ] Explanation that the array type needs to return the same type for any numpy-like function which xarray might call upon that type (i.e. the set of duckarray instances is closed under numpy operations)
- [ ] Explanation of the different base classes
- [ ] Simple demo of testing a toy numpy-like array class
- [ ] Point to code testing more advanced examples we actually use (e.g. sparse, pint)
- [ ] Which advanced behaviours are optional (e.g. Constructors and Properties have to work, but Groupby is optional)

### Where should duck array compatibility testing eventually live?

Right now the tests for sparse & pint are going into the xarray repo, but presumably we don't want tests for every duck array type living in this repository. I suggest that we want to work towards eventually having **no array library-specific tests in this repository at all**. (Except numpy I guess.) Thanks @crusaderky for the [original suggestion](https://github.com/pydata/xarray/issues/4285#issuecomment-667637217).

Instead all tests involving pint could live in pint-xarray, all involving sparse could live in the sparse repository (or a new sparse-xarray repo), etc. etc. We would set those test jobs to re-run when xarray is released, and then xref any issues revealed here if needs be.

We should probably also move some of our existing tests https://github.com/pydata/xarray/pull/7023#pullrequestreview-1104932752","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6894/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1716228662,I_kwDOAMm_X85mS5I2,7848,Compatibility with the Array API standard ,35968931,open,0,,,4,2023-05-18T20:34:43Z,2024-01-25T04:03:42Z,,MEMBER,,,,"### What is your issue?

**Meta-issue to track all the smaller issues around making xarray and the array API standard compatible with each other.**

We've already had
- #6804
- #7067
- #7847

and there will likely be many others.

---

I suspect this might require changes to the standard as well as to xarray - in particular see [this list](https://github.com/data-apis/array-api/issues/187) of common numpy functions which are not currently in the array API standard. Of these xarray currently uses (FYI @ralfgommers ):

- `np.clip`
- `np.diff`
- `np.pad`
- `np.repeat`
- ~`np.take`~
- ~`np.tile`~","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7848/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2088695240,I_kwDOAMm_X858fvXI,8619,Docs sidebar is squished,35968931,open,0,,,9,2024-01-18T16:54:55Z,2024-01-23T18:38:38Z,,MEMBER,,,,"### What happened?

Since the v2024.01.0 release yesterday, there seems to be a rendering error in the website - the sidebar is squished up to the left:

<img width=""594"" alt=""Screenshot 2024-01-18 at 9 50 33 AM"" src=""https://github.com/pydata/xarray/assets/35968931/0821a95c-6672-4e6b-aace-74b505d8eb04"">","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8619/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,reopened,13221727,issue
1940536602,I_kwDOAMm_X85zqj0a,8298,cftime.DatetimeNoLeap incorrectly decoded from netCDF file,35968931,open,0,,,14,2023-10-12T18:13:53Z,2024-01-08T01:01:53Z,,MEMBER,,,,"### What happened?

I have been given a netCDF file (I think it's netCDF3) which when I open it does not decode the time variable in the way I expected it to. The time coordinate created is a numpy object array

![Screenshot from 2023-10-12 14-10-29](https://github.com/pydata/xarray/assets/35968931/78c9de00-ed80-481a-9849-d64faa2bb408)

### What did you expect to happen?

I expected it to automatically create a coordinate backed by a `CFTimeIndex` object, not a `CFTimeIndex` object wrapped inside another array type.

### Minimal Complete Verifiable Example

The original problematic file is 455MB (I can share it if necessary), but I can create a small netCDF file that displays the same issue.

```python
import cftime

time_values = [cftime.DatetimeNoLeap(347, 2, 1, 0, 0, 0, 0, has_year_zero=True)]
time_ds = xr.Dataset(coords={'time': (['time'], time_values)})
print(time_ds)
time_ds.to_netcdf('time_mwe.nc')
```
```
<xarray.Dataset>
Dimensions:  (time: 1)
Coordinates:
  * time     (time) object 0347-02-01 00:00:00
Data variables:
    *empty*
```
```python
ds = xr.open_dataset('time_mwe.nc', engine='netcdf4', decode_times=True, use_cftime=True)
print(ds)
```
```
<xarray.Dataset>
Dimensions:  (time: 1)
Coordinates:
  * time     (time) object 0347-02-01 00:00:00
Data variables:
    *empty*
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

_No response_

### Anything else we need to know?

_No response_

### Environment

```
cftime 1.6.2
netcdf4 1.6.4
xarray 2023.8.0
```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8298/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2038153739,I_kwDOAMm_X855e8IL,8545,map_blocks should dispatch to ChunkManager,35968931,open,0,,,5,2023-12-12T16:34:13Z,2023-12-22T16:47:27Z,,MEMBER,,,,"### Is your feature request related to a problem?

#7019 generalized most of xarrays internals to be able to use any chunked array type that we can create a `ChunkManagerEntrypoint` for. Most functions now go through this (e.g. `apply_ufunc`), but I did not redirect `xarray.map_blocks` to go through `ChunkManagerEntrypoint`.

This redirection works by dispatching to high-level dask.array primitives such as `dask.array.apply_gufunc`, `dask.array.blockwise`, and `dask.array.map_blocks`. However the current implementation of `xarray.map_blocks` is much lower-level, building a custom HLG, so it was not obvious how to swap it out.

### Describe the solution you'd like

I would like to either:

1) Replace the current internals of `xarray.map_blocks` with a simple call to `ChunkManagerEntrypoint.map_blocks`. This would be the cleanest separation of concerns we could do here. Presumably there is some obvious reason why this cannot or should not be done, but I have yet to understand what that reason is. (either @dcherian or @tomwhite can you enlighten me perhaps? 🙏) 

2) (More likely) refactor so that the existing guts of `xarray.map_blocks` are only called from the `ChunkManagerEntrypoint`, and a non-dask chunked array (i.e. cubed, but in theory other types too) would be able to specify how it wants to perform the map_blocks.

### Describe alternatives you've considered

Leaving it as the status quo breaks the nice abstraction and separation of concerns that #7019 introduced.

### Additional context

Split off from https://github.com/pydata/xarray/issues/8414","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8545/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
2027231531,I_kwDOAMm_X8541Rkr,8524,PR labeler bot broken and possibly dead,35968931,open,0,,,2,2023-12-05T22:23:44Z,2023-12-06T15:33:42Z,,MEMBER,,,,"### What is your issue?

The PR labeler bot seems to be broken

https://github.com/pydata/xarray/actions/runs/7107212418/job/19348227101?pr=8404

and even worse the repository has been archived!

https://github.com/andymckay/labeler

I actually like this bot, but unless a similar bot exists somewhere else I guess we should just delete this action 😞 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8524/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,reopened,13221727,issue
2019594436,I_kwDOAMm_X854YJDE,8496,"Dataset.dims should return a set, not a dict of sizes",35968931,open,0,,,8,2023-11-30T22:12:37Z,2023-12-02T03:10:14Z,,MEMBER,,,,"### What is your issue?

This is inconsistent:

```python
In [25]: ds
Out[25]: 
<xarray.Dataset>
Dimensions:  (x: 1, y: 2)
Dimensions without coordinates: x, y
Data variables:
    a        (x, y) int64 0 1

In [26]: ds['a'].dims
Out[26]: ('x', 'y')

In [27]: ds['a'].sizes
Out[27]: Frozen({'x': 1, 'y': 2})

In [28]: ds.dims
Out[28]: Frozen({'x': 1, 'y': 2})

In [29]: ds.sizes
Out[29]: Frozen({'x': 1, 'y': 2})
```

Surely `ds.dims` should return something like a `Frozenset({'x', 'y'})`? (because dimension order is meaningless when you have multiple arrays underneath - see https://github.com/pydata/xarray/issues/8498)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8496/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
552500673,MDU6SXNzdWU1NTI1MDA2NzM=,3709,Feature Proposal: `xarray.interactive` module,35968931,closed,0,,,36,2020-01-20T20:42:22Z,2023-10-27T18:24:49Z,2021-07-29T15:37:21Z,MEMBER,,,,"## Feature proposal: `xarray.interactive` module

I've been experimenting with [ipython widgets](https://github.com/jupyter-widgets/ipywidgets) in jupyter notebooks, and I've been [working on](https://github.com/TomNicholas/xarray-interactive) how we might use them to make xarray more interactive.

### Motivation:

For most users who are exploring their data, it will be common to find themselves rerunning the same cells repeatedly but with slightly different values.
In `xarray`'s case that will often be in an `.isel()` or `.sel()` call, or selecting variables from a dataset.
IPython widgets allow you to interact with your functions in a very intuitive way, which we could exploit.
There are lots of tutorials on how to interact with `pandas` data (e.g. [this great one](https://towardsdatascience.com/interactive-controls-for-jupyter-notebooks-f5c94829aee6)), but I haven't seen any for interacting with `xarray` objects.


### Relationship to other libraries:

Some downstream plotting libaries (such as @hvplot) [already use widgets](https://hvplot.holoviz.org/user_guide/Gridded_Data.html) when interactively plotting xarray-derived data structures, but they don't seem to go the full N dimensions.
This also isn't something that should be confined to plotting functions - you often choose slices or variables at the start of analysis, not just at the end.
I'll come back to this idea later.

The default ipython widgets are pretty good, but we could write an `xarray.interactive` module in such a way that downstream developers can easily replace them with [their own widgets](https://hvplot.holoviz.org/user_guide/Widgets.html).

### Usage examples:

```python
# imports
import ipywidgets as widgets
import xarray.plot as xplot
import xarray.interactive as interactive

# Load tutorial data
ds = xr.tutorial.open_dataset('air_temperature')['air']
```

Plotting against multiple dimensions interactively
```python
interactive.isel(da, xplot.plot, lat=10, lon=50)
```
![isel_lat_and_lon](https://user-images.githubusercontent.com/35968931/72755645-e632bb00-3bc2-11ea-8056-eb448e957bb0.gif)


Interactively select a range from a dimension
```python
def plot_mean_over_time(da):
    da.mean(dim=time)
interactive.isel(da, plot_mean_over_time, time=slice(100, 500))
```
![mean_over_time_slice](https://user-images.githubusercontent.com/35968931/72755638-e337ca80-3bc2-11ea-9d66-efb8dd0d4fca.gif)


Animate over one dimension
```python
from ipywidgets import Play
interactive.isel(da, xplot.plot, time=Play())
```
![Play](https://user-images.githubusercontent.com/35968931/72755630-de731680-3bc2-11ea-9d0f-46da96d6efda.gif)

### API ideas:

We can write a function like this

```python
interactive.isel(da, func=xplot.plot, time=10)
```

which could also be used as a decorator something like this
```python
@interactive.isel(da, time=10)
def plot(da)
    return xplot.plot(da)
```

It would be nicer to be able to do this
```python
@Interactive(da).isel(time=10)
def plot(da)
    return xplot.plot(da)
```
but [Guido forbade it](https://seriously.dontusethiscode.com/2013/04/21/lambda-decorators.html).

But we can attach these functions to an accessor to get
```python
da.interactive.isel(xplot.plot, time=10)
```

### Other ideas

Select variables from datasets
```python
@interactive.data_vars(da1=ds['n'], da2=ds['T'], ...)
def correlation(da1, da2, ...)
    ...

# Would produce a dropdown list of variables for each dataset
```

Choose dimensions to apply functions over
```python
@interactive.dims(dim='time')
def mean(da, dim)
    ...
    
# Would produce a dropdown list of dimensions in the dataarray
```

General `interactive.explore()` method to see variation over any number of dimensions, the default being all of them.

What do people think about this? Is it something that makes sense to include within xarray itself? (Dependencies aren't a problem because it's fine to have `ipywidgets` as an optional dependency just for this module.)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3709/reactions"", ""total_count"": 6, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 3, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1812811751,I_kwDOAMm_X85sDU_n,8008,"""Deep linking"" disparate documentation resources together",35968931,open,0,,,3,2023-07-19T22:18:55Z,2023-10-12T18:36:52Z,,MEMBER,,,,"### What is your issue?

Our docs have a general issue with having lots of related resources that are not necessarily linked together in a useful way. This results in users (including myself!) getting ""stuck"" in one part of the docs and being unaware of material that would help them solve their specific issue.

To give a concrete example, if a user wants to know about `coarsen`, there is relevant material:

- In the [coarsen class docstring](https://docs.xarray.dev/en/stable/generated/xarray.core.rolling.DatasetCoarsen.html#xarray.core.rolling.DatasetCoarsen)
- On the [reshaping page](https://docs.xarray.dev/en/stable/user-guide/reshaping.html#reshaping-via-coarsen)
- On the [computations page](https://docs.xarray.dev/en/stable/user-guide/computation.html#coarsen-large-arrays)
- On the [""how do I?"" page](https://docs.xarray.dev/en/stable/howdoi.html)
- On the [tutorial repository](https://tutorial.xarray.dev/fundamentals/03.3_windowed.html?highlight=coarsen#coarsening)

Different types of material are great, but only some of these resources are linked to others. `Coarsen` is actually pretty well covered overall, but for other functions there might be no useful linking at all, or no examples in the docstrings.

---

The biggest missed opportunity here is the way all the great content on the [tutorial.xarray.dev](https://tutorial.xarray.dev/) repository is not linked from anywhere on the main documentation site (I believe). To address that we could either (a) integrate the `tutorial.xarray.dev` material into the main site or (b) add a lot more cross-linking between the two sites.

Identifying sections that could be linked and adding links would be a great task for new contributors.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8008/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
602218021,MDU6SXNzdWU2MDIyMTgwMjE=,3980,Make subclassing easier?,35968931,open,0,,,9,2020-04-17T20:33:13Z,2023-10-04T16:27:28Z,,MEMBER,,,,"### Suggestion

We relatively regularly have [users](https://github.com/pydata/xarray/issues/728) [asking](https://github.com/pydata/xarray/issues/3959) [about](https://groups.google.com/forum/#!topic/xarray/wzprk6M-Mfg) [subclassing](https://github.com/pydata/xarray/issues/706) `DataArray` and `Dataset`, and I know of at least a few cases where people have [gone](https://github.com/pennmem/ptsa_new/blob/master/ptsa/data/timeseries.py) [through](https://github.com/pydata/xarray/issues/2176#issuecomment-391470885) with it. However we currently [explicitly discourage doing this](https://docs.xarray.dev/en/stable/internals/extending-xarray.html#composition-over-inheritance), on the basis that basically all operations will return a bare xarray object instead of the subclassed version, it's full of trip hazards, and we have the accessor interface to point people to instead.

However, while useful, the accessors aren't enough for some users, and I think we could probably do better. If we refactored internally we might be able to make it much easier to subclass.

### Example to follow in Pandas

Pandas takes an interesting approach: while they also explicitly discourage subclassing, they still try to make it easier, and [show you what you need to do](https://pandas.pydata.org/docs/development/extending.html#subclassing-pandas-data-structures)  in order for it to work.

They ask you to override some constructor properties with your own, and allow you to define your own original properties.

### Potential complications
- `.construct_dataarray` and `DataArray.__init__` are used a lot internally to reconstruct a DataArray from `dims`, `coords`, `data` etc. before returning the result of a method call. We would probably need to standardise this, before allowing users to override it.

- Pandas actually has multiple constructor properties you need to override: `_constructor`, `_constructor_sliced`, and `_constructor_expanddim`. What's the minimum set of similar constructors we would need?

- Blocking access to attributes - we current stop people from adding their own attributes quite aggressively, so that we can have attributes as an alias for variables and attrs, we would need to either relax this or better allow users to set a list of their own `_properties` which they want to register, similar to pandas.

- `__slots__` - I think something funky can happen if you inherit from a class that defines `__slots__`? 

### Documentation

I think if we do this we should also slightly refactor the relevant docs to make clear the distinction between 3 groups of people:
- **Users** - People who import and use xarray at the top-level with (ideally) no particular concern as to how it works. This is who the vast majority of the documentation is for.
- **Developers** - People who are actually improving and developing xarray upstream. This is who the [Contributing to xarray](http://xarray.pydata.org/en/stable/contributing.html) page is for.
- **Extenders** - People who want to subclass, accessorize or wrap xarray objects, in order to do something more complicated. These people are probably writing a domain-specific library which will then bring in a new set of users. There maybe aren't as many of these people, but they are really important IMO. This is implicitly who the [xarray internals](http://xarray.pydata.org/en/stable/internals.html#xarray-internals) page is aimed at, but it would be nice to make that distinction much more clear. It might also be nice to give them a guide as to ""I want to achieve X, should I use wrapping/subclassing/accessors?""

@max-sixty you had some ideas about what would need to be done for this to work?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3980/reactions"", ""total_count"": 11, ""+1"": 11, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
663235664,MDU6SXNzdWU2NjMyMzU2NjQ=,4243,Manually drop DataArray from memory?,35968931,closed,0,,,3,2020-07-21T18:54:40Z,2023-09-12T16:17:12Z,2023-09-12T16:17:12Z,MEMBER,,,,"Is it possible to deliberately drop data associated with a particular DataArray from memory?

Obviously `da.close()` exists, but what happens if you did for example
```python
ds = open_dataset(file)
da = ds[var]
da.compute()  # something that loads da into memory
da.close()  # is the memory freed up again now?
ds.something()  # what about now?
```

Also does calling python's built-in garbage collector (i.e. `gc.collect()`) do anything in this instance?

The context of this question is that I'm trying to resave some massive variables (~65GB each) that were loaded from thousands of files into just a few files for each variable. I would love to use @rabernat 's new [rechunker package](https://github.com/pangeo-data/rechunker) but I'm not sure how easily I can convert my current netCDF data to Zarr, and I'm interested in this question no matter how I end up solving the problem.

I don't currently have a particularly good understanding of file I/O and memory management in xarray, but would like to improve it. Can anyone recommend a tool I can use to answer this kind of question myself on my own machine? I suppose it would need to be able to tell me the current memory usage of specific objects, not just the total memory usage.

(@johnomotani you might be interested)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4243/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1812188730,I_kwDOAMm_X85sA846,8004,Rotation Functional Index example,35968931,open,0,,,2,2023-07-19T15:23:20Z,2023-08-24T13:26:56Z,,MEMBER,,,,"### Is your feature request related to a problem?

I'm trying to think of an example that would demonstrate the ""functional index"" pattern discussed in https://github.com/pydata/xarray/issues/3620.

I think a 2D rotation is the simplest example of an analytically-expressible, non-trivial, domain-agnostic case where you might want to back a set of multiple coordinates with a single functional index. It's also nice because there is additional information that must be passed and stored (the angle of the rotation), but that part is very simple, and domain-agnostic. I'm proposing we make this example work and put it in the custom index docs.

I had a go at making that example ([notebook here](https://gist.github.com/TomNicholas/daa15f71e38f07259c6c2e251d0fb38e)) @benbovy, but I'm confused about a couple of things:

1) How do I implement `.sel` in such a way that it supports indexing with slices (i.e. to crop my image)
2) How can I make this lazy?
3) Should the implementation be a ""MetaIndex"" (i.e. wrapping some pandas indexes)?

### Describe the solution you'd like

_No response_

### Describe alternatives you've considered

_No response_

### Additional context

This example is inspired by @jni's use case in napari, where (IIUC) they want to do a lazy functional affine transformation from pixel to physical coordinates, where the simplest example of such a transform might be a linear shear (caused by the imaging focal plane being at an angle to the physical sample).","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8004/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1801849622,I_kwDOAMm_X85rZgsW,7982,Use Meilisearch in our docs,35968931,closed,0,,,1,2023-07-12T22:29:45Z,2023-07-19T19:49:53Z,2023-07-19T19:49:53Z,MEMBER,,,,"### Is your feature request related to a problem?

Just saw this cool search thing for sphinx in a lightning talk at SciPy called Meilisearch 

Cc @dcherian 

### Describe the solution you'd like

Read about it here

https://sphinxdocs.ansys.com/version/stable/user_guide/options.html

### Describe alternatives you've considered

_No response_

### Additional context

_No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7982/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1807782455,I_kwDOAMm_X85rwJI3,7996,Stable docs build not showing latest changes after release,35968931,closed,0,,,3,2023-07-17T13:24:58Z,2023-07-17T20:48:19Z,2023-07-17T20:48:19Z,MEMBER,,,,"### What happened?

I released xarray version v2023.07.0 last night, but I'm not seeing changes to the documentation reflected in the [`https://docs.xarray.dev/en/stable/`](https://docs.xarray.dev/en/stable/) build. (In particular the Internals section now should have an entire extra page on wrapping chunked arrays.) I can however see the newest additions on [`https://docs.xarray.dev/en/latest/`](https://docs.xarray.dev/en/latest/) build. Is that how it's supposed to work?

### What did you expect to happen?

_No response_

### Minimal Complete Verifiable Example

_No response_

### MVCE confirmation

- [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
- [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

_No response_

### Anything else we need to know?

_No response_

### Environment

<details>


</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7996/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1742035781,I_kwDOAMm_X85n1VtF,7894,"Can a ""skipna"" argument be added for Dataset.integrate() and DataArray.integrate()?",35968931,open,0,,,2,2023-06-05T15:32:35Z,2023-06-05T21:59:45Z,,MEMBER,,,,"### Discussed in https://github.com/pydata/xarray/discussions/5283

<div type='discussions-op-text'>

<sup>Originally posted by **chfite** May  9, 2021</sup>
I am using the Dataset.integrate() function and noticed that because one of my variables has a NaN in it the function returns a NaN for the integrated value for that variable. I know based on the trapezoidal rule one could not get an integrated value at the location of the NaN, but is it not possible for it to calculate the integrated values where there were regular values?

Assuming 0 for NaNs does not work because it would still integrate between the values before and after 0 and add additional area I do not want. Using DataArray.dropna() also is not sufficient because it would assume the value before the NaN is then connected to the value after the NaN and again add additional area that I would not want included. 

If a ""skipna"" functionality or something could not be added to the integrate function, does anyone have a suggestion for another way to get around to calculating my integrated area while excluding the NaNs? </div>","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7894/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1308715638,I_kwDOAMm_X85OAWp2,6807,Alternative parallel execution frameworks in xarray,35968931,closed,0,,,12,2022-07-18T21:48:10Z,2023-05-18T17:34:33Z,2023-05-18T17:34:33Z,MEMBER,,,,"### Is your feature request related to a problem?

Since early on the project xarray has supported wrapping `dask.array` objects in a first-class manner. However recent work on flexible array wrapping has made it possible to wrap all sorts of array types (and with #6804 we should support wrapping any array that conforms to the [array API standard](https://data-apis.org/array-api/latest/index.html)).

Currently though the only way to parallelize array operations with xarray ""automatically"" is to use dask. (You could use [xarray-beam](https://github.com/google/xarray-beam) or other options too but they don't ""automatically"" generate the computation for you like dask does.)

When dask is the only type of parallel framework exposing an array-like API then there is no need for flexibility, but now we have nascent projects like [cubed](https://github.com/tomwhite/cubed) to consider too. @tomwhite 

### Describe the solution you'd like

Refactor the internals so that dask is one option among many, and that any newer options can plug in in an extensible way.

In particular cubed deliberately uses the same API as `dask.array`, exposing:
1) the methods needed to conform to the array API standard
2) a `.chunk` and `.compute` method, which we could dispatch to
3) dask-like functions to create computation graphs including [`blockwise`](https://github.com/tomwhite/cubed/blob/400dc9adcf21c8b468fce9f24e8d4b8cb9ef2f11/cubed/core/ops.py#L43), [`map_blocks`](https://github.com/tomwhite/cubed/blob/400dc9adcf21c8b468fce9f24e8d4b8cb9ef2f11/cubed/core/ops.py#L221), and [`rechunk`](https://github.com/tomwhite/cubed/blob/main/cubed/primitive/rechunk.py)

I would like to see xarray able to wrap any array-like object which offers this set of methods / functions, and call the corresponding version of that method for the correct library (i.e. dask vs cubed) automatically.

That way users could try different parallel execution frameworks simply via a switch like 
```python
ds.chunk(**chunk_pattern, manager=""dask"")
```
and see which one works best for their particular problem.

### Describe alternatives you've considered

If we leave it the way it is now then xarray will not be truly flexible in this respect.

Any library can wrap (or subclass if they are really brave) xarray objects to provide parallelism but that's not the same level of flexibility.

### Additional context

[cubed repo](https://github.com/tomwhite/cubed)

[PR](https://github.com/pydata/xarray/pull/6804) about making xarray able to wrap objects conforming to the new [array API standard](https://data-apis.org/array-api/latest/index.html)

cc @shoyer @rabernat @dcherian @keewis ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6807/reactions"", ""total_count"": 6, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 3, ""rocket"": 2, ""eyes"": 1}",,completed,13221727,issue
1694956396,I_kwDOAMm_X85lBvts,7813,Task naming for general chunkmanagers,35968931,open,0,,,3,2023-05-03T22:56:46Z,2023-05-05T10:30:39Z,,MEMBER,,,,"### What is your issue?

(Follow-up to #7019)

When you create a dask graph of xarray operations, the tasks in the graph get useful names according the name of the DataArray they operate on, or whether they represent an `open_dataset` call.

Currently for cubed this doesn't work, for example this graph from https://github.com/pangeo-data/distributed-array-examples/issues/2#issuecomment-1533852877:

![image](https://user-images.githubusercontent.com/35968931/236056613-48f3925a-8aa6-418c-b204-1a57b612ff93.png)

cc @tomwhite @dcherian 

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7813/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1468534020,I_kwDOAMm_X85XiA0E,7333,FacetGrid with coords error,35968931,open,0,,,1,2022-11-29T18:42:48Z,2023-04-03T10:12:40Z,,MEMBER,,,,"There may perhaps be a small bug anyway, as DataArrays with and without coords are handled differently. Contrast:

```
da=xr.DataArray(data=np.random.randn(2,2,2,10,10),coords={'A':['a1','a2'],'B':[0,1],'C':[0,1],'X':range(10),'Y':range(10)})

p=da.sel(A='a1').plot.contour(col='B',row='C') 
try:
    p.map_dataarray(xr.plot.pcolormesh, y=""B"", x=""C"");
except Exception as e:
    print('An uninformative error:')
    print(e)
```
```
An uninformative error:
tuple index out of range

```

with:

```
da=xr.DataArray(data=np.random.randn(2,2,2,10,10))

p=da.sel(dim_0=0).plot.contour(col='dim_1',row='dim_2') 
try:
    p.map_dataarray(xr.plot.pcolormesh, y=""dim_1"", x=""dim_2"");
except Exception as e:
    print('A more informative error:')
    print(e)
```


```
A more informative error:
x must be one of None, 'dim_3', 'dim_4'

```

_Originally posted by @joshdorrington in https://github.com/pydata/xarray/discussions/7310#discussioncomment-4257643_
    ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7333/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1188523721,I_kwDOAMm_X85G127J,6431,Bug when padding coordinates with NaNs,35968931,open,0,,,2,2022-03-31T18:57:16Z,2023-03-30T13:33:10Z,,MEMBER,,,,"### What happened?

```python
da = xr.DataArray(np.arange(9), dim='x')
da.pad({'x': (0, 1)}, 'constant', constant_values=np.NAN)
```
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Input In [12], in <cell line: 1>()
----> 1 da.pad({'x': 1}, 'constant', constant_values=np.NAN)

File ~/Documents/Work/Code/xarray/xarray/core/dataarray.py:4158, in DataArray.pad(self, pad_width, mode, stat_length, constant_values, end_values, reflect_type, **pad_width_kwargs)
   4000 def pad(
   4001     self,
   4002     pad_width: Mapping[Any, int | tuple[int, int]] | None = None,
   (...)
   4012     **pad_width_kwargs: Any,
   4013 ) -> DataArray:
   4014     """"""Pad this array along one or more dimensions.
   4015 
   4016     .. warning::
   (...)
   4156         z        (x) float64 nan 100.0 200.0 nan
   4157     """"""
-> 4158     ds = self._to_temp_dataset().pad(
   4159         pad_width=pad_width,
   4160         mode=mode,
   4161         stat_length=stat_length,
   4162         constant_values=constant_values,
   4163         end_values=end_values,
   4164         reflect_type=reflect_type,
   4165         **pad_width_kwargs,
   4166     )
   4167     return self._from_temp_dataset(ds)

File ~/Documents/Work/Code/xarray/xarray/core/dataset.py:7368, in Dataset.pad(self, pad_width, mode, stat_length, constant_values, end_values, reflect_type, **pad_width_kwargs)
   7366     variables[name] = var
   7367 elif name in self.data_vars:
-> 7368     variables[name] = var.pad(
   7369         pad_width=var_pad_width,
   7370         mode=mode,
   7371         stat_length=stat_length,
   7372         constant_values=constant_values,
   7373         end_values=end_values,
   7374         reflect_type=reflect_type,
   7375     )
   7376 else:
   7377     variables[name] = var.pad(
   7378         pad_width=var_pad_width,
   7379         mode=coord_pad_mode,
   7380         **coord_pad_options,  # type: ignore[arg-type]
   7381     )

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:1360, in Variable.pad(self, pad_width, mode, stat_length, constant_values, end_values, reflect_type, **pad_width_kwargs)
   1357 if reflect_type is not None:
   1358     pad_option_kwargs[""reflect_type""] = reflect_type  # type: ignore[assignment]
-> 1360 array = np.pad(  # type: ignore[call-overload]
   1361     self.data.astype(dtype, copy=False),
   1362     pad_width_by_index,
   1363     mode=mode,
   1364     **pad_option_kwargs,
   1365 )
   1367 return type(self)(self.dims, array)

File <__array_function__ internals>:5, in pad(*args, **kwargs)

File ~/miniconda3/envs/py39/lib/python3.9/site-packages/numpy/lib/arraypad.py:803, in pad(array, pad_width, mode, **kwargs)
    801     for axis, width_pair, value_pair in zip(axes, pad_width, values):
    802         roi = _view_roi(padded, original_area_slice, axis)
--> 803         _set_pad_area(roi, axis, width_pair, value_pair)
    805 elif mode == ""empty"":
    806     pass  # Do nothing as _pad_simple already returned the correct result

File ~/miniconda3/envs/py39/lib/python3.9/site-packages/numpy/lib/arraypad.py:147, in _set_pad_area(padded, axis, width_pair, value_pair)
    130 """"""
    131 Set empty-padded area in given dimension.
    132 
   (...)
    144     broadcastable to the shape of `arr`.
    145 """"""
    146 left_slice = _slice_at_axis(slice(None, width_pair[0]), axis)
--> 147 padded[left_slice] = value_pair[0]
    149 right_slice = _slice_at_axis(
    150     slice(padded.shape[axis] - width_pair[1], None), axis)
    151 padded[right_slice] = value_pair[1]

ValueError: cannot convert float NaN to integer
```

### What did you expect to happen?

It should have successfully padded with a NaN, same as it does if you don't specify `constant_values`:

```python
In [14]: da.pad({'x': (0, 1)}, 'constant')
Out[14]: 
<xarray.DataArray (x: 3)>
array([ 0.,  1., nan])
Dimensions without coordinates: x
```


### Minimal Complete Verifiable Example

_No response_

### Relevant log output

_No response_

### Anything else we need to know?

_No response_

### Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46) 
[GCC 9.4.0]
python-bits: 64
OS: Linux
OS-release: 5.11.0-7620-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 0.20.3.dev4+gdbc02d4e
pandas: 1.4.0
numpy: 1.21.4
scipy: 1.7.3
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.10.3
cftime: 1.5.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.01.1
distributed: 2022.01.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.01.0
cupy: None
pint: None
sparse: None
setuptools: 59.6.0
pip: 21.3.1
conda: 4.11.0
pytest: 6.2.5
IPython: 8.2.0
sphinx: 4.4.0","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6431/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1588461863,I_kwDOAMm_X85ergEn,7539,Concat doesn't concatenate dimension coordinates along new dims,35968931,open,0,,,4,2023-02-16T22:32:33Z,2023-02-21T19:07:48Z,,MEMBER,,,,"### What is your issue?

`xr.concat` doesn't concatenate dimension coordinates along new dimensions, which leads to pretty unintuitive behavior.

Take this example (motivated by https://github.com/pydata/xarray/discussions/7532#discussioncomment-4988792)
```python
segments = []
for i in range(2):
    time = np.sort(np.random.random(4))
    da = xr.DataArray(
        np.random.randn(4,2),
        dims=[""time"", ""cols""],
        coords=dict(time=('time', time), cols=[""col1"", ""col2""]),
        )
    segments.append(da)
```
```python
In [86]: segments
Out[86]: 
[<xarray.DataArray (time: 4, cols: 2)>
 array([[-0.61199576, -0.9012078 ],
        [-0.54187577,  1.30509994],
        [-3.53720471,  0.97607797],
        [ 0.2593455 ,  0.95920031]])
 Coordinates:
   * time     (time) float64 0.1048 0.168 0.869 0.9432
   * cols     (cols) <U4 'col1' 'col2',
 <xarray.DataArray (time: 4, cols: 2)>
 array([[ 0.90266408, -0.54294821],
        [-1.09087103, -0.17484417],
        [-0.21679558, -0.57377412],
        [ 0.07570151,  0.27433728]])
 Coordinates:
   * time     (time) float64 0.03627 0.09754 0.2434 0.592
   * cols     (cols) <U4 'col1' 'col2']
```
```python
In [85]: xr.concat(segments, dim='new')
Out[85]: 
<xarray.DataArray (new: 2, time: 8, cols: 2)>
array([[[        nan,         nan],
        [        nan,         nan],
        [-0.61199576, -0.9012078 ],
        [-0.54187577,  1.30509994],
        [        nan,         nan],
        [        nan,         nan],
        [-3.53720471,  0.97607797],
        [ 0.2593455 ,  0.95920031]],

       [[ 0.90266408, -0.54294821],
        [-1.09087103, -0.17484417],
        [        nan,         nan],
        [        nan,         nan],
        [-0.21679558, -0.57377412],
        [ 0.07570151,  0.27433728],
        [        nan,         nan],
        [        nan,         nan]]])
Coordinates:
  * time     (time) float64 0.03627 0.09754 0.1048 0.168 ... 0.592 0.869 0.9432
  * cols     (cols) <U4 'col1' 'col2'
Dimensions without coordinates: new
```

I would have expected to get a result of size `{new: 2, time: 4, cols: 2}`. That would be intuitive, because the default is `coords='different'`, and that would be the result of concatenating each `time` coordinate (which have different values) and just propagating the `cols` coordinate (as they have the same values).

Instead what happened is that `xr.concat` treats the dimension coordinates as indexes to align, and defaults to an outer join. [This auto-alignment behaviour has been discussed at length before](https://github.com/pydata/xarray/issues/7045), I'm just trying to point out another place in which its problematic.

This is kind of briefly mentioned in the concat docstring under `coords='all'`:
```
“all”: All coordinate variables will be concatenated, except those corresponding to other dimensions.
```
but it's not even mentioned under `coords='different'`

I don't really know what I would prefer to happen with the coordinates. I guess to have created a `time` coordinate of size `{new: 2, time: 4, cols: 2}`, but then I don't know what that implies for the underlying index. @benbovy do you have any thoughts?

At the very least we should make this a lot clearer in the docs.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7539/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1585231355,I_kwDOAMm_X85efLX7,7533,Numpy to xarray docs,35968931,open,0,,,0,2023-02-15T05:13:50Z,2023-02-15T06:28:05Z,,MEMBER,,,,"**We should make a docs page specifically to ease the transition from pure-numpy to xarray.**

A lot of new xarray users come from already using numpy as their primary data structure. We relatively often get questions about ""what's the xarray equivalent of X numpy function"" but we don't have a dedicated place to collect those answers, or explain key conceptual differences.

I think this deserves its own dedicated docs page, with:
- [ ] High-level conceptual differences (e.g. transpose invariance)
- [ ] Arguments for the benefits of using xarray over pure numpy
- [ ] Table of numpy <-> xarray function equivalents (similar to the existing ""How do I..."" page)
- [ ] Other common recommendations for numpy users (e.g. use netCDF / Zarr instead of `.npz` or pickle to store data on disk)

For the table I thought of a few already, but I know there will be a lot more:

- `np.concatenate`/`np.vstack`/`np.hstack`/`np.stack` → `xr.concat`
- `np.block` → `xr.combine_nested`
- `np.apply_along_axis` → `xr.apply_ufunc`
- `np.polynomial` → `xr.polyfit`
- `np.reshape` -> `xr.coarsen().construct()`","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7533/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1549861293,I_kwDOAMm_X85cYQGt,7459,Error when broadcast given int,35968931,open,0,,,0,2023-01-19T19:59:31Z,2023-01-19T21:11:12Z,,MEMBER,,,,"### What happened?

Unhelpful error raised by `xr.broadcast` when supplied with an int.

### What did you expect to happen?

The broadcast to succeed I think?

### Minimal Complete Verifiable Example

```Python
In [1]: import xarray as xr

In [2]: da = xr.DataArray([5, 4], dims='x')

In [3]: xr.broadcast(da, 1)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[3], line 1
----> 1 xr.broadcast(da, 1)

File ~/miniconda3/envs/xrdev3.9/lib/python3.9/site-packages/xarray/core/alignment.py:1049, in broadcast(exclude, *args)
   1047 if exclude is None:
   1048     exclude = set()
-> 1049 args = align(*args, join=""outer"", copy=False, exclude=exclude)
   1051 dims_map, common_coords = _get_broadcast_dims_map_common_coords(args, exclude)
   1052 result = [_broadcast_helper(arg, exclude, dims_map, common_coords) for arg in args]

File ~/miniconda3/envs/xrdev3.9/lib/python3.9/site-packages/xarray/core/alignment.py:772, in align(join, copy, indexes, exclude, fill_value, *objects)
    576 """"""
    577 Given any number of Dataset and/or DataArray objects, returns new
    578 objects with aligned indexes and dimension sizes.
   (...)
    762 
    763 """"""
    764 aligner = Aligner(
    765     objects,
    766     join=join,
   (...)
    770     fill_value=fill_value,
    771 )
--> 772 aligner.align()
    773 return aligner.results

File ~/miniconda3/envs/xrdev3.9/lib/python3.9/site-packages/xarray/core/alignment.py:556, in Aligner.align(self)
    553     self.results = (obj.copy(deep=self.copy),)
    554     return
--> 556 self.find_matching_indexes()
    557 self.find_matching_unindexed_dims()
    558 self.assert_no_index_conflict()

File ~/miniconda3/envs/xrdev3.9/lib/python3.9/site-packages/xarray/core/alignment.py:262, in Aligner.find_matching_indexes(self)
    259 objects_matching_indexes = []
    261 for obj in self.objects:
--> 262     obj_indexes, obj_index_vars = self._normalize_indexes(obj.xindexes)
    263     objects_matching_indexes.append(obj_indexes)
    264     for key, idx in obj_indexes.items():

AttributeError: 'int' object has no attribute 'xindexes'
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

_No response_

### Anything else we need to know?

This clearly has something to do with a change in the flexible indexes refactor, as it complains about `.xindexes` not being present. @benbovy 

### Environment

The `main` branch","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7459/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1536556849,I_kwDOAMm_X85blf8x,7447,Add Align to terminology page,35968931,open,0,,,0,2023-01-17T15:15:16Z,2023-01-17T15:15:16Z,,MEMBER,,,,"### Is your feature request related to a problem?

The terminology docs page mostly contains explanation of available classes. It should also contain explanation of words we use to describe relationships between those classes. 

For example the docstring on `xr.align` just says ""Given any number of Dataset and/or DataArray objects, returns new objects with aligned indexes and dimension sizes."", but there is no link given to a definition of what we mean by ""aligned"".

### Describe the solution you'd like

_No response_

### Describe alternatives you've considered

_No response_

### Additional context

_No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7447/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1512290017,I_kwDOAMm_X85aI7bh,7403,Zarr error when trying to overwrite part of existing store,35968931,open,0,,,3,2022-12-28T00:40:16Z,2023-01-11T21:26:10Z,,MEMBER,,,,"### What happened?

`to_zarr` threw an error when I tried to overwrite part of an existing zarr store.

### What did you expect to happen?

With mode `w` I was expecting it to overwrite part of the store with no complaints. 

I expected that because that's what the docstring of `to_zarr` says:

> `mode ({""w"", ""w-"", ""a"", ""r+"", None}, optional)` – Persistence mode: “w” means create (overwrite if exists); “w-” means create (fail if exists); “a” means override existing variables (create if does not exist);

The default mode is ""w"", so I was expecting it to overwrite.

### Minimal Complete Verifiable Example

```Python
import xarray as xr
import numpy as np
np.random.seed(0)

ds = xr.Dataset()
ds[""data""] = (['x', 'y'], np.random.random((100,100)))
ds.to_zarr(""test.zarr"")
print(ds[""data""].mean().compute())
# returns array(0.49645889) as expected

ds = xr.open_dataset(""test.zarr"", engine='zarr', chunks={})
ds[""data""].mean().compute()
print(ds[""data""].mean().compute())
# still returns array(0.49645889) as expected

ds.to_zarr(""test.zarr"", mode=""a"")
```

```python
<xarray.DataArray 'data' ()>
array(0.49645889)
<xarray.DataArray 'data' ()>
array(0.49645889)
Traceback (most recent call last):
  File ""/home/tom/Documents/Work/Code/experimentation/bugs/datatree_nans/mwe_xarray.py"", line 16, in <module>
    ds.to_zarr(""test.zarr"")
  File ""/home/tom/miniconda3/envs/xrdev3.9/lib/python3.9/site-packages/xarray/core/dataset.py"", line 2091, in to_zarr
    return to_zarr(  # type: ignore
  File ""/home/tom/miniconda3/envs/xrdev3.9/lib/python3.9/site-packages/xarray/backends/api.py"", line 1628, in to_zarr
    zstore = backends.ZarrStore.open_group(
  File ""/home/tom/miniconda3/envs/xrdev3.9/lib/python3.9/site-packages/xarray/backends/zarr.py"", line 420, in open_group
    zarr_group = zarr.open_group(store, **open_kwargs)
  File ""/home/tom/miniconda3/envs/xrdev3.9/lib/python3.9/site-packages/zarr/hierarchy.py"", line 1389, in open_group
    raise ContainsGroupError(path)
zarr.errors.ContainsGroupError: path '' contains a group
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

_No response_

### Anything else we need to know?

I would like to know what the intended result is supposed to be here, so that I can make sure datatree behaves the same way, see https://github.com/xarray-contrib/datatree/issues/168.

### Environment

Main branch of xarray, zarr v2.13.3","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7403/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1426383543,I_kwDOAMm_X85VBOK3,7232,ds.Coarsen.construct demotes non-dimensional coordinates to variables,35968931,closed,0,,,0,2022-10-27T23:39:32Z,2022-10-28T17:46:51Z,2022-10-28T17:46:51Z,MEMBER,,,,"### What happened?

`ds.Coarsen.construct` demotes non-dimensional coordinates to variables

### What did you expect to happen?

All variables that were coordinates before the coarsen.construct stay as coordinates afterwards.

### Minimal Complete Verifiable Example

```Python
In [3]: da = xr.DataArray(np.arange(24), dims=[""time""])
   ...: da = da.assign_coords(day=365 * da)
   ...: ds = da.to_dataset(name=""T"")

In [4]: ds
Out[4]: 
<xarray.Dataset>
Dimensions:  (time: 24)
Coordinates:
    day      (time) int64 0 365 730 1095 1460 1825 ... 6935 7300 7665 8030 8395
Dimensions without coordinates: time
Data variables:
    T        (time) int64 0 1 2 3 4 5 6 7 8 9 ... 14 15 16 17 18 19 20 21 22 23

In [5]: ds.coarsen(time=12).construct(time=(""year"", ""month""))
Out[5]: 
<xarray.Dataset>
Dimensions:  (year: 2, month: 12)
Coordinates:
    day      (year, month) int64 0 365 730 1095 1460 ... 7300 7665 8030 8395
Dimensions without coordinates: year, month
Data variables:
    T        (year, month) int64 0 1 2 3 4 5 6 7 8 ... 16 17 18 19 20 21 22 23
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

_No response_

### Anything else we need to know?

_No response_

### Environment

`main`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7232/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1424215477,I_kwDOAMm_X85U4821,7227,Typing with Variadic Generics in python 3.11 (PEP 646),35968931,open,0,,,5,2022-10-26T15:03:01Z,2022-10-26T21:50:02Z,,MEMBER,,,,"### What is your issue?

I just saw this [new typing feature](https://peps.python.org/pep-0646/) in python 3.11, and I'm wondering whether / where we could usefully use this? The feature is parametrizing `Generics` with arbitrary numbers of `TypeVars`, which allows you to have `Array` types whose static typing behaviour is a function of their `shape`. (But we could possibly use it for a tuple of `dims` too...) We might use it to do things like:

- Specify that a function expects an array of a certain dimensionality 
- Overload methods based on the array dimensionality (e.g. `.plot` for 1D vs 2D arrays)
- (If they implement [Shape Arithmetic](https://peps.python.org/pep-0646/#shape-arithmetic)) Type hint how certain methods will change the output shape?

@headtr1ck @max-sixty any thoughts?

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7227/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1372035441,I_kwDOAMm_X85Rx5lx,7031,Periodic Boundary Index,35968931,open,0,,,14,2022-09-13T21:39:40Z,2022-09-16T10:50:10Z,,MEMBER,,,,"### What is your issue?

I would like to create a `PeriodicBoundaryIndex` using the Explicit Indexes refactor. I want to do it first in 1D, then 2D, then maybe ND.

I'm thinking this would be useful for:
1) Geoscientists with periodic longitudes
2) Any scientists with periodic domains
3) Road-testing the refactor + how easy the documentation is to follow.

Eventually I think perhaps this index should live in xarray itself? As it's domain-agnostic, doesn't introduce extra dependencies, and could be a conceptually simple example of a custom index.

I had a first go, using the [`benbovy:add-set-xindex-and-drop-indexes`](https://github.com/pydata/xarray/pull/6971) branch, and reading the [in-progress docs page](https://github.com/pydata/xarray/blob/2f9a4b35de8b393ebd0370eb98ba6e81dcaa7cf0/doc/internals/how-to-create-custom-index.rst). I got a bit stuck early on though.

@benbovy here's what I have so far:

```python
import numpy as np
import pandas as pd
import xarray as xr
from xarray.core.variable import Variable
from xarray.core.indexes import PandasIndex, is_scalar

from typing import Union, Mapping, Any


class PeriodicBoundaryIndex(PandasIndex):
    """"""
    An index representing any 1D periodic numberline.
    
    Implementation subclasses a normal xarray PandasIndex object but intercepts indexer queries.
    """"""
        
    def _periodic_subset(self, indxr: Union[int, slice, np.ndarray]) -> pd.Index:
        """"""Equivalent of __getitem__ for a pd.Index, but respects periodicity.""""""
        
        length = len(self)
        
        if isinstance(indxr, int):
            return self.index[indxr % length]
        elif isinstance(indxr, slice):
            raise NotImplementedError()
        elif isinstance(indxr, np.ndarray):
            raise NotImplementedError()
        else:
            raise TypeError    
    
    def isel(
        self, indexers: Mapping[Any, Union[int, slice, np.ndarray, Variable]]
    ) -> Union[""PeriodicBoundaryIndex"", None]:

        print(""isel called"")

        indxr = indexers[self.dim]
        if isinstance(indxr, Variable):
            if indxr.dims != (self.dim,):
                # can't preserve a index if result has new dimensions
                return None
            else:
                indxr = indxr.data
        if not isinstance(indxr, slice) and is_scalar(indxr):
            # scalar indexer: drop index
            return None

        subsetted_index = self._periodic_subset[indxr]
        return self._replace(subsetted_index)
```

```python
airtemps = xr.tutorial.open_dataset(""air_temperature"")['air']

da = airtemps.drop_indexes(""lon"")

world = da.set_xindex(""lon"", index_cls=PeriodicBoundaryIndex)
```


Now selecting a value with isel inside the range works fine, giving the same result same as without my custom index. (The length of the example dataset along `lon` is `53`.)

```python
world.isel(lon=45)
```

```
isel called
<xarray.DataArray 'air' (time: 2920, lat: 25)>
...
```

But indexing with a `lon` value outside the range of the index data gives an `IndexError`, seemingly without consulting my new index object. It didn't even print `""isel called""` :confused:  What should I have implemented that I didn't implement?


```python
world.isel(lon=55)
```

```python
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Input In [35], in <cell line: 1>()
----> 1 world.isel(lon=55)

File ~/Documents/Work/Code/xarray/xarray/core/dataarray.py:1297, in DataArray.isel(self, indexers, drop, missing_dims, **indexers_kwargs)
   1292     return self._from_temp_dataset(ds)
   1294 # Much faster algorithm for when all indexers are ints, slices, one-dimensional
   1295 # lists, or zero or one-dimensional np.ndarray's
-> 1297 variable = self._variable.isel(indexers, missing_dims=missing_dims)
   1298 indexes, index_variables = isel_indexes(self.xindexes, indexers)
   1300 coords = {}

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:1233, in Variable.isel(self, indexers, missing_dims, **indexers_kwargs)
   1230 indexers = drop_dims_from_indexers(indexers, self.dims, missing_dims)
   1232 key = tuple(indexers.get(dim, slice(None)) for dim in self.dims)
-> 1233 return self[key]

File ~/Documents/Work/Code/xarray/xarray/core/variable.py:793, in Variable.__getitem__(self, key)
    780 """"""Return a new Variable object whose contents are consistent with
    781 getting the provided key from the underlying data.
    782 
   (...)
    790 array `x.values` directly.
    791 """"""
    792 dims, indexer, new_order = self._broadcast_indexes(key)
--> 793 data = as_indexable(self._data)[indexer]
    794 if new_order:
    795     data = np.moveaxis(data, range(len(new_order)), new_order)

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:657, in MemoryCachedArray.__getitem__(self, key)
    656 def __getitem__(self, key):
--> 657     return type(self)(_wrap_numpy_scalars(self.array[key]))

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:626, in CopyOnWriteArray.__getitem__(self, key)
    625 def __getitem__(self, key):
--> 626     return type(self)(_wrap_numpy_scalars(self.array[key]))

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:533, in LazilyIndexedArray.__getitem__(self, indexer)
    531     array = LazilyVectorizedIndexedArray(self.array, self.key)
    532     return array[indexer]
--> 533 return type(self)(self.array, self._updated_key(indexer))

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:505, in LazilyIndexedArray._updated_key(self, new_key)
    503         full_key.append(k)
    504     else:
--> 505         full_key.append(_index_indexer_1d(k, next(iter_new_key), size))
    506 full_key = tuple(full_key)
    508 if all(isinstance(k, integer_types + (slice,)) for k in full_key):

File ~/Documents/Work/Code/xarray/xarray/core/indexing.py:278, in _index_indexer_1d(old_indexer, applied_indexer, size)
    276         indexer = slice_slice(old_indexer, applied_indexer, size)
    277     else:
--> 278         indexer = _expand_slice(old_indexer, size)[applied_indexer]
    279 else:
    280     indexer = old_indexer[applied_indexer]

IndexError: index 55 is out of bounds for axis 0 with size 53
```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7031/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1366657155,I_kwDOAMm_X85RdYiD,7010,Use sphinx-codeautolink in docs?,35968931,open,0,,,4,2022-09-08T16:35:52Z,2022-09-14T20:20:08Z,,MEMBER,,,,"> I'm a big fan of [sphinx-codeautolink](https://pypi.org/project/sphinx-codeautolink/) 🙂

_Originally posted by @Zac-HD in https://github.com/pydata/xarray/pull/6908#discussion_r963290657_

This looks cool, lets add it!","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7010/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1307212158,I_kwDOAMm_X85N6nl-,6801,Use Papyri to explore documentation,35968931,open,0,,,0,2022-07-17T21:21:21Z,2022-09-12T18:35:21Z,,MEMBER,,,,"### What is your issue?

At Scipy @Carreau demo'ed a new docs engine: [Papyri](https://github.com/jupyter/papyri). (You can find the [talk slides here](https://github.com/jupyter/papyri/issues/166)).

In short it looks awesome, and we should use it to improve our docs!

You should watch the talk, but Papyri allows:

- bidirectional crosslinking across libraries,
- navigation,
- proper reflow of user docstrings text,
- proper reflow of inline images (when rendered to html),
- proper math rendering (both in terminal and html),
and more.

There is also a [jupyter-lab extension](https://github.com/carreau/papyri-lab) in the works.

One of the examples in the talk uses xarray docs, as papyri builds from our `.rst` files.

Here I have ""ingested"" both xarray and numpy docs, which papyri's explorer dynamically links together in both directions.

![papyri_demo](https://user-images.githubusercontent.com/35968931/179425012-39db049f-96b3-4f68-98d5-ce42ae0989a8.gif)

I think this is super cool, and we should think about using it. However the project is extremely early stage, and currently has many bugs, and no unified way to ship it (the example was made locally).

I encourage other xarray devs to have a look and a think about how we can use it / benefit / test it out though!
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6801/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1337337135,I_kwDOAMm_X85PtiUv,6911,Public hypothesis strategies for generating xarray data,35968931,open,0,,,0,2022-08-12T15:17:40Z,2022-08-12T17:46:48Z,,MEMBER,,,,"### Proposal

We should expose a public set of hypothesis strategies for use in testing xarray code. It could be useful for downstream users, but also for our own internal test suite. It should live in `xarray.testing.strategies`. Specifically perhaps

- `xarray.testing.strategies.variables`
- `xarray.testing.strategies.dataarrays`
- `xarray.testing.strategies.datasets`
- (`xarray.testing.strategies.datatrees` ?)
- `xarray.testing.strategies.indexes`
- `xarray.testing.strategies.chunksizes` following [`dask.array.testing.strategies.chunks`](https://github.com/dask/dask/pull/9374)

This issue is different from #1846 because that issue describes how we could use such strategies in our own testing code, whereas this issue is for how we create general strategies that we could use in many places (including exposing publicly).

I've become interested in this as part of wanting to see #6894 happen. #6908 would effectively close this issue, but itself is just a pulled out section of all the work @keewis did in #4972.

(Also xref https://github.com/pydata/xarray/issues/2686. Also also @max-sixty didn't you have an issue somewhere about creating better and public test fixtures?)

---

### Previous work

I was pretty surprised to see this comment by @Zac-HD in #1846

> @rdturnermtl wrote [a Hypothesis extension for Xarray](https://github.com/uber/hypothesis-gufunc/blob/master/hypothesis_gufunc/extra/xr.py), which is at least a nice demo of what's possible.

given that we might have just used that instead of writing new ones in #4972! (@keewis had you already seen that extension?)

We could literally just include that extension in xarray and call this issue solved...

---

### Shrinking performance of strategies

However I was also reading about [strategies that shrink](https://github.com/HypothesisWorks/hypothesis/blob/master/guides/strategies-that-shrink.rst) yesterday and think that we should try to make some effort to come up with strategies for producing xarray objects that shrink in a performant and well-motivated manner. In particular by pooling the knowledge of the @xarray-dev core team we could try to create strategies that search for many of the edge cases that we are collectively aware of.

My understanding of that guide is that our strategies ideally should:

1) **Quickly include or exclude complexity**
    
    For instance `if draw(booleans()): # then add coordinates to generated dataset`.

    It might also be nice to have strategy constructors which allow passing other strategies in, so the user can choose how much complexity they want their strategy to generate. e.g. I think a signature like this should be possible

    ```python
    from hypothesis import strategies as st
 
    @st.composite
    def dataarrays(
        data: xr.Variable | st.SearchStrategy[xr.Variable] | duckarray | st.SearchStrategy[duckarray] | None ..., 
        coords: ...,
        dims: ...,
        attrs: ...,
        name: ...,
    ) -> st.SearchStrategy[xr.DataArray]:
        """"""
        Hypothesis strategy for generating arbitrary DataArray objects.

        Parameters
        ----------
        data
            Can pass an absolute value of an appropriate type (i.e. `Variable`, `np.ndarray` etc.), 
            or pass a strategy which generates such types.
             Default is that the generated DataArray could contain any possible data.
        ...
        (similar flexibility for other constructor arguments)
        """"""
        ...
    ```

2) **Deliberately generate known edge cases**

    For instance deliberately create:
      - dimension coordinates, 
      - names which are Hashable but not strings, 
      - multi-indexes,
      - weird dtypes,
      - NaNs,
      - duckarrays instead of `np.ndarray`,
      - inconsistent chunking between different variables,
      - (any other ideas?)

3) **Be very modular internally, to help with ""keeping things local""**

   Each sub-strategy should be in its own function, so that hypothesis' decision tree can cut branches off as soon as possible.

4) **Avoid obvious inefficiencies**

   e.g. not `.filter(...)` or `assume(...)` if we can help it, and if we do need them then keep them in the same function that generates that data. Plus just keep all sizes small by default.

Perhaps the solutions implemented in #6894 or this [hypothesis xarray extension](https://github.com/uber/hypothesis-gufunc/blob/master/hypothesis_gufunc/extra/xr.py) already meet these criteria - I'm not sure. I just wanted a dedicated place to discuss building the strategies specifically, without it getting mixed in with complicated discussions about whatever we're trying to use the strategies for!","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6911/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1230247677,I_kwDOAMm_X85JVBb9,6585,Add example of apply_ufunc + dask.array.map_blocks to docs?,35968931,open,0,,,1,2022-05-09T21:02:43Z,2022-05-09T21:10:23Z,,MEMBER,,,,"### What is your issue?

A pattern I use fairly often is `apply_ufunc(..., dask=""allowed"")` calling a function wrapped with `dask.array.map_blocks`. This is necessary to use `apply_ufunc` with *chunked core dimensions*.

AFAIK this currently isn't discussed anywhere in the docs. A sensible place to add a recipe explaining this would be just after [this section](https://docs.xarray.dev/en/stable/examples/apply_ufunc_vectorize_1d.html#Parallelization-with-dask) in your notebook @dcherian ?

@rabernat @jbusecke this is the pattern we used in xGCM FYI","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6585/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
400289716,MDU6SXNzdWU0MDAyODk3MTY=,2686,Is `create_test_data()` public API?,35968931,open,0,,,3,2019-01-17T14:00:20Z,2022-04-09T01:48:14Z,,MEMBER,,,,"We want to encourage people to use and extend xarray, and we already provide testing functions as public API to help with this.

One function I keep using when writing code which uses xarray is `xarray.tests.test_dataset.create_test_data()`. This is very useful for quickly writing tests for the same reasons that it's useful in xarray's internal tests, but it's not explicitly public API. This means that there's no guarantee it won't change/disappear, which is [not ideal](https://github.com/boutproject/xBOUT/issues/26) if you're trying to write a test suite for separate software. But so many tests in xarray rely on it that presumably it's not going to get changed. 

Is there any reason why it shouldn't be public API? Is there something I should use instead?
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2686/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1042652334,I_kwDOAMm_X84-JZyu,5927,Release frequency,35968931,open,0,,,11,2021-11-02T17:53:57Z,2021-11-05T17:12:42Z,,MEMBER,,,,"In issuing the last 2 xarray releases, I've noticed a pattern, that goes something like this:
1) We don't have a release for 3+ months, for no particular reason.
2) Someone realises they want a release, to fix a bug or make a new feature available.
3) That person announces that they would like a release.
4) Lots of people (myself especially) suggest all sorts of unfinished issues that they think could or should go into that next release.
5) The dev team end up spending the better part of a week trying to finish up all of these miscellaneous PRs.
6) Finally it is deemed ""ready"" in some fairly arbitrary way.
7) The release is made manually using the ""16 easy steps"".
8) No-one wants to think about releasing again for another 3 months...

### Frequency

I mentioned this to @rabernat and he suggested that we should be releasing much more frequently.

If we released more regularly then we wouldn't have this effect of ""oh and we should try to squeeze XYZ into this release"".

I think the majority of the time xarray's CI is passing, and even when it's not it's only 1 tiny fix away from passing. That means that we in theory could release the `main` branch at practically any time, and it would be perfectly stable for users. (I personally exclusively use the most recent version of `main`.)

I also don't know of any downside to releasing very regularly (other than that someone has to issue the release).

**How about we try to release after each of the bi-weekly dev calls?** We could make it an official part of the call to end by saying:
- ""any reason why we can't release right now?""
- ""no, CI is passing"" 
- ""okay [person] volunteers to click the button right after this meeting""

That would immediately increase our release frequency by up to 6x.

### Automation

Can we automate any more steps of our release process? As far as I can tell the only steps that really need human intervention are 
- ""write the release summary"" and 
- ""check that all the automated stuff went as expected"". 

We could potentially still automate 
- ""add new section to the `whats-new.rst`"", 
- ""update the stable branch"", 
- ""update the active version of the docs"" (maybe?), and 
- ""email various mailing lists"".

@pydata/xarray thoughts?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5927/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1034238626,I_kwDOAMm_X849pTqi,5889,Release v0.20?,35968931,closed,0,,,13,2021-10-23T19:31:01Z,2021-11-02T18:38:50Z,2021-11-02T18:38:50Z,MEMBER,,,,"We should do another release soon. The last one was v0.19 on July 23rd, so it's been 3 months.

(In particular I personally want to get some small pint compatibility fixes released such as https://github.com/pydata/xarray/pull/5571 and https://github.com/pydata/xarray/pull/5886, so that the code in [this blog post](https://github.com/xarray-contrib/pint-xarray/pull/142) advertising pint-xarray integration all works.)

There's been plenty of changes since then, and there are more we could merge quite quickly. It's a breaking release because we changed some dependencies, so should be called `v0.20.0`.

@benbovy how does the ongoing index refactor stuff affect this release? Do we need to wait so it can all be announced? Can we release with merged index refactor stuff just silently sitting there? 

Small additions we could merge, feel free to suggest more @pydata/xarray :
- https://github.com/pydata/xarray/pull/5834
- https://github.com/pydata/xarray/pull/5662
- #5233
- #5900
- #5365
- #5845
- #5904
- #5911 
- #5905
- #5847
- #5916 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5889/reactions"", ""total_count"": 5, ""+1"": 5, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1020282789,I_kwDOAMm_X8480Eel,5843,Why are `da.chunks` and `ds.chunks` properties inconsistent?,35968931,closed,0,,,6,2021-10-07T17:21:01Z,2021-10-29T18:12:22Z,2021-10-29T18:12:22Z,MEMBER,,,,"Basically the title, but what I'm referring to is this:

```python
In [2]: da = xr.DataArray([[0, 1], [2, 3]], name='foo').chunk(1)

In [3]: ds = da.to_dataset()

In [4]: da.chunks
Out[4]: ((1, 1), (1, 1))

In [5]: ds.chunks
Out[5]: Frozen({'dim_0': (1, 1), 'dim_1': (1, 1)})
```

Why does `DataArray.chunks` return a tuple and `Dataset.chunks` return a frozen dictionary?

This seems a bit silly, for a few reasons:

1) it means that some perfectly reasonable code might fail unnecessarily if passed a DataArray instead of a Dataset or vice versa, such as

    ```python
    def is_core_dim_chunked(obj, core_dim):
        return len(obj.chunks[core_dim]) > 1
    ```
    which will work as intended for a dataset but raises a `TypeError` for a dataarray.

2) it breaks the pattern we use for `.sizes`, where

    ```python
    In [14]: da.sizes
    Out[14]: Frozen({'dim_0': 2, 'dim_1': 2})

    In [15]: ds.sizes
    Out[15]: Frozen({'dim_0': 2, 'dim_1': 2})
    ```

3) if you want the chunks as a tuple they are always accessible via `da.data.chunks`, which is a more sensible place to look to find the chunks without dimension names.

4) It's an undocumented difference, as the docstrings for `ds.chunks` and `da.chunks` both only say

    `""""""Block dimensions for this dataset’s data or None if it’s not a dask array.""""""`
    
    which doesn't tell me anything about the return type, or warn me that the return types are different.
    
    EDIT: In fact `DataArray.chunk` doesn't even appear to be listed on the API docs page at all.


In our codebase this difference is mostly washed out by us using `._to_temp_dataset()` all the time, and also by the way that the `.chunk()` method accepts both the tuple and dict form, so both of these invariants hold (but in different ways):

```
ds == ds.chunk(ds.chunks)
da == da.chunk(da.chunks)
```


I'm not sure whether making this consistent is worth the effort of a significant breaking change though :confused:

(Sort of related to https://github.com/pydata/xarray/issues/2103)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5843/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
939072049,MDU6SXNzdWU5MzkwNzIwNDk=,5587,Tolerance argument for `da.isin()`?,35968931,open,0,,,1,2021-07-07T16:39:42Z,2021-10-13T06:28:11Z,,MEMBER,,,,"**Is your feature request related to a problem? Please describe.**
Sometimes you want to check that data values are present in another array, but only up to a certain tolerance.

**Describe the solution you'd like**
`da.isin(test_values, tolerance=1e-6)`, where the tolerance argument is optional.

Not sure what the implementation should be but there are two vectorized [suggestions here](https://stackoverflow.com/a/51747164/3154101).

**Describe alternatives you've considered**
Different to `np.isclose` because `isin` compares all values against a flattened array, whereas `isclose` compares individual values elementwise.

**Additional context**
@jbusecke requested it.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5587/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
956103236,MDU6SXNzdWU5NTYxMDMyMzY=,5648,Duck array compatibility meeting,35968931,open,0,,,31,2021-07-29T18:31:52Z,2021-10-12T18:26:17Z,,MEMBER,,,,"## **Proposal: hold a high-level inter-library meeting to sort out roadblocks in the duck-array wrapping efforts.**

Whilst trying to get dask, pint and xarray all working nicely together, I couldn't help but notice there are [important issues](https://github.com/dask/dask/issues/6385) which conclude with a shared sentiment that ""we just need to make a decision as to what wraps what"" but since then have had essentially no codified consensus, and hence no progress for the past year. Multiply-nested duck-array wrapping is complicated and involves a lot of separate libraries (as this [graph of potential wrappings](https://pint.readthedocs.io/en/stable/numpy.html#Technical-Commentary) shows), but could be an amazingly powerful feature!

![image](https://user-images.githubusercontent.com/35968931/127528046-2999051e-b449-4d0b-a6c7-eaadb091d283.png)

I suggest that as asynchronous discussion hasn't moved this forward, we should instead hold a (hopefully one-off) meeting to make these high-level design decisions.

I'm happy to arrange the meeting, but for this to work we ideally need attendees who understand the issues from the perspective of each of the main libraries involved - some suggestions:
- xarray (@shoyer and @keewis)
- dask (@mrocklin?)
- pint (@jthielen)
- cupy? (@jacobtomlinson?)
- sparse? (@crusaderky?)
- pytorch?? (@rgommers??)

### Possible Agenda (please suggest additions!):

- Which libraries should wrap which other libraries
- Repo/NEP/etc. for standardizing wrapping order and other future decisions
- Outstanding issues to tackle first

### Background reading

- Basic idea of the numpy dispatch mechanism explained in a [blog post](https://blog.christianperone.com/2019/07/numpy-dispatcher-when-numpy-becomes-a-protocol-for-an-ecosystem/)
- @jthielen 's excellent [overview comment](https://github.com/numpy/numpy/pull/16022#issuecomment-655598493), with links to relevant NEP's 
- Pint's [technical commentary](https://pint.readthedocs.io/en/stable/numpy.html#Technical-Commentary) on array type support 

### Some related issues (there are many more - please add)

- https://github.com/pydata/xarray/issues/5559
- https://github.com/pydata/xarray/issues/3950
- dask/dask#5329
- dask/dask#6637
- dask/dask#6636
- dask/dask#6635","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5648/reactions"", ""total_count"": 9, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 5, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
935062144,MDU6SXNzdWU5MzUwNjIxNDQ=,5559,UserWarning when wrapping pint & dask arrays together,35968931,closed,0,,,4,2021-07-01T17:25:03Z,2021-09-29T17:48:39Z,2021-09-29T17:48:39Z,MEMBER,,,,"With `pint-xarray` you can create a chunked, unit-aware xarray object, but calling a calculation method and then computing doesn't appear to behave as hoped.

```python
da = xr.DataArray([1,2,3], attrs={'units': 'metres'})

chunked = da.chunk(1).pint.quantify()
```

```python
print(chunked.compute())
```
```
<xarray.DataArray (dim_0: 3)>
<Quantity([1 2 3], 'meter')>
Dimensions without coordinates: dim_0
```
So far this is fine, but if we try to take a mean before computing we get

```python
print(chunked.mean().compute())
```
```
<xarray.DataArray ()>
<Quantity(dask.array<true_divide, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>, 'meter')>
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/dask/array/core.py:3139: UserWarning: Passing an object to dask.array.from_array which is already a Dask collection. This can lead to unexpected behavior.
  warnings.warn(
```
This is not correct: as well as the UserWarning, the return value of compute is a dask array, meaning we need to compute a second time to actually get the answer:
```python
print(chunked.mean().compute().compute())
```
```
<xarray.DataArray ()>
<Quantity(2.0, 'meter')>
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/dask/array/core.py:3139: UserWarning: Passing an object to dask.array.from_array which is already a Dask collection. This can lead to unexpected behavior.
  warnings.warn(
```

If we try chunking the other way (`chunked = da.pint.quantify().pint.chunk(1)`) then we get all the same results.

xref https://github.com/xarray-contrib/pint-xarray/issues/116 and https://github.com/pydata/xarray/pull/4972 @keewis 


","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5559/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
940054482,MDU6SXNzdWU5NDAwNTQ0ODI=,5588,Release v0.19?,35968931,closed,0,,,15,2021-07-08T17:00:26Z,2021-07-23T23:15:39Z,2021-07-23T21:12:53Z,MEMBER,,,,"Yesterday in the dev call we discussed the need for another release. Not sure if this should be a bugfix release (i.e. v0.18.3) or a full release (i.e. v0.19). Last release (v0.18.2) was 19th May, with v0.18.0 on 6th May.

@pydata/xarray

Bug fixes:

- #5581 and the fix #5359 (this one needs to be released soon really)
- #5528
- Probably various smaller ones

New features:

- #4696
- #5514
- #5476
- #5464
- #5445

Internal:
- `master` -> `main` #5520
- #5506

Nice to merge first?:

- [x] #5568 and #5561
- [ ] #5571
- [x] #5586
- [ ] #5493
- [x] #4909
- [ ] #5580
- [ ] #4863
- [ ] #5501","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5588/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
446054247,MDU6SXNzdWU0NDYwNTQyNDc=,2975,Inconsistent/confusing behaviour when concatenating dimension coords,35968931,open,0,,,2,2019-05-20T11:01:37Z,2021-07-08T17:42:52Z,,MEMBER,,,,"I noticed that with multiple conflicting dimension coords then concat can give pretty weird/counterintuitive results, at least compared to what the documentation suggests they should give:

```python
# Create two datasets with conflicting coordinates
objs = [Dataset({'x': [0], 'y': [1]}), Dataset({'y': [0], 'x': [1]})]

[<xarray.Dataset>
 Dimensions:  (x: 1, y: 1)
 Coordinates:
   * x        (x) int64 0
   * y        (y) int64 1
 Data variables:
     *empty*, 
<xarray.Dataset>
 Dimensions:  (x: 1, y: 1)
 Coordinates:
   * y        (y) int64 0
   * x        (x) int64 1
 Data variables:
     *empty*]
```

```python
# Try to join along only 'x',
# coords='minimal' so concatenate ""Only coordinates in which the dimension already appears""
concat(objs, dim='x', coords='minimal') 

<xarray.Dataset>
Dimensions:  (x: 2, y: 2)
Coordinates:
  * y        (y) int64 0 1
  * x        (x) int64 0 1
Data variables:
    *empty*

# It's joined along x and y! 
```

Based on my reading of the [docstring for concat](http://xarray.pydata.org/en/stable/generated/xarray.concat.html), I would have expected this to not attempt to concatenate y, because `coords='minimal'`, and instead to throw an error because 'y' is a ""non-concatenated variable"" whose values are not the same across datasets.

Now let's try to get concat to broadcast 'y' across 'x':

```python
# Try to join along only 'x' by setting coords='different'
concat(objs, dim='x', coords='different') 
```

Now as ""Data variables which are not equal (ignoring attributes) across all datasets are also concatenated"" then I would have expected 'y' to be concatenated across 'x', i.e. to add the 'x' dimension to the 'y' coord, i.e:

```python
<xarray.Dataset>
Dimensions:  (x: 2, y: 1)
Coordinates:
  * y        (y, x) int64 1 0
  * x        (x)    int64 0 1
Data variables:
    *empty*
```
But that's not what we get!:
```
<xarray.Dataset>
Dimensions:  (x: 2, y: 2)
Coordinates:
  * y        (y) int64 0 1
  * x        (x) int64 0 1
Data variables:
    *empty*
```

### Same again but without dimension coords

If we create the same sort of objects but the variables are data vars not coords, then everything behaves exactly as expected:

```python
objs2 = [Dataset({'a': ('x', [0]), 'b': ('y', [1])}), Dataset({'a': ('x', [1]), 'b': ('y', [0])})]

[<xarray.Dataset>
 Dimensions:  (x: 1, y: 1)
 Dimensions without coordinates: x, y
 Data variables:
     a        (x) int64 0
     b        (y) int64 1, 
<xarray.Dataset>
 Dimensions:  (x: 1, y: 1)
 Dimensions without coordinates: x, y
 Data variables:
     a        (x) int64 1
     b        (y) int64 0]

concat(objs2, dim='x', data_vars='minimal')

ValueError: variable b not equal across datasets

concat(objs2, dim='x', data_vars='different')

<xarray.Dataset>
Dimensions:  (x: 2, y: 1)
Dimensions without coordinates: x, y
Data variables:
    a        (x) int64 0 1
    b        (x, y) int64 1 0
```

Also if you do the same again but with coordinates which are not dimension coords, i.e:

```python
objs3 = [Dataset(coords={'a': ('x', [0]), 'b': ('y', [1])}), Dataset(coords={'a': ('x', [1]), 'b': ('y', [0])})]

[<xarray.Dataset>
 Dimensions:  (x: 1, y: 1)
 Coordinates:
     a        (x) int64 0
     b        (y) int64 1
 Dimensions without coordinates: x, y
 Data variables:
     *empty*, 
<xarray.Dataset>
 Dimensions:  (x: 1, y: 1)
 Coordinates:
     a        (x) int64 1
     b        (y) int64 0
 Dimensions without coordinates: x, y
 Data variables:
     *empty*]
```
then this again gives the expected concatenation behaviour.

So this implies that the compatibility checks that are being done on the data vars are not being done on the coords, but only if they are dimension coordinates! 

Either this is not the desired behaviour or the concat docstring needs to be a lot clearer. If we agree that this is not the desired behaviour then I will have a look inside `concat` to work out why it's happening.

EDIT: Presumably this has something to do with the ToDo in the code for `concat`: `# TODO: support concatenating scalar coordinates even if the concatenated dimension already exists`...","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2975/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
936305081,MDU6SXNzdWU5MzYzMDUwODE=,5570,assert_equal does not handle wrapped duck arrays well,35968931,open,0,,,0,2021-07-03T18:27:11Z,2021-07-03T18:49:57Z,,MEMBER,,,,"Whilst trying to fix #5559 I noticed that `xarray.testing.assert_equal` (and `xarray.testing.assert_equal`) don't behave well with wrapped duck-typed arrays.

Firstly, they can give unhelpful `AssertionError` messages:

```python
In [5]: a = np.array([1,2,3])

In [6]: q = pint.Quantity([1,2,3], units='m')

In [7]: da_np = xr.DataArray(a, dims='x')

In [8]: da_p = xr.DataArray(q, dims='x')

In [9]: da_np
Out[9]: 
<xarray.DataArray (x: 3)>
array([1, 2, 3])
Dimensions without coordinates: x

In [10]: da_p
Out[10]: 
<xarray.DataArray (x: 3)>
<Quantity([1 2 3], 'meter')>
Dimensions without coordinates: x

In [11]: from xarray.testing import assert_equal

In [12]: assert_equal(da_np, da_p)
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/xarray/core/duck_array_ops.py:265: UnitStrippedWarning: The unit of the quantity is stripped when downcasting to ndarray.
  flag_array = (arr1 == arr2) | (isnull(arr1) & isnull(arr2))
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/xarray/core/duck_array_ops.py:265: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
  flag_array = (arr1 == arr2) | (isnull(arr1) & isnull(arr2))
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/xarray/core/duck_array_ops.py:265: UnitStrippedWarning: The unit of the quantity is stripped when downcasting to ndarray.
  flag_array = (arr1 == arr2) | (isnull(arr1) & isnull(arr2))
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/xarray/core/duck_array_ops.py:265: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
  flag_array = (arr1 == arr2) | (isnull(arr1) & isnull(arr2))
/home/tegn500/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/numpy/core/_asarray.py:102: UnitStrippedWarning: The unit of the quantity is stripped when downcasting to ndarray.
  return array(a, dtype, copy=False, order=order)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-12-33b16d6b79ed> in <module>
----> 1 assert_equal(da_np, da_p)

    [... skipping hidden 1 frame]

~/miniconda3/envs/py38-mamba/lib/python3.8/site-packages/xarray/testing.py in assert_equal(a, b)
     79     assert type(a) == type(b)
     80     if isinstance(a, (Variable, DataArray)):
---> 81         assert a.equals(b), formatting.diff_array_repr(a, b, ""equals"")
     82     elif isinstance(a, Dataset):
     83         assert a.equals(b), formatting.diff_dataset_repr(a, b, ""equals"")

AssertionError: Left and right DataArray objects are not equal

Differing values:
L
    array([1, 2, 3])
R
    array([1, 2, 3])
```
These are different, but not because the array values are different. At the moment `.values` is converting the wrapped array type by stripping the units too - it might be better to check the type of the wrapped array first, then use `.values` to compare. Or could we even do duck-typed testing by delegating via `expected.data.equals(actual.data)`? (EDIT: I don't think a `.equals()` method exists in the numpy API, but you could do the equivalent of `assert all(expected.data == actual.data)`

Secondly, given that we coerce before comparison, I think it's possible that `assert_equal` could say two different wrapped duck-type arrays are equal when they are not, just because `np.asarray()` coerces them to the same values.

EDIT2: Looks like there is some discussion [here](https://github.com/pydata/xarray/pull/3706#issuecomment-583259053)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5570/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
911663002,MDU6SXNzdWU5MTE2NjMwMDI=,5438,Add Union Operators for Dataset,35968931,closed,0,,,2,2021-06-04T16:21:06Z,2021-06-04T16:35:36Z,2021-06-04T16:35:36Z,MEMBER,,,,"As of python 3.9, python dictionaries now support being merged via
```python
c = a | b
```
and updated via
```python
c = a |= b
```
see [PEP 584](https://www.python.org/dev/peps/pep-0584/#abstract).

`xarray.Dataset` is dict-like, so it would make sense to support the same syntax for merging. The way to achieve that is by adding new dunder methods to `xarray.Dataset`, something like

```python
def __or__(self, other):
    if not isinstance(other, xr.Dataset):
        return NotImplemented
    new = xr.merge(self, other)
    return new

def __ror__(self, other):
    if not isinstance(other, xr.Dataset):
        return NotImplemented
    new = xr.merge(self, other)
    return new

def __ior__(self, other):
    self.merge(other)
    return self
```

The distinction between the intent of these different operators is whether a new object is returned or the original object is updated.

This would allow things like `(ds1 | ds2).to_netcdf()`

(This feature doesn't require python 3.9, it merely echoes a feature that is only available in 3.9+)
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5438/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
871111282,MDU6SXNzdWU4NzExMTEyODI=,5236,Error collecting tests due to optional pint import,35968931,closed,0,,,2,2021-04-29T15:01:13Z,2021-04-29T15:32:08Z,2021-04-29T15:32:08Z,MEMBER,,,,"When I try to run xarray's test suite locally with pytest I've suddenly started getting this weird error:

```
(xarray-dev) tegn500@fusion192:~/Documents/Work/Code/xarray$ pytest xarray/tests/test_backends.py
==================================================================================== test session starts =====================================================================================
platform linux -- Python 3.9.2, pytest-6.2.3, py-1.10.0, pluggy-0.13.1
rootdir: /home/tegn500/Documents/Work/Code/xarray, configfile: setup.cfg
collected 0 items / 1 error                                                                                                                                                                  

=========================================================================================== ERRORS ===========================================================================================
_______________________________________________________________________ ERROR collecting xarray/tests/test_backends.py _______________________________________________________________________
../../../../anaconda3/envs/xarray-dev/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1030: in _gcd_import
    ???
<frozen importlib._bootstrap>:1007: in _find_and_load
    ???
<frozen importlib._bootstrap>:972: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:228: in _call_with_frames_removed
    ???
<frozen importlib._bootstrap>:1030: in _gcd_import
    ???
<frozen importlib._bootstrap>:1007: in _find_and_load
    ???
<frozen importlib._bootstrap>:986: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:680: in _load_unlocked
    ???
<frozen importlib._bootstrap_external>:790: in exec_module
    ???
<frozen importlib._bootstrap>:228: in _call_with_frames_removed
    ???
xarray/tests/__init__.py:84: in <module>
    has_pint_0_15, requires_pint_0_15 = _importorskip(""pint"", minversion=""0.15"")
xarray/tests/__init__.py:46: in _importorskip
    if LooseVersion(mod.__version__) < LooseVersion(minversion):
E   AttributeError: module 'pint' has no attribute '__version__'
================================================================================== short test summary info ===================================================================================
ERROR xarray/tests/test_backends.py - AttributeError: module 'pint' has no attribute '__version__'
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
====================================================================================== 1 error in 0.88s ======================================================================================
```

I'm not sure whether this is my fault or a problem with xarray somehow. @keewis have you seen this happen before? This is with a fresh conda environment, running locally on my laptop, and on python 3.9.2. Pint isn't even in this environment. I can force it to proceed with the tests by also catching the attribute error, i.e.

```python
def _importorskip(modname, minversion=None):
    try:
        mod = importlib.import_module(modname)
        has = True
        if minversion is not None:
            if LooseVersion(mod.__version__) < LooseVersion(minversion):
                raise ImportError(""Minimum version not satisfied"")
    except (ImportError, AttributeError):
        has = False
```

but I obviously shouldn't need to do that. Any ideas? 

**Environment**:

<details><summary>Output of <tt>xr.show_versions()</tt></summary>

INSTALLED VERSIONS
------------------
commit: a5e72c9aacbf26936844840b75dd59fe7d13f1e6
python: 3.9.2 | packaged by conda-forge | (default, Feb 21 2021, 05:02:46) 
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 4.8.10-040810-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.8.0

xarray: 0.15.2.dev545+ga5e72c9
pandas: 1.2.4
numpy: 1.20.2
scipy: 1.6.3
netCDF4: 1.5.6
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.8.1
cftime: 1.4.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.04.1
distributed: 2021.04.1
matplotlib: 3.4.1
cartopy: installed
seaborn: None
numbagg: None
pint: installed
setuptools: 49.6.0.post20210108
pip: 21.1
conda: None
pytest: 6.2.3
IPython: None
sphinx: None

</details>


**Conda Environment**:

<details><summary>Output of <tt>conda list</tt></summary>

# packages in environment at /home/tegn500/anaconda3/envs/xarray-dev:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
alsa-lib                  1.2.3                h516909a_0    conda-forge
asciitree                 0.3.3                      py_2    conda-forge
attrs                     20.3.0             pyhd3deb0d_0    conda-forge
bokeh                     2.3.1            py39hf3d152e_0    conda-forge
bottleneck                1.3.2            py39hce5d2b2_3    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.17.1               h7f98852_1    conda-forge
ca-certificates           2020.12.5            ha878542_0    conda-forge
certifi                   2020.12.5        py39hf3d152e_1    conda-forge
cftime                    1.4.1            py39hce5d2b2_0    conda-forge
click                     7.1.2              pyh9f0ad1d_0    conda-forge
cloudpickle               1.6.0                      py_0    conda-forge
curl                      7.76.1               h979ede3_1    conda-forge
cycler                    0.10.0                     py_2    conda-forge
cytoolz                   0.11.0           py39h3811e60_3    conda-forge
dask                      2021.4.1           pyhd8ed1ab_0    conda-forge
dask-core                 2021.4.1           pyhd8ed1ab_0    conda-forge
dbus                      1.13.6               h48d8840_2    conda-forge
distributed               2021.4.1         py39hf3d152e_0    conda-forge
expat                     2.3.0                h9c3ff4c_0    conda-forge
fasteners                 0.14.1                     py_3    conda-forge
fontconfig                2.13.1            hba837de_1005    conda-forge
freetype                  2.10.4               h0708190_1    conda-forge
fsspec                    2021.4.0           pyhd8ed1ab_0    conda-forge
gettext                   0.19.8.1          h0b5b191_1005    conda-forge
glib                      2.68.1               h9c3ff4c_0    conda-forge
glib-tools                2.68.1               h9c3ff4c_0    conda-forge
gst-plugins-base          1.18.4               hf529b03_2    conda-forge
gstreamer                 1.18.4               h76c114f_2    conda-forge
hdf4                      4.2.13            h10796ff_1005    conda-forge
hdf5                      1.10.6          nompi_h6a2412b_1114    conda-forge
heapdict                  1.0.1                      py_0    conda-forge
icu                       68.1                 h58526e2_0    conda-forge
iniconfig                 1.1.1              pyh9f0ad1d_0    conda-forge
jinja2                    2.11.3             pyh44b312d_0    conda-forge
jpeg                      9d                   h36c2ea0_0    conda-forge
kiwisolver                1.3.1            py39h1a9c180_1    conda-forge
krb5                      1.17.2               h926e7f8_0    conda-forge
lcms2                     2.12                 hddcbb42_0    conda-forge
ld_impl_linux-64          2.35.1               hea4e1c9_2    conda-forge
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
libclang                  11.1.0          default_ha53f305_0    conda-forge
libcurl                   7.76.1               hc4aaa36_1    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libevent                  2.1.10               hcdb4288_3    conda-forge
libffi                    3.3                  h58526e2_2    conda-forge
libgcc-ng                 9.3.0               h2828fa1_19    conda-forge
libgfortran-ng            9.3.0               hff62375_19    conda-forge
libgfortran5              9.3.0               hff62375_19    conda-forge
libglib                   2.68.1               h3e27bee_0    conda-forge
libgomp                   9.3.0               h2828fa1_19    conda-forge
libiconv                  1.16                 h516909a_0    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
libllvm11                 11.1.0               hf817b99_2    conda-forge
libnetcdf                 4.8.0           nompi_hfa85936_101    conda-forge
libnghttp2                1.43.0               h812cca2_0    conda-forge
libogg                    1.3.4                h7f98852_1    conda-forge
libopenblas               0.3.12          pthreads_h4812303_1    conda-forge
libopus                   1.3.1                h7f98852_1    conda-forge
libpng                    1.6.37               h21135ba_2    conda-forge
libpq                     13.2                 hfd2b0eb_2    conda-forge
libssh2                   1.9.0                ha56f1ee_6    conda-forge
libstdcxx-ng              9.3.0               h6de172a_19    conda-forge
libtiff                   4.2.0                hdc55705_1    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libvorbis                 1.3.7                h9c3ff4c_0    conda-forge
libwebp-base              1.2.0                h7f98852_2    conda-forge
libxcb                    1.13              h7f98852_1003    conda-forge
libxkbcommon              1.0.3                he3ba5ed_0    conda-forge
libxml2                   2.9.10               h72842e0_4    conda-forge
libzip                    1.7.3                h4de3113_0    conda-forge
locket                    0.2.0                      py_2    conda-forge
lz4-c                     1.9.3                h9c3ff4c_0    conda-forge
markupsafe                1.1.1            py39h3811e60_3    conda-forge
matplotlib                3.4.1            py39hf3d152e_0    conda-forge
matplotlib-base           3.4.1            py39h2fa2bec_0    conda-forge
monotonic                 1.5                        py_0    conda-forge
more-itertools            8.7.0              pyhd8ed1ab_1    conda-forge
msgpack-python            1.0.2            py39h1a9c180_1    conda-forge
mysql-common              8.0.23               ha770c72_1    conda-forge
mysql-libs                8.0.23               h935591d_1    conda-forge
ncurses                   6.2                  h58526e2_4    conda-forge
netcdf4                   1.5.6           nompi_py39hc6dca20_103    conda-forge
nspr                      4.30                 h9c3ff4c_0    conda-forge
nss                       3.64                 hb5efdd6_0    conda-forge
numcodecs                 0.7.3            py39he80948d_0    conda-forge
numpy                     1.20.2           py39hdbf815f_0    conda-forge
olefile                   0.46               pyh9f0ad1d_1    conda-forge
openjpeg                  2.4.0                hf7af979_0    conda-forge
openssl                   1.1.1k               h7f98852_0    conda-forge
packaging                 20.9               pyh44b312d_0    conda-forge
pandas                    1.2.4            py39hde0f152_0    conda-forge
partd                     1.2.0              pyhd8ed1ab_0    conda-forge
pcre                      8.44                 he1b5a44_0    conda-forge
pillow                    8.1.2            py39hf95b381_1    conda-forge
pip                       21.1               pyhd8ed1ab_0    conda-forge
pluggy                    0.13.1           py39hf3d152e_4    conda-forge
psutil                    5.8.0            py39h3811e60_1    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
py                        1.10.0             pyhd3deb0d_0    conda-forge
pyparsing                 2.4.7              pyh9f0ad1d_0    conda-forge
pyqt                      5.12.3           py39hf3d152e_7    conda-forge
pyqt-impl                 5.12.3           py39h0fcd23e_7    conda-forge
pyqt5-sip                 4.19.18          py39he80948d_7    conda-forge
pyqtchart                 5.12             py39h0fcd23e_7    conda-forge
pyqtwebengine             5.12.1           py39h0fcd23e_7    conda-forge
pytest                    6.2.3            py39hf3d152e_0    conda-forge
python                    3.9.2           hffdb5ce_0_cpython    conda-forge
python-dateutil           2.8.1                      py_0    conda-forge
python_abi                3.9                      1_cp39    conda-forge
pytz                      2021.1             pyhd8ed1ab_0    conda-forge
pyyaml                    5.4.1            py39h3811e60_0    conda-forge
qt                        5.12.9               hda022c4_4    conda-forge
readline                  8.1                  h46c0cb4_0    conda-forge
scipy                     1.6.3            py39hee8e79c_0    conda-forge
setuptools                49.6.0           py39hf3d152e_3    conda-forge
six                       1.15.0             pyh9f0ad1d_0    conda-forge
sortedcontainers          2.3.0              pyhd8ed1ab_0    conda-forge
sqlite                    3.35.5               h74cdb3f_0    conda-forge
tblib                     1.7.0              pyhd8ed1ab_0    conda-forge
tk                        8.6.10               h21135ba_1    conda-forge
toml                      0.10.2             pyhd8ed1ab_0    conda-forge
toolz                     0.11.1                     py_0    conda-forge
tornado                   6.1              py39h3811e60_1    conda-forge
typing_extensions         3.7.4.3                    py_0    conda-forge
tzdata                    2021a                he74cb21_0    conda-forge
wheel                     0.36.2             pyhd3deb0d_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
yaml                      0.2.5                h516909a_0    conda-forge
zarr                      2.8.1              pyhd8ed1ab_0    conda-forge
zict                      2.0.0                      py_0    conda-forge
zlib                      1.2.11            h516909a_1010    conda-forge
zstd                      1.4.9                ha95c52a_0    conda-forge

</details>

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5236/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
671609109,MDU6SXNzdWU2NzE2MDkxMDk=,4300,General curve fitting method,35968931,closed,0,,,9,2020-08-02T12:35:49Z,2021-03-31T16:55:53Z,2021-03-31T16:55:53Z,MEMBER,,,,"Xarray should have a general curve-fitting function as part of its main API.

## Motivation

Yesterday I wanted to fit a simple decaying exponential function to the data in a DataArray and realised there currently isn't an immediate way to do this in xarray. You have to either pull out the `.values` (losing the power of dask), or use `apply_ufunc` (complicated).

This is an incredibly common, domain-agnostic task, so although I don't think we should support various kinds of unusual optimisation procedures (which could always go in an extension package instead), I think a basic fitting method is within scope for the main library. There are [SO questions](https://stackoverflow.com/questions/62987617/using-scipy-curve-fit-with-dask-xarray) asking how to achieve this.

We already have [`.polyfit` and `polyval` anyway](https://github.com/pydata/xarray/pull/3733/files#), which are more specific. (@AndrewWilliams3142 and @aulemahal I expect you will have thoughts on how implement this generally.)

## Proposed syntax

I want something like this to work:

```python
def exponential_decay(xdata, A=10, L=5):
    return A*np.exp(-xdata/L)

# returns a dataset containing the optimised values of each parameter
fitted_params = da.fit(exponential_decay)

fitted_line = exponential_decay(da.x, A=fitted_params['A'], L=fitted_params['L'])

# Compare
da.plot(ax)
fitted_line.plot(ax)
```

It would also be nice to be able to fit in multiple dimensions. That means both for example fitting a 2D function to 2D data:

```python
def hat(xdata, ydata, h=2, r0=1):
    r = xdata**2 + ydata**2
    return h*np.exp(-r/r0)

fitted_params = da.fit(hat)

fitted_hat = hat(da.x, da.y, h=fitted_params['h'], r0=fitted_params['r0'])
```

but also repeatedly fitting a 1D function to 2D data:

```python
# da now has a y dimension too
fitted_params = da.fit(exponential_decay, fit_along=['x'])

# As fitted_params now has y-dependence, broadcasting means fitted_lines does too
fitted_lines = exponential_decay(da.x, A=fitted_params.A, L=fitted_params.L)
```
The latter would be useful for fitting the same curve to multiple model runs, but means we need some kind of `fit_along` or `dim` argument, which would default to all dims.

So the method docstring would end up like
```python
def fit(self, f, fit_along=None, skipna=None, full=False, cov=False):
    """"""
    Fits the function f to the DataArray.

    Expects the function f to have a signature like
    `result = f(*coords, **params)`
    for example
    `result_da = f(da.xcoord, da.ycoord, da.zcoord, A=5, B=None)`
    The names of the `**params` kwargs will be used to name the output variables.

    Returns
    -------
    fit_results - A single dataset which contains the variables (for each parameter in the fitting function):
    `param1`
        The optimised fit coefficients for parameter one.
    `param1_residuals`
        The residuals of the fit for parameter one.
    ...
    """"""

```

## Questions

1) Should it wrap `scipy.optimise.curve_fit`, or reimplement it? 

    Wrapping it is simpler, but as it just calls `least_squares` [under the hood](https://github.com/scipy/scipy/blob/v1.5.2/scipy/optimize/minpack.py#L532-L834) then reimplementing it would mean we could use the dask-powered version of `least_squares` (like [`da.polyfit does`](https://github.com/pydata/xarray/blob/9058114f70d07ef04654d1d60718442d0555b84b/xarray/core/dataset.py#L5987)).

2) What form should we expect the curve-defining function to come in?

    `scipy.optimize.curve_fit` expects the curve to act as `ydata = f(xdata, *params) + eps`, but in xarray then `xdata` could be one or multiple coords or dims, not necessarily a single array. Might it work to require a signature like `result_da = f(da.xcoord, da.ycoord, da.zcoord, ..., **params)`? Then the `.fit` method would be work out how many coords to pass to `f` based on the dimension of the `da` and the `fit_along` argument. But then the order of coord arguments in the signature of `f` would matter, which doesn't seem very xarray-like.

3) Is it okay to inspect parameters of the curve-defining function?

    If we tell the user the curve-defining function has to have a signature like `da = func(*coords, **params)`, then we could read the names of the parameters by inspecting the function kwargs. Is that a good idea or might it end up being unreliable? Is the `inspect` standard library module the right thing to use for that? This could also be used to provide default guesses for the fitting parameters.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4300/reactions"", ""total_count"": 4, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 1}",,completed,13221727,issue
604218952,MDU6SXNzdWU2MDQyMTg5NTI=,3992,"DataArray.integrate has a 'dim' arg, but Dataset.integrate has a 'coord' arg",35968931,closed,0,,,1,2020-04-21T19:12:03Z,2021-01-29T22:59:30Z,2021-01-29T22:59:30Z,MEMBER,,,,"This is just a minor gripe but I think it should be fixed.

The API syntax is inconsistent:
```python
ds.differentiate(coord='x')
da.differentiate(coord='x')
ds.integrate(coord='x')
da.integrate(dim='x')   # why dim??
```
It should definitely be `coord` - IMO it doesn't make sense to integrate or differentiate over a dim because a dim by definition has no information about the distance between grid points. I think because the distinction between dims and coords is one of the things that new users have to learn about, we should be strict to not confuse up the meanings in the documentation/API.

The discussion on the original PR [seems to agree](https://github.com/pydata/xarray/pull/2653#discussion_r246164990), so I think this was just an small oversight.

The only question is whether it requires a deprecation cycle?
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3992/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
453126577,MDU6SXNzdWU0NTMxMjY1Nzc=,3002,plot.pcolormesh fails with shading='gouraud',35968931,closed,0,,,5,2019-06-06T16:27:00Z,2020-11-29T16:28:32Z,2019-06-06T22:26:35Z,MEMBER,,,,"`xarray.plot.pcolormesh()` fails when you pass the `matplotlib.pyplot.pcolormesh()` keyword argument `shading='gouraud'` to it.

#### Code Sample, a copy-pastable example if possible

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

lon, lat = np.meshgrid(np.linspace(-20, 20, 5), np.linspace(0, 30, 4))
lon += lat/10
lat += lon/10

da = xr.DataArray(np.arange(20).reshape(4, 5), dims=['y', 'x'],
                  coords = {'lat': (('y', 'x'), lat),
                            'lon': (('y', 'x'), lon)})

da.plot.pcolormesh('lon', 'lat', shading='gouraud')
plt.show()
```

#### Problem description

This gives an error:

```
Traceback (most recent call last):
  File ""mwe.py"", line 17, in <module>
    da.plot.pcolormesh('lon', 'lat', shading='gouraud')
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/plot/plot.py"", line 721, in plotmethod
    return newplotfunc(**allargs)
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/plot/plot.py"", line 662, in newplotfunc
    **kwargs)
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/plot/plot.py"", line 864, in pcolormesh
    primitive = ax.pcolormesh(x, y, z, **kwargs)
  File ""/home/tegn500/anaconda3/envs/py36/lib/python3.6/site-packages/matplotlib/__init__.py"", line 1805, in inner
    return func(ax, *args, **kwargs)
  File ""/home/tegn500/anaconda3/envs/py36/lib/python3.6/site-packages/matplotlib/axes/_axes.py"", line 5971, in pcolormesh
    X, Y, C = self._pcolorargs('pcolormesh', *args, allmatch=allmatch)
  File ""/home/tegn500/anaconda3/envs/py36/lib/python3.6/site-packages/matplotlib/axes/_axes.py"", line 5559, in _pcolorargs
    C.shape, Nx, Ny, funcname))
TypeError: Dimensions of C (4, 5) are incompatible with X (6) and/or Y (5); see help(pcolormesh)
```

#### Expected Output

This should give almost the same image as in the documentation, just with smoother shading:

![Figure_1](https://user-images.githubusercontent.com/35968931/59049474-f68a6580-887f-11e9-83db-697c38acdf5e.png)

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3002/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
349026158,MDU6SXNzdWUzNDkwMjYxNTg=,2355,Animated plots - a suggestion for implementation,35968931,closed,0,,,9,2018-08-09T08:23:17Z,2020-08-16T08:07:12Z,2020-08-16T08:07:12Z,MEMBER,,,,"**It'd be awesome if one could animate the plots xarray creates using matplotlib just by specifying the dimension over which to animate the plot.**

This would allow for rapid visualisation of time-evolving data and could potentially be very powerful (imagine a grid of faceted 2d plots, all evolving together over time). I know that there are already some libraries which can create animated plots of xarray data (e.g. Holoviews), but I think that it's within xarray's scope (#2030) to add another dimension to its default matplotlib-style plotting capabilities.

**How?**

I saw this new package for making it easier to animate matplotlib plots using the funcanimation module: [animatplot](https://github.com/t-makaro/animatplot). It essentially works by wrapping matplotlib commands like `plt.imshow()` to instead return ""blocks"". These blocks can then be animated by feeding them into an `animation` class. An introductory script to plot line data can be found [here](https://animatplot.readthedocs.io/en/latest/tutorial/getting_started..html), but basically has the form

```python
import animatplot as amp
import matplotlib.pyplot as plt

X, Y = load_data_somehow
block = amp.blocks.Line(X, Y)
anim = amp.Animation([block])

anim.save_gif(""animated_line"") 
plt.show()
```

which creates a basic gif like this: ![animated line gif](https://user-images.githubusercontent.com/35968931/43885402-a3373002-9b6d-11e8-9b3d-f4e588a71a22.gif)

I think that it might be possible to integrate this kind of animation-plotting tool by adding an optional dimension argument to xarray's plotting methods, which if given causes the function to call the wrapped animatplot plotting command instead of the bare matplotlib one. It would then return the corresponding ""block"" ready to be animated. Using the resulting code might only require a few lines to create an impressive visualisation:

```python
turb2d = xr.load_dataset(""turbulent_fluid_data.nc"")

block = turb2d[""density""].plot.imshow(animate_over='time')
anim = Animation([block])

anim.save_gif(""fluid_density.gif"") 
plt.show()
``` 

![n_over_time](https://user-images.githubusercontent.com/35968931/43887058-83d4161c-9b72-11e8-978d-fcb8e071a37a.gif)


**What would need changing?**

If we take the `da.plot.imshow()` example, then the way I'm imagining this would be done is to add the optional argument `animate_over` to the `plot_2d` decorator, and use it to choose between returning the matplotlib artist (as it does currently) or the ""block"". It would also mean altering the logic inside `plot_2d` and `imshow` to account for the fact you would be calling this on a 3D dataarray instead of a 2D one.

I wanted to ask about this before delving into the code too much or submitting a pull request, in case  there is some problem with the idea. What do you think?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2355/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
332987740,MDU6SXNzdWUzMzI5ODc3NDA=,2235,Adding surface plot for 2D data,35968931,closed,0,,,2,2018-06-16T13:36:10Z,2020-06-17T04:49:50Z,2020-06-17T04:49:50Z,MEMBER,,,,"I am interested in adding the ability to plot [surface plots](https://matplotlib.org/mpl_toolkits/mplot3d/tutorial.html#surface-plots) of 2D xarray data using matplotlib's 3D plotting function `plot_surface()`.

This would be nice because a surface in 3D is much more useful for showing certain features of 2D data then color plots are. For example an outlier would appear as an obvious spike rather than just a single bright point as it would when using `plot.imshow()`. I'm not suggesting adding full 3D plotting capability, just the ability to visualise 2D data as a surface in 3D.

The code would end up allowing you to just call `xr.Dataarray.plot.surface()` to create something like this  example from [here](https://matplotlib.org/mpl_toolkits/mplot3d/tutorial.html#surface-plots) ([code here](https://matplotlib.org/mpl_examples/mplot3d/surface3d_demo.py)):

![Example surface plot](https://matplotlib.org/mpl_examples/mplot3d/surface3d_demo.png)

Obviously xarray would be used to automatically set the axes labels and title and so on.

As far as I can tell it wouldn't be too difficult to do, it would just be implemented as another 2D plotting method the same way as the `Dataarray.plot.imshow()`, `Dataarray.plot.contour()` etc methods currently are. It would require the imports
```python
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
```
but these would only need to be imported if this type of plot was chosen.

I would be interested in trying to add this myself, but I've never contributed to an open-source project before. Is this a reasonable thing for me to try? Can anyone see any immediate difficulties with this? Would I just need to have a go and then submit a pull request?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2235/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
594688816,MDU6SXNzdWU1OTQ2ODg4MTY=,3939,Why don't we allow indexing with keyword args via __call__?,35968931,closed,0,,,4,2020-04-05T22:44:18Z,2020-04-09T05:14:46Z,2020-04-09T05:14:46Z,MEMBER,,,,"Reading about [PEP472](https://www.python.org/dev/peps/pep-0472/), which would have allowed indexing with keyword arguments like
```python
da[x=10]
```
made me wonder: why don't we use `__call__` to get the same effect but just with curved brackets instead of square ones? i.e.
```python
da(x=10)
```
We don't currently use `__call__` on `DataArray` or `Dataset` for anything else.

I presume there is some good reason why this design decision was taken, but I'm just wondering what it is.

(Also has the [ship permanently sailed](https://mail.python.org/pipermail/python-dev/2019-March/156693.html) on PEP472 now?)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3939/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
474247717,MDU6SXNzdWU0NzQyNDc3MTc=,3168,apply_ufunc erroneously operating on an empty array when dask used,35968931,closed,0,,,3,2019-07-29T20:44:23Z,2020-03-30T15:08:16Z,2020-03-30T15:08:15Z,MEMBER,,,,"#### Problem description

`apply_ufunc` with `dask='parallelized'` appears to be trying to act on an empty numpy array when the computation is specified, but before `.compute()` is called. In other words, a ufunc which just prints the shape of its argument will print `(0,0)` then print the correct shape once `.compute()` is called.

#### Minimum working example

```python
import numpy as np
import xarray as xr


def example_ufunc(x):
    print(x.shape)
    return np.mean(x, axis=-1)

def new_mean(da, dim):
    result = xr.apply_ufunc(example_ufunc, da,
                            input_core_dims=[[dim]], dask='parallelized',
                            output_dtypes=[da.dtype])
    return result


shape = {'t': 2, 'x':3}
data = xr.DataArray(data=np.random.rand(*shape.values()), dims=shape.keys())
unchunked = data
chunked = data.chunk(shape)


actual = new_mean(chunked, dim='x')  # raises the warning
print(actual)

print(actual.compute())  # does the computation correctly
```

#### Result

```
(0, 0)
/home/tnichol/anaconda3/envs/py36/lib/python3.6/site-packages/numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
<xarray.DataArray (t: 2)>
dask.array<shape=(2,), dtype=float64, chunksize=(2,)>
Dimensions without coordinates: t
(2, 3)
<xarray.DataArray (t: 2)>
array([0.147205, 0.402913])
Dimensions without coordinates: t
```

#### Expected result

Same thing without the `(0,0)` or the numpy warning.


#### Output of ``xr.show_versions()``
(my xarray is up-to-date with master)

<details>
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6 |Anaconda, Inc.| (default, Oct  9 2018, 12:34:16) 
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-862.14.4.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
libhdf5: 1.10.2
libnetcdf: 4.6.1

xarray: 0.12.3+23.g1d7bcbd
pandas: 0.24.2
numpy: 1.16.4
scipy: 1.3.0
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.1.0
distributed: 2.1.0
matplotlib: 3.1.0
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 40.6.2
pip: 18.1
conda: None
pytest: 4.0.0
IPython: 7.1.1
sphinx: 1.8.2
</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3168/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
547523622,MDU6SXNzdWU1NDc1MjM2MjI=,3676,Merging dataArray into dataset using dataset method fails,35968931,closed,0,,,0,2020-01-09T14:46:49Z,2020-01-12T13:04:02Z,2020-01-12T13:04:02Z,MEMBER,,,,"While it's possible to merge a dataset and a dataarray object using the top-level `merge()` function, if you try the same thing with the `ds.merge()` method it fails.

```python
import xarray as xr

ds = xr.Dataset({'a': 0})
da = xr.DataArray(1, name='b')

expected = xr.merge([ds, da])  # works fine
print(expected)

ds.merge(da)  # fails
```

Output:
```
<xarray.Dataset>
Dimensions:  ()
Data variables:
    a        int64 0
    b        int64 1

Traceback (most recent call last):
  File ""mwe.py"", line 6, in <module>
    actual = ds.merge(da)
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/core/dataset.py"", line 3591, in merge
    fill_value=fill_value,
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/core/merge.py"", line 835, in dataset_merge_method
    objs, compat, join, priority_arg=priority_arg, fill_value=fill_value
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/core/merge.py"", line 548, in merge_core
    coerced = coerce_pandas_values(objects)
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/core/merge.py"", line 394, in coerce_pandas_values
    for k, v in obj.items():
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/core/common.py"", line 233, in __getattr__
    ""{!r} object has no attribute {!r}"".format(type(self).__name__, name)
AttributeError: 'DataArray' object has no attribute 'items'
```

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3676/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
497184021,MDU6SXNzdWU0OTcxODQwMjE=,3334,plot.line fails when plot axis is a 1D coordinate,35968931,closed,0,,,3,2019-09-23T15:52:48Z,2019-09-26T08:51:59Z,2019-09-26T08:51:59Z,MEMBER,,,,"#### MCVE Code Sample
<!-- In order for the maintainers to efficiently understand and prioritize issues, we ask you post a ""Minimal, Complete and Verifiable Example"" (MCVE): http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports -->

```python
import xarray as xr
import numpy as np

x_coord = xr.DataArray(data=[0.1, 0.2], dims=['x'])
t_coord = xr.DataArray(data=[10, 20], dims=['t'])

da = xr.DataArray(data=np.array([[0, 1], [5, 9]]), dims=['x', 't'],
                  coords={'x': x_coord, 'time': t_coord})
print(da)

da.transpose('time', 'x')
```
Output:
```
<xarray.DataArray (x: 2, t: 2)>
array([[0, 1],
       [5, 9]])
Coordinates:
  * x        (x) float64 0.1 0.2
    time     (t) int64 10 20

Traceback (most recent call last):
  File ""mwe.py"", line 22, in <module>
    da.transpose('time', 'x')
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/core/dataarray.py"", line 1877, in transpose
    ""permuted array dimensions (%s)"" % (dims, tuple(self.dims))
ValueError: arguments to transpose (('time', 'x')) must be permuted array dimensions (('x', 't'))
```

As `'time'` is a coordinate with only one dimension, this is an unambiguous operation that I want to perform. However, because `.transpose()` currently only accepts dimensions, this fails with that error.


This causes bug in other parts of the code - for example I found this by trying to plot this type of dataarray:
```python
da.plot(x='time', hue='x')
```
which gives the same error. 

(You can get a similar error also with `da.plot(y='time', hue='x')`.)

If the [code which explicitly checks](https://github.com/pydata/xarray/pull/2556/files?file-filters%5B%5D=.py#diff-ffd3597671590bab245b3193affa62b8R1437) that the arguments to transpose are dims and not just coordinate dimensions is removed, then both of these examples work as expected.

I would like to generalise the transpose function to also accept dimension coordinates, is there any reason not to do this?

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3334/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
324350248,MDU6SXNzdWUzMjQzNTAyNDg=,2159,Concatenate across multiple dimensions with open_mfdataset,35968931,closed,0,,,27,2018-05-18T10:10:49Z,2019-09-16T18:54:39Z,2019-06-25T15:50:33Z,MEMBER,,,,"#### Code Sample

```python
# Create 4 datasets containing sections of contiguous (x,y) data
for i, x in enumerate([1, 3]):
    for j, y in enumerate([10, 40]):
        ds = xr.Dataset({'foo': (('x', 'y'), np.ones((2, 3)))},
                         coords={'x': [x, x+1],
                                 'y': [y, y+10, y+20]})

        ds.to_netcdf('ds.' + str(i) + str(j) + '.nc')

# Try to open them all in one go
ds_read = xr.open_mfdataset('ds.*.nc')
print(ds_read)
```
#### Problem description

Currently ``xr.open_mfdataset`` will detect a single common dimension and concatenate DataSets along that dimension. However a common use case is a set of NetCDF files which have two or more common dimensions that need to be concatenated along simultaneously (for example collecting the output of any large-scale simulation which parallelizes in more than one dimension simultaneously). For the behaviour of ``xr.open_mfdataset`` to be n-dimensional it should automatically recognise and concatenate along all common dimensions.

#### Expected Output
```
<xarray.Dataset>
Dimensions:  (x: 4, y: 6)
Coordinates:
  * x        (x) int64 1 2 3 4
  * y        (y) int64 10 20 30 40 50 60
Data variables:
    foo      (x, y) float64 dask.array<shape=(4, 6), chunksize=(2, 3)>
```

#### Current output of ``xr.open_mfdataset()``
```
<xarray.Dataset>
Dimensions:  (x: 4, y: 12)
Coordinates:
  * x        (x) int64 1 2 3 4
  * y        (y) int64 10 20 30 40 50 60 10 20 30 40 50 60
Data variables:
    foo      (x, y) float64 dask.array<shape=(4, 12), chunksize=(4, 3)>
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2159/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
463096652,MDU6SXNzdWU0NjMwOTY2NTI=,3073,Accidentally left a print statement,35968931,closed,0,,,0,2019-07-02T08:38:40Z,2019-07-02T14:16:43Z,2019-07-02T14:16:43Z,MEMBER,,,,"Somehow a rogue debugging print statement managed to sneak through to master in #2616!

Line 121 of combine.py https://github.com/pydata/xarray/blob/e2c2264833ce7e861bbb930be44356e1510e13c3/xarray/core/combine.py#L121  should be deleted.

@shoyer @dcherian","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3073/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
409854736,MDU6SXNzdWU0MDk4NTQ3MzY=,2768,[Bug] Reduce fails when no axis given,35968931,closed,0,,,1,2019-02-13T15:16:45Z,2019-02-19T06:13:00Z,2019-02-19T06:12:59Z,MEMBER,,,,"`DataArray.reduce()` fails if you try to reduce using a function which doesn't accept any axis arguments.

```python
import numpy as np
from xarray import DataArray

da = DataArray(np.array([[1, 3, 3], [2, 1, 5]]))

def total_sum(data):
    return np.sum(data.flatten())

sum = da.reduce(total_sum)
print(sum)
```

This should print a dataarray with just the number 15 in it, but instead it throws the error
```
Traceback (most recent call last):
  File ""mwe.py"", line 9, in <module>
    sum = da.reduce(total_sum)
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/core/dataarray.py"", line 1605, in reduce
    var = self.variable.reduce(func, dim, axis, keep_attrs, **kwargs)
  File ""/home/tegn500/Documents/Work/Code/xarray/xarray/core/variable.py"", line 1365, in reduce
    axis=axis, **kwargs)
TypeError: total_sum() got an unexpected keyword argument 'axis'
```

This contradicts what the docstring of `.reduce()` says:
```
axis: int or sequence of int, optional
    Axis(es) over which to repeatedly apply func. Only one of the ‘dim’ and ‘axis’ 
    arguments can be supplied. If neither are supplied, then the reduction is 
    calculated over the flattened array (by calling f(x) without an axis argument).
```

The problem is that in `variable.py` an `axis=None` kwarg is always passed to func, even if no axis argument is given by the user in `reduce`.  ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2768/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
404383025,MDU6SXNzdWU0MDQzODMwMjU=,2725,Line plot with x=coord putting wrong variables on axes,35968931,closed,0,,,3,2019-01-29T16:43:18Z,2019-01-30T02:02:22Z,2019-01-30T02:02:22Z,MEMBER,,,,"When I try to plot the values in a 1D DataArray against the values in one of its coordinates, it does not behave at all as expected: 

```python
import numpy as np
import matplotlib.pyplot as plt
from xarray import DataArray

current = DataArray(name='current', data=np.array([5, 8, 14, 22, 30]), dims=['time'],
                    coords={'time': (['time'], np.array([0.1, 0.2, 0.3, 0.4, 0.5])),
                            'voltage': (['time'], np.array([100, 200, 300, 400, 500]))})

print(current)

# Try to plot current against voltage
current.plot.line(x='voltage')
plt.show()
```
Output:

```
<xarray.DataArray 'current' (time: 5)>
array([ 5,  8, 14, 22, 30])
Coordinates:
  * time     (time) float64 0.1 0.2 0.3 0.4 0.5
    voltage  (time) int64 100 200 300 400 500
```
![incorrect_current_plot](https://user-images.githubusercontent.com/35968931/51924149-683f3800-23e4-11e9-8957-81d32da43117.png)

#### Problem description

Not only is `'voltage'` not on the x axis, but `'current'` isn't on the y axis either!

#### Expected Output

Based on the documentation (and common sense) I would have expected it to plot voltage on the x axis and current on the y axis.

(using a branch of xarray which is up-to-date with master)
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2725/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
367763373,MDU6SXNzdWUzNjc3NjMzNzM=,2473,Recommended way to extend xarray Datasets using accessors?,35968931,closed,0,,,6,2018-10-08T12:19:21Z,2018-10-31T09:58:05Z,2018-10-31T09:58:05Z,MEMBER,,,,"Hi,

I'm now regularly using xarray (& dask) for organising and analysing the output of the simulation code I use ([BOUT++](https://boutproject.github.io/)) and it's very helpful, thank you!. 

However my current approach is quite clunky at dealing the extra information and functionality that's specific to the simulation code I'm using, and I have questions about what the recommended way to extend the xarray Dataset class is. This seems like a general enough problem that I thought I would make an issue for it.

### Desired

What I ideally want to do is extend the xarray.Dataset class to accommodate extra attributes and methods, while retaining as much xarray functionality as possible, but avoiding reimplementing any of the API. This might not be possible, but ideally I want to make a `BoutDataset` class which contains extra attributes to hold information about the run which doesn't naturally fit into the xarray data model, extra methods to perform analysis/plotting which only users of this code would require, but also be able to use xarray-specific methods and top-level functions:

```python
bd = BoutDataset('/path/to/data')

ds = bd.data  # access the wrapped xarray dataset
extra_data = bd.extra_data  # access the BOUT-specific data

bd.isel(time=-1)  # use xarray dataset methods

bd2 = BoutDataset('/path/to/other/data')
concatenated_bd = xr.concat([bd, bd2])  # apply top-level xarray functions to the data

bd.plot_tokamak()  # methods implementing bout-specific functionality
```

### Problems with my current approach

I have read the documentation about [extending xarray](http://xarray.pydata.org/en/stable/internals.html#extending-xarray), and the issue threads about subclassing Datasets (#706) and accessors (#1080), but I wanted to check that what I'm doing is the recommended approach.

Right now I'm [trying](https://github.com/TomNicholas/xcollect/blob/master/boutdataset.py) to do something like

```python
@xr.register_dataset_accessor('bout')
class BoutDataset:
    def __init__(self, path):
        self.data = collect_data(path)  # collect all my numerical data from output files
        self.extra_data = read_extra_data(path)  # collect extra data about the simulation 

    def plot_tokamak():
        plot_in_bout_specific_way(self.data, self.extra_data)
``` 

which works in the sense that I can do

```python
bd = BoutDataset('/path/to/data')

ds = bd.bout.data  # access the wrapped xarray dataset
extra_data = bd.bout.extra_data  # access the BOUT-specific data
bd.bout.plot_tokamak()  # methods implementing bout-specific functionality
```

but not so well with

```python
bd.isel(time=-1)  # AttributeError: 'BoutDataset' object has no attribute 'isel'
bd.bout.data.isel(time=-1)  # have to do this instead, but this returns an xr.Dataset not a BoutDataset

concatenated_bd = xr.concat([bd1, bd2])  # TypeError: can only concatenate xarray Dataset and DataArray objects, got <class 'BoutDataset'>
concatenated_ds = xr.concat([bd1.bout.data, bd2.bout.data])  # again have to do this instead, which again returns an xr.Dataset not a BoutDataset
```

If I have to reimplement the APl for methods like `.isel()` and top-level functions like `concat()`, then why should I not just subclass `xr.Dataset`? 

There aren't very many top-level xarray functions so reimplementing them would be okay, but there are loads of Dataset methods. However I think I know how I want my `BoutDataset` class to behave when an `xr.Dataset` method is called on it: I want it to implement that method on the underlying dataset and return the full BoutDatset with extra data and attributes still attached.

Is it possible to do something like:
""if calling an `xr.Dataset` method on an instance of `BoutDataset`, call the corresponding method on the wrapped dataset and return a BoutDataset that has the extra BOUT-specific data propagated through""?

Thanks in advance, apologies if this is either impossible or relatively trivial, I just thought other xarray users might have the same questions.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2473/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
354923742,MDU6SXNzdWUzNTQ5MjM3NDI=,2388,Test equality of DataArrays up to transposition,35968931,closed,0,,,2,2018-08-28T22:13:01Z,2018-10-08T12:25:46Z,2018-10-08T12:25:46Z,MEMBER,,,,"While writing some unit tests to check I had wrapped `np.gradient` correctly with `xr.apply_ufunc`, I came unstuck because my results were equivalent except for transposed dimensions. It took me a while to realise that `xarray.testing.asset_equal` considers two DataArrays equal only if their dimensions are in the same order, because intuitively that shouldn't matter in the context of xarray's data model.


A simple example to demonstrate what I mean:
```python
# Create two functionally-equivalent dataarrays
data = np.random.randn(4, 3)
da1 = xr.DataArray(data, dims=('x', 'y'))
da2 = xr.DataArray(data.T, dims=('y', 'x'))

# This test will fail
xarray.tests.assert_equal(da1, da2)
```
This test fails, with output
```
E       AssertionError: <xarray.DataArray (x: 4, y: 3)>
E       array([[ 0.761038,  0.121675,  0.443863],
E              [ 0.333674,  1.494079, -0.205158],
E              [ 0.313068, -0.854096, -2.55299 ],
E              [ 0.653619,  0.864436, -0.742165]])
E       Coordinates:
E         * x        (x) int64 5 7 9 11
E         * y        (y) int64 1 4 6
E       <xarray.DataArray (y: 3, x: 4)>
E       array([[ 0.761038,  0.333674,  0.313068,  0.653619],
E              [ 0.121675,  1.494079, -0.854096,  0.864436],
E              [ 0.443863, -0.205158, -2.55299 , -0.742165]])
E       Coordinates:
E         * x        (x) int64 5 7 9 11
E         * y        (y) int64 1 4 6
```
even though these two DataArrays are functionally-equivalent for all xarray operations you could perform with them.

It would make certain types of unit tests simpler and clearer to have a function like
```python
xarray.tests.assert_equivalent(da1, da2)
```
which would return true if one DataArray can be formed from the other by transposition.

I would have thought that a test that does this would just transpose one into the shape of the other before comparison?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2388/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue