id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
2075019328,PR_kwDOAMm_X85juCQ-,8603,Convert 360_day calendars by choosing random dates to drop or add,20629530,closed,0,,,3,2024-01-10T19:13:31Z,2024-04-16T14:53:42Z,2024-04-16T14:53:42Z,CONTRIBUTOR,,0,pydata/xarray/pulls/8603,"<!-- Feel free to remove check-list items aren't relevant to your change -->

- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

Small PR to add a new ""method"" to convert to and from 360_day calendars. The current two methods (chosen with the `align_on` keyword) will always remove or add the same day-of-year for all years of the same length. 

This new option will randomly chose the days, one for each fifth of the year (72-days period). It emulates the method of the LOCA datasets (see [web page](https://loca.ucsd.edu/loca-calendar/) and [article](https://journals.ametsoc.org/view/journals/hydr/15/6/jhm-d-14-0082_1.xml) ). February 29th is always removed/added when the source/target is a leap year.

I copied the implementation from xclim (which I wrote), [see code here](https://github.com/Ouranosinc/xclim/blob/fb29b8a8e400c7d8aaf4e1233a6b37a300126257/xclim/core/calendar.py#L112-L134) .
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8603/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1831975171,I_kwDOAMm_X85tMbkD,8039,Update assign_coords with a MultiIndex to match new Coordinates API,20629530,closed,0,,,11,2023-08-01T20:22:41Z,2023-08-29T14:23:30Z,2023-08-29T14:23:30Z,CONTRIBUTOR,,,,"### What is your issue?

A pattern we used in `xclim` (and elsewhere) seems to be broken on the master.

See MWE:
```python3
import pandas as pd
import xarray as xr

da = xr.DataArray([1] * 730, coords={""time"": xr.date_range('1900-01-01', periods=730, freq='D', calendar='noleap')})
mulind = pd.MultiIndex.from_arrays((da.time.dt.year.values, da.time.dt.dayofyear.values), names=('year', 'doy'))

# Override previous time axis with new MultiIndex
da.assign_coords(time=mulind).unstack('time')
```

Now this works ok with both the current master and the latest release. However, if we chunk `da`, the last line now fails:
```python
da.chunk(time=50).assign_coords(time=mulind).unstack('time')
```
On the master, this gives: `ValueError: unmatched keys found in indexes and variables: {'year', 'doy'}`

Full traceback:

<details>
 
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[44], line 1
----> 1 da.chunk(time=50).assign_coords(time=mulind).unstack(""time"")

File ~/Projets/xarray/xarray/core/dataarray.py:2868, in DataArray.unstack(self, dim, fill_value, sparse)
   2808 def unstack(
   2809     self,
   2810     dim: Dims = None,
   2811     fill_value: Any = dtypes.NA,
   2812     sparse: bool = False,
   2813 ) -> DataArray:
   2814     """"""
   2815     Unstack existing dimensions corresponding to MultiIndexes into
   2816     multiple new dimensions.
   (...)
   2866     DataArray.stack
   2867     """"""
-> 2868     ds = self._to_temp_dataset().unstack(dim, fill_value, sparse)
   2869     return self._from_temp_dataset(ds)

File ~/Projets/xarray/xarray/core/dataset.py:5481, in Dataset.unstack(self, dim, fill_value, sparse)
   5479 for d in dims:
   5480     if needs_full_reindex:
-> 5481         result = result._unstack_full_reindex(
   5482             d, stacked_indexes[d], fill_value, sparse
   5483         )
   5484     else:
   5485         result = result._unstack_once(d, stacked_indexes[d], fill_value, sparse)

File ~/Projets/xarray/xarray/core/dataset.py:5365, in Dataset._unstack_full_reindex(self, dim, index_and_vars, fill_value, sparse)
   5362 else:
   5363     # TODO: we may depreciate implicit re-indexing with a pandas.MultiIndex
   5364     xr_full_idx = PandasMultiIndex(full_idx, dim)
-> 5365     indexers = Indexes(
   5366         {k: xr_full_idx for k in index_vars},
   5367         xr_full_idx.create_variables(index_vars),
   5368     )
   5369     obj = self._reindex(
   5370         indexers, copy=False, fill_value=fill_value, sparse=sparse
   5371     )
   5373 for name, var in obj.variables.items():

File ~/Projets/xarray/xarray/core/indexes.py:1435, in Indexes.__init__(self, indexes, variables, index_type)
   1433 unmatched_keys = set(indexes) ^ set(variables)
   1434 if unmatched_keys:
-> 1435     raise ValueError(
   1436         f""unmatched keys found in indexes and variables: {unmatched_keys}""
   1437     )
   1439 if any(not isinstance(idx, index_type) for idx in indexes.values()):
   1440     index_type_str = f""{index_type.__module__}.{index_type.__name__}""

ValueError: unmatched keys found in indexes and variables: {'year', 'doy'}

</details>

This seems related to PR #7368.

The reason for the title of this issue is that in both versions, I now realize the `da.assign_coords(time=mulind)` prints as:
```
<xarray.DataArray (time: 730)>
dask.array<xarray-<this-array>, shape=(730,), dtype=int64, chunksize=(50,), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) object MultiIndex
```
Something's fishy, because the two ""sub"" indexes are not showing.

And indeed, with the current master, I can get this to work by doing (again changing the last line):
```python
da2 = xr.DataArray(da.data, coords=xr.Coordinates.from_pandas_multiindex(mulind, 'time'))
da2.chunk(time=50).unstack('time')
```
But it seems a bit odd to me that we need to reconstruct the DataArray to replace its coordinate with a ""MultiIndex"" one.

Thus, my questions are:

1. How does one properly _override_ a coordinate by a MultiIndex ? Is there a way to use `assign_coords` ? If not, then this issue would become a feature request.
2. Is this a regression ? Or was I just ""lucky"" before ?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8039/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1442443970,I_kwDOAMm_X85V-fLC,7275,REG: `nc_time_axis` not imported anymore,20629530,closed,0,,,1,2022-11-09T17:02:59Z,2022-11-10T21:45:28Z,2022-11-10T21:45:28Z,CONTRIBUTOR,,,,"### What happened?

With xarray 2022.11.0, plotting a DataArray with a `cftime` time axis fails. 

It fails with a matplotlib error : `TypeError: float() argument must be a string or a real number, not 'cftime._cftime.DatetimeNoLeap'`

### What did you expect to happen?

With previous versions of xarray, the `nc_time_axis` package was imported by xarray and these errors were avoided.

### Minimal Complete Verifiable Example

```Python
import xarray as xr
da = xr.DataArray(
    list(range(10)),
    dims=('time',),
    coords={'time': xr.cftime_range('1900-01-01', periods=10, calendar='noleap', freq='D')}
)
da.plot()
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

```Python
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [1], line 7
      1 import xarray as xr
      2 da = xr.DataArray(
      3     list(range(10)),
      4     dims=('time',),
      5     coords={'time': xr.cftime_range('1900-01-01', periods=10, calendar='noleap', freq='D')}
      6 )
----> 7 da.plot()

File ~/mambaforge/envs/xclim/lib/python3.10/site-packages/xarray/plot/accessor.py:46, in DataArrayPlotAccessor.__call__(self, **kwargs)
     44 @functools.wraps(dataarray_plot.plot, assigned=(""__doc__"", ""__annotations__""))
     45 def __call__(self, **kwargs) -> Any:
---> 46     return dataarray_plot.plot(self._da, **kwargs)

File ~/mambaforge/envs/xclim/lib/python3.10/site-packages/xarray/plot/dataarray_plot.py:312, in plot(darray, row, col, col_wrap, ax, hue, subplot_kws, **kwargs)
    308     plotfunc = hist
    310 kwargs[""ax""] = ax
--> 312 return plotfunc(darray, **kwargs)

File ~/mambaforge/envs/xclim/lib/python3.10/site-packages/xarray/plot/dataarray_plot.py:517, in line(darray, row, col, figsize, aspect, size, ax, hue, x, y, xincrease, yincrease, xscale, yscale, xticks, yticks, xlim, ylim, add_legend, _labels, *args, **kwargs)
    513 ylabel = label_from_attrs(yplt, extra=y_suffix)
    515 _ensure_plottable(xplt_val, yplt_val)
--> 517 primitive = ax.plot(xplt_val, yplt_val, *args, **kwargs)
    519 if _labels:
    520     if xlabel is not None:

File ~/mambaforge/envs/xclim/lib/python3.10/site-packages/matplotlib/axes/_axes.py:1664, in Axes.plot(self, scalex, scaley, data, *args, **kwargs)
   1662 lines = [*self._get_lines(*args, data=data, **kwargs)]
   1663 for line in lines:
-> 1664     self.add_line(line)
   1665 if scalex:
   1666     self._request_autoscale_view(""x"")

File ~/mambaforge/envs/xclim/lib/python3.10/site-packages/matplotlib/axes/_base.py:2340, in _AxesBase.add_line(self, line)
   2337 if line.get_clip_path() is None:
   2338     line.set_clip_path(self.patch)
-> 2340 self._update_line_limits(line)
   2341 if not line.get_label():
   2342     line.set_label(f'_child{len(self._children)}')

File ~/mambaforge/envs/xclim/lib/python3.10/site-packages/matplotlib/axes/_base.py:2363, in _AxesBase._update_line_limits(self, line)
   2359 def _update_line_limits(self, line):
   2360     """"""
   2361     Figures out the data limit of the given line, updating self.dataLim.
   2362     """"""
-> 2363     path = line.get_path()
   2364     if path.vertices.size == 0:
   2365         return

File ~/mambaforge/envs/xclim/lib/python3.10/site-packages/matplotlib/lines.py:1031, in Line2D.get_path(self)
   1029 """"""Return the `~matplotlib.path.Path` associated with this line.""""""
   1030 if self._invalidy or self._invalidx:
-> 1031     self.recache()
   1032 return self._path

File ~/mambaforge/envs/xclim/lib/python3.10/site-packages/matplotlib/lines.py:659, in Line2D.recache(self, always)
    657 if always or self._invalidx:
    658     xconv = self.convert_xunits(self._xorig)
--> 659     x = _to_unmasked_float_array(xconv).ravel()
    660 else:
    661     x = self._x

File ~/mambaforge/envs/xclim/lib/python3.10/site-packages/matplotlib/cbook/__init__.py:1369, in _to_unmasked_float_array(x)
   1367     return np.ma.asarray(x, float).filled(np.nan)
   1368 else:
-> 1369     return np.asarray(x, float)

TypeError: float() argument must be a string or a real number, not 'cftime._cftime.DatetimeNoLeap'
```


### Anything else we need to know?

I suspect #7179.

This line:
https://github.com/pydata/xarray/blob/cc7e09a3507fa342b3790b5c109e700fa12f0b17/xarray/plot/utils.py#L27
does _not_ import `nc_time_axis`. Further down, the variable gets checked and if `False` an error is raised, but if the package still is not imported if `True`.

Previously we had:
https://github.com/pydata/xarray/blob/fc9026b59d38146a21769cc2d3026a12d58af059/xarray/plot/utils.py#L27-L32
where the package is always imported.

Maybe there's a way to import `nc_time_axis` only when needed?

### Environment

<details>


INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:36:39) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 6.0.5-200.fc36.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_CA.UTF-8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1

xarray: 2022.11.0
pandas: 1.5.1
numpy: 1.23.4
scipy: 1.8.1
netCDF4: 1.6.1
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: 1.4.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: 2022.10.2
distributed: 2022.10.2
matplotlib: 3.6.2
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.10.0
cupy: None
pint: 0.20.1
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.5.1
pip: 22.3.1
conda: None
pytest: 7.2.0
IPython: 8.6.0
sphinx: 5.3.0


</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7275/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1347026292,I_kwDOAMm_X85QSf10,6946,reset_index not resetting levels of MultiIndex,20629530,closed,0,4160723,,3,2022-08-22T21:47:04Z,2022-09-27T10:35:39Z,2022-09-27T10:35:39Z,CONTRIBUTOR,,,,"### What happened?

I'm not sure my usecase is the simplest way to demonstrate the issue, but let's try anyway.

I have a DataArray with two coordinates and I stack them into a new multi-index. I want to pass the levels of that new multi-index into a function, but as dask arrays.  Turns out, it is not straightforward to chunk these variables because they act like `IndexVariable` objects and refuse to be chunked.

Thus, I reset the multi-index, drop it, but the variables still don't want to be chunked!

### What did you expect to happen?

I expected the levels to be chunkable after the sequence : stack, reset_index.

### Minimal Complete Verifiable Example

```Python
import xarray as xr
ds = xr.tutorial.open_dataset('air_temperature')

ds = ds.stack(spatial=['lon', 'lat'])
ds = ds.reset_index('spatial', drop=True)  # I don't think the drop is important here.
lon_chunked = ds.lon.chunk() # woups, doesn't do anything!

type(ds.lon.variable) # xarray.core.variable.IndexVariable  # I assumed either the stack or the reset_index would have modified this type into a normal variable.
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

_No response_

### Anything else we need to know?

Seems kinda related to the issues around `reset_index`. I thinks this is related to (but not a duplicate of) #4366.

### Environment

<details>

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.5 | packaged by conda-forge | (main, Jun 14 2022, 07:04:59) [GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.49.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: ('en_CA', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 2022.6.0
pandas: 1.4.3
numpy: 1.22.4
scipy: 1.9.0
netCDF4: 1.6.0
pydap: None
h5netcdf: None
h5py: 3.7.0
Nio: None
zarr: 2.12.0
cftime: 1.6.1
nc_time_axis: 1.4.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.5
dask: 2022.8.0
distributed: 2022.8.0
matplotlib: 3.5.2
cartopy: 0.20.3
seaborn: None
numbagg: None
fsspec: 2022.7.1
cupy: None
pint: 0.19.2
sparse: 0.13.0
flox: 0.5.9
numpy_groupies: 0.9.19
setuptools: 63.4.2
pip: 22.2.2
conda: None
pytest: None
IPython: 8.4.0
sphinx: 5.1.1

</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6946/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1235725650,I_kwDOAMm_X85Jp61S,6607,Coordinate promotion workaround broken,20629530,closed,0,4160723,,4,2022-05-13T21:20:25Z,2022-09-27T09:33:41Z,2022-09-27T09:33:41Z,CONTRIBUTOR,,,,"### What happened?

Ok so this one is a bit weird. I'm not sure this is a bug, but code that worked before doesn't anymore, so it is some sort of regression.

I have a dataset with one dimension and one coordinate along that one, but they have different names. I want to transform this so that the coordinate name becomes the dimension name so it becomes are proper dimension-coordinate (I don't know how to call it).  After renaming the dim to the coord's name, it all looks good in the repr, but the coord still is missing an `index` for that dimension (`crd.indexes` is empty, see MCVE). There was a workaround through `reset_coords` for this, but it doesn't work anymore.

Instead, the last line of the MCVE downgrades the variable, the final `lon` doesn't have coords anymore. 

### What did you expect to happen?

In the MCVE below, I show what the old ""workaround"" was. I expected `lon.indexes` to contain the indexes `lon` at the end of the procedure. 

### Minimal Complete Verifiable Example

```Python
import xarray as xr

# A dataset with a 1d variable along a dimension
ds = xr.Dataset({'lon': xr.DataArray([1, 2, 3], dims=('x',))})

# Promote to coord. This still is not a proper crd-dim (different name)
ds = ds.set_coords(['lon'])

# Rename dim:
ds = ds.rename(x='lon')

# Now do we have a proper coord-dim ? No. not yet because:
ds.indexes # is empty

# Workaround that was used up to the last release
lon = ds.lon.reset_coords(drop=True)

# Because of the missing indexes the next line fails on the master
lon - lon.diff('lon')
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [x] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

_No response_

### Anything else we need to know?

My guess is that this line is causing `reset_coords` to drop the coordinate from itself : https://github.com/pydata/xarray/blob/c34ef8a60227720724e90aa11a6266c0026a812a/xarray/core/dataarray.py#L866

It would be nice if the renaming was sufficient for the indexes to appear. 

My example is weird I know. The real use case is a script where we receive a 2d coordinate but where all lines are the same, so we take the first line and promote it to a proper coord-dim. But the current code fails on the master on the `lon - lon.diff('lon')` step that happens afterwards.

### Environment

<details>

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.12 | packaged by conda-forge | (main, Mar 24 2022, 23:22:55) 
[GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 5.13.19-2-MANJARO
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.UTF-8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2022.3.1.dev104+gc34ef8a6
pandas: 1.4.2
numpy: 1.22.2
scipy: 1.8.0
netCDF4: None
pydap: installed
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.5.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.02.1
distributed: 2022.2.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.3.0
cupy: None
pint: None
sparse: 0.13.0
setuptools: 59.8.0
pip: 22.0.3
conda: None
pytest: 7.0.1
IPython: 8.3.0
sphinx: None


</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6607/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1237552666,I_kwDOAMm_X85Jw44a,6613,Flox can't handle cftime objects,20629530,closed,0,,,2,2022-05-16T18:35:56Z,2022-06-02T23:23:20Z,2022-06-02T23:23:20Z,CONTRIBUTOR,,,,"### What happened?

I use resampling to count the number of timesteps within time periods. So the simple way is to : `da.time.resample(time='YS').count()`. With the current master, a non-standard calendar and with  `flox`installed, this fails : `flox` can't handle the cftime objects of the time coordinate.

### What did you expect to happen?

I expected the count of elements for each period to be returned.

### Minimal Complete Verifiable Example

```Python
import xarray as xr

timeNP = xr.DataArray(xr.date_range('2009-01-01', '2012-12-31', use_cftime=False), dims=('time',), name='time')

timeCF = xr.DataArray(xr.date_range('2009-01-01', '2012-12-31', use_cftime=True), dims=('time',), name='time')

timeNP.resample(time='YS').count() # works

timeCF.resample(time='YS').count() # Fails
```


### MVCE confirmation

- [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [x] Complete example — the example is self-contained, including all data and the text of any traceback.
- [x] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [x] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

```Python
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 a.resample(time='YS').count()

File ~/Python/myxarray/xarray/core/_reductions.py:5456, in DataArrayResampleReductions.count(self, dim, keep_attrs, **kwargs)
   5401 """"""
   5402 Reduce this DataArray's data by applying ``count`` along some dimension(s).
   5403 
   (...)
   5453   * time     (time) datetime64[ns] 2001-01-31 2001-04-30 2001-07-31
   5454 """"""
   5455 if flox and OPTIONS[""use_flox""] and contains_only_dask_or_numpy(self._obj):
-> 5456     return self._flox_reduce(
   5457         func=""count"",
   5458         dim=dim,
   5459         # fill_value=fill_value,
   5460         keep_attrs=keep_attrs,
   5461         **kwargs,
   5462     )
   5463 else:
   5464     return self.reduce(
   5465         duck_array_ops.count,
   5466         dim=dim,
   5467         keep_attrs=keep_attrs,
   5468         **kwargs,
   5469     )

File ~/Python/myxarray/xarray/core/resample.py:44, in Resample._flox_reduce(self, dim, **kwargs)
     41 labels = np.repeat(self._unique_coord.data, repeats)
     42 group = DataArray(labels, dims=(self._group_dim,), name=self._unique_coord.name)
---> 44 result = super()._flox_reduce(dim=dim, group=group, **kwargs)
     45 result = self._maybe_restore_empty_groups(result)
     46 result = result.rename({RESAMPLE_DIM: self._group_dim})

File ~/Python/myxarray/xarray/core/groupby.py:661, in GroupBy._flox_reduce(self, dim, **kwargs)
    658     expected_groups = (self._unique_coord.values,)
    659     isbin = False
--> 661 result = xarray_reduce(
    662     self._original_obj.drop_vars(non_numeric),
    663     group,
    664     dim=dim,
    665     expected_groups=expected_groups,
    666     isbin=isbin,
    667     **kwargs,
    668 )
    670 # Ignore error when the groupby reduction is effectively
    671 # a reduction of the underlying dataset
    672 result = result.drop_vars(unindexed_dims, errors=""ignore"")

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/xarray.py:308, in xarray_reduce(obj, func, expected_groups, isbin, sort, dim, split_out, fill_value, method, engine, keep_attrs, skipna, min_count, reindex, *by, **finalize_kwargs)
    305 input_core_dims = _get_input_core_dims(group_names, dim, ds, grouper_dims)
    306 input_core_dims += [input_core_dims[-1]] * (len(by) - 1)
--> 308 actual = xr.apply_ufunc(
    309     wrapper,
    310     ds.drop_vars(tuple(missing_dim)).transpose(..., *grouper_dims),
    311     *by,
    312     input_core_dims=input_core_dims,
    313     # for xarray's test_groupby_duplicate_coordinate_labels
    314     exclude_dims=set(dim),
    315     output_core_dims=[group_names],
    316     dask=""allowed"",
    317     dask_gufunc_kwargs=dict(output_sizes=group_sizes),
    318     keep_attrs=keep_attrs,
    319     kwargs={
    320         ""func"": func,
    321         ""axis"": axis,
    322         ""sort"": sort,
    323         ""split_out"": split_out,
    324         ""fill_value"": fill_value,
    325         ""method"": method,
    326         ""min_count"": min_count,
    327         ""skipna"": skipna,
    328         ""engine"": engine,
    329         ""reindex"": reindex,
    330         ""expected_groups"": tuple(expected_groups),
    331         ""isbin"": isbin,
    332         ""finalize_kwargs"": finalize_kwargs,
    333     },
    334 )
    336 # restore non-dim coord variables without the core dimension
    337 # TODO: shouldn't apply_ufunc handle this?
    338 for var in set(ds.variables) - set(ds.dims):

File ~/Python/myxarray/xarray/core/computation.py:1170, in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, meta, dask_gufunc_kwargs, *args)
   1168 # feed datasets apply_variable_ufunc through apply_dataset_vfunc
   1169 elif any(is_dict_like(a) for a in args):
-> 1170     return apply_dataset_vfunc(
   1171         variables_vfunc,
   1172         *args,
   1173         signature=signature,
   1174         join=join,
   1175         exclude_dims=exclude_dims,
   1176         dataset_join=dataset_join,
   1177         fill_value=dataset_fill_value,
   1178         keep_attrs=keep_attrs,
   1179     )
   1180 # feed DataArray apply_variable_ufunc through apply_dataarray_vfunc
   1181 elif any(isinstance(a, DataArray) for a in args):

File ~/Python/myxarray/xarray/core/computation.py:460, in apply_dataset_vfunc(func, signature, join, dataset_join, fill_value, exclude_dims, keep_attrs, *args)
    455 list_of_coords, list_of_indexes = build_output_coords_and_indexes(
    456     args, signature, exclude_dims, combine_attrs=keep_attrs
    457 )
    458 args = [getattr(arg, ""data_vars"", arg) for arg in args]
--> 460 result_vars = apply_dict_of_variables_vfunc(
    461     func, *args, signature=signature, join=dataset_join, fill_value=fill_value
    462 )
    464 if signature.num_outputs > 1:
    465     out = tuple(
    466         _fast_dataset(*args)
    467         for args in zip(result_vars, list_of_coords, list_of_indexes)
    468     )

File ~/Python/myxarray/xarray/core/computation.py:402, in apply_dict_of_variables_vfunc(func, signature, join, fill_value, *args)
    400 result_vars = {}
    401 for name, variable_args in zip(names, grouped_by_name):
--> 402     result_vars[name] = func(*variable_args)
    404 if signature.num_outputs > 1:
    405     return _unpack_dict_tuples(result_vars, signature.num_outputs)

File ~/Python/myxarray/xarray/core/computation.py:750, in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, vectorize, keep_attrs, dask_gufunc_kwargs, *args)
    745     if vectorize:
    746         func = _vectorize(
    747             func, signature, output_dtypes=output_dtypes, exclude_dims=exclude_dims
    748         )
--> 750 result_data = func(*input_data)
    752 if signature.num_outputs == 1:
    753     result_data = (result_data,)

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/xarray.py:291, in xarray_reduce.<locals>.wrapper(array, func, skipna, *by, **kwargs)
    288     if ""nan"" not in func and func not in [""all"", ""any"", ""count""]:
    289         func = f""nan{func}""
--> 291 result, *groups = groupby_reduce(array, *by, func=func, **kwargs)
    292 return result

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/core.py:1553, in groupby_reduce(array, func, expected_groups, sort, isbin, axis, fill_value, min_count, split_out, method, engine, reindex, finalize_kwargs, *by)
   1550 agg = _initialize_aggregation(func, array.dtype, fill_value, min_count, finalize_kwargs)
   1552 if not has_dask:
-> 1553     results = _reduce_blockwise(
   1554         array, by, agg, expected_groups=expected_groups, reindex=reindex, **kwargs
   1555     )
   1556     groups = (results[""groups""],)
   1557     result = results[agg.name]

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/core.py:1008, in _reduce_blockwise(array, by, agg, axis, expected_groups, fill_value, engine, sort, reindex)
   1005     finalize_kwargs = (finalize_kwargs,)
   1006 finalize_kwargs = finalize_kwargs + ({},) + ({},)
-> 1008 results = chunk_reduce(
   1009     array,
   1010     by,
   1011     func=agg.numpy,
   1012     axis=axis,
   1013     expected_groups=expected_groups,
   1014     # This fill_value should only apply to groups that only contain NaN observations
   1015     # BUT there is funkiness when axis is a subset of all possible values
   1016     # (see below)
   1017     fill_value=agg.fill_value[""numpy""],
   1018     dtype=agg.dtype[""numpy""],
   1019     kwargs=finalize_kwargs,
   1020     engine=engine,
   1021     sort=sort,
   1022     reindex=reindex,
   1023 )  # type: ignore
   1025 if _is_arg_reduction(agg):
   1026     results[""intermediates""][0] = np.unravel_index(results[""intermediates""][0], array.shape)[-1]

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/core.py:677, in chunk_reduce(array, by, func, expected_groups, axis, fill_value, dtype, reindex, engine, kwargs, sort)
    675     result = reduction(group_idx, array, **kwargs)
    676 else:
--> 677     result = generic_aggregate(
    678         group_idx, array, axis=-1, engine=engine, func=reduction, **kwargs
    679     ).astype(dt, copy=False)
    680 if np.any(props.nanmask):
    681     # remove NaN group label which should be last
    682     result = result[..., :-1]

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/aggregations.py:49, in generic_aggregate(group_idx, array, engine, func, axis, size, fill_value, dtype, **kwargs)
     44 else:
     45     raise ValueError(
     46         f""Expected engine to be one of ['flox', 'numpy', 'numba']. Received {engine} instead.""
     47     )
---> 49 return method(
     50     group_idx, array, axis=axis, size=size, fill_value=fill_value, dtype=dtype, **kwargs
     51 )

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/aggregate_flox.py:86, in nanlen(group_idx, array, *args, **kwargs)
     85 def nanlen(group_idx, array, *args, **kwargs):
---> 86     return sum(group_idx, (~np.isnan(array)).astype(int), *args, **kwargs)

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
```


### Anything else we need to know?

I was able to resolve this by modifying `xarray.core.utils.contains_only_dask_or_numpy` as to return False if the input's dtype is 'O'.  This check seems to only be used when choosing between `flox` and the old algos. Does this make sense?

### Environment

<details>

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) 
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.17.5-arch1-2
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 2022.3.1.dev16+g3ead17ea
pandas: 1.4.2
numpy: 1.21.6
scipy: 1.7.1
netCDF4: 1.5.7
pydap: None
h5netcdf: 0.11.0
h5py: 3.4.0
Nio: None
zarr: 2.10.0
cftime: 1.5.0
nc_time_axis: 1.3.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2022.04.1
distributed: 2022.4.1
matplotlib: 3.4.3
cartopy: None
seaborn: None
numbagg: None
fsspec: 2021.07.0
cupy: None
pint: 0.18
sparse: None
flox: 0.5.1
numpy_groupies: 0.9.16
setuptools: 57.4.0
pip: 21.2.4
conda: None
pytest: 6.2.5
IPython: 8.2.0
sphinx: 4.1.2


</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6613/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1242388766,I_kwDOAMm_X85KDVke,6623,Cftime arrays not supported by polyval,20629530,closed,0,,,1,2022-05-19T22:19:14Z,2022-05-31T17:16:04Z,2022-05-31T17:16:04Z,CONTRIBUTOR,,,,"### What happened?

I was trying to use polyval with a cftime coordinate and it failed with `TypeError: unsupported operand type(s) for *: 'float' and 'cftime._cftime.DatetimeNoLeap'`. The error seems to originate from #6548, where the process transforming coordinates to numerical values was modified. The new `_ensure_numeric` method seems to ignore the possibility of `cftime` arrays.

### What did you expect to happen?

A polynomial to be evaluated along my coordinate.

### Minimal Complete Verifiable Example

```Python
import xarray as xr
import numpy as np

# use_cftime=False will work
t = xr.date_range('2001-01-01', periods=100, use_cftime=True, freq='YS')
da = xr.DataArray(np.arange(100) ** 3, dims=('time',), coords={'time': t})
coeffs = da.polyfit('time', 4)
da2 = xr.polyval(da.time, coeffs).polyfit_coefficients
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

```Python
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [5], in <cell line: 4>()
      2 da = xr.DataArray(np.arange(100) ** 3, dims=('time',), coords={'time': t})
      3 coeffs = da.polyfit('time', 4)
----> 4 da2 = xr.polyval(da.time, coeffs).polyfit_coefficients

File ~/Python/xarray/xarray/core/computation.py:1931, in polyval(coord, coeffs, degree_dim)
   1929 res = zeros_like(coord) + coeffs.isel({degree_dim: max_deg}, drop=True)
   1930 for deg in range(max_deg - 1, -1, -1):
-> 1931     res *= coord
   1932     res += coeffs.isel({degree_dim: deg}, drop=True)
   1934 return res

File ~/Python/xarray/xarray/core/_typed_ops.py:103, in DatasetOpsMixin.__imul__(self, other)
    102 def __imul__(self, other):
--> 103     return self._inplace_binary_op(other, operator.imul)

File ~/Python/xarray/xarray/core/dataset.py:6107, in Dataset._inplace_binary_op(self, other, f)
   6105     other = other.reindex_like(self, copy=False)
   6106 g = ops.inplace_to_noninplace_op(f)
-> 6107 ds = self._calculate_binary_op(g, other, inplace=True)
   6108 self._replace_with_new_dims(
   6109     ds._variables,
   6110     ds._coord_names,
   (...)
   6113     inplace=True,
   6114 )
   6115 return self

File ~/Python/xarray/xarray/core/dataset.py:6154, in Dataset._calculate_binary_op(self, f, other, join, inplace)
   6152 else:
   6153     other_variable = getattr(other, ""variable"", other)
-> 6154     new_vars = {k: f(self.variables[k], other_variable) for k in self.data_vars}
   6155 ds._variables.update(new_vars)
   6156 ds._dims = calculate_dimensions(ds._variables)

File ~/Python/xarray/xarray/core/dataset.py:6154, in <dictcomp>(.0)
   6152 else:
   6153     other_variable = getattr(other, ""variable"", other)
-> 6154     new_vars = {k: f(self.variables[k], other_variable) for k in self.data_vars}
   6155 ds._variables.update(new_vars)
   6156 ds._dims = calculate_dimensions(ds._variables)

File ~/Python/xarray/xarray/core/_typed_ops.py:402, in VariableOpsMixin.__mul__(self, other)
    401 def __mul__(self, other):
--> 402     return self._binary_op(other, operator.mul)

File ~/Python/xarray/xarray/core/variable.py:2494, in Variable._binary_op(self, other, f, reflexive)
   2491 attrs = self._attrs if keep_attrs else None
   2492 with np.errstate(all=""ignore""):
   2493     new_data = (
-> 2494         f(self_data, other_data) if not reflexive else f(other_data, self_data)
   2495     )
   2496 result = Variable(dims, new_data, attrs=attrs)
   2497 return result

TypeError: unsupported operand type(s) for *: 'float' and 'cftime._cftime.DatetimeGregorian'
```


### Anything else we need to know?

I also noticed that since the Horner PR, `polyfit` and `polyval` do not use the same function to convert coordinates into numerical values. Isn't this dangerous?

### Environment

<details>

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:38:57) [GCC 10.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1160.49.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: ('en_CA', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.8.1

xarray: 2022.3.1.dev267+gd711d58
pandas: 1.4.2
numpy: 1.21.6
scipy: 1.8.0
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: 3.6.0
Nio: None
zarr: 2.11.3
cftime: 1.6.0
nc_time_axis: 1.4.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.4
dask: 2022.04.1
distributed: 2022.4.1
matplotlib: 3.5.1
cartopy: 0.20.2
seaborn: None
numbagg: None
fsspec: 2022.3.0
cupy: None
pint: 0.19.2
sparse: 0.13.0
flox: 0.5.0
numpy_groupies: 0.9.15
setuptools: 62.1.0
pip: 22.0.4
conda: None
pytest: None
IPython: 8.2.0
sphinx: 4.5.0

</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6623/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1237587122,I_kwDOAMm_X85JxBSy,6615,Flox grouping does not cast bool to int in summation,20629530,closed,0,,,0,2022-05-16T19:06:45Z,2022-05-17T02:24:32Z,2022-05-17T02:24:32Z,CONTRIBUTOR,,,,"### What happened?

In my codes I used the implicit cast from bool to int that xarray/numpy perform for certain operations. This is the case for `sum`. A resampling sum on a boolean array actually returns the number of True values and not the OR of all values.

However, when flox is activated, it does return the OR of all values. Digging a bit, I see that the flox aggregation uses `np.add` and not `np.sum`. So, this may in fact be an issue for flox?   It felt the xarray devs should know about this potential regression anyway. 

### What did you expect to happen?

I expected a sum of boolean to actually be the count of True values.

### Minimal Complete Verifiable Example

```Python
import xarray as xr 

ds = xr.tutorial.open_dataset(""air_temperature"")

# Count the monthly number of 6-hour periods with tas over 300K
with xr.set_options(use_flox=False):
    # this works as expected
    outOLD = (ds.air > 300).resample(time='MS').sum()

with xr.set_options(use_flox=True):
    # this doesn't fail, but return True or False :
    #  the OR and not the expected sum.
    outFLOX = (ds.air > 300).resample(time='MS').sum()
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

_No response_

### Anything else we need to know?

I wrote a quick test for basic operations and `sum` seems the only really problematic one. `prod` does return a different dtype, but the values are not impacted.

```
for op in ['any', 'all', 'count', 'sum', 'prod', 'mean', 'var', 'std', 'max', 'min']:
    with xr.set_options(use_flox=False):
        outO = getattr((ds.air > 300).resample(time='YS'), op)()
    with xr.set_options(use_flox=True):
        outF = getattr((ds.air > 300).resample(time='YS'), op)()
    print(op, outO.dtype, outF.dtype, outO.equals(outF)))
```
returns
```
any bool bool True
all bool bool True
count int64 int64 True
sum int64 bool False
prod int64 bool True
mean float64 float64 True
var float64 float64 True
std float64 float64 True
max bool bool True
min bool bool True
```

### Environment

<details>

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) 
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.17.5-arch1-2
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 2022.3.1.dev16+g3ead17ea
pandas: 1.4.2
numpy: 1.21.6
scipy: 1.7.1
netCDF4: 1.5.7
pydap: None
h5netcdf: 0.11.0
h5py: 3.4.0
Nio: None
zarr: 2.10.0
cftime: 1.5.0
nc_time_axis: 1.3.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2022.04.1
distributed: 2022.4.1
matplotlib: 3.4.3
cartopy: None
seaborn: None
numbagg: None
fsspec: 2021.07.0
cupy: None
pint: 0.18
sparse: None
flox: 0.5.1
numpy_groupies: 0.9.16
setuptools: 57.4.0
pip: 21.2.4
conda: None
pytest: 6.2.5
IPython: 8.2.0
sphinx: 4.1.2


</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6615/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1175454678,I_kwDOAMm_X85GEAPW,6393, DataArray groupby returning Dataset broken in some cases,20629530,closed,0,,,1,2022-03-21T14:17:25Z,2022-03-21T15:26:20Z,2022-03-21T15:26:20Z,CONTRIBUTOR,,,,"### What happened?

This is a the reverse problem of #6379, the `DataArrayGroupBy._combine` method seems broken when the mapped function returns a Dataset (which worked before #5692).


### What did you expect to happen?

_No response_

### Minimal Complete Verifiable Example

```Python
import xarray as xr

ds = xr.tutorial.open_dataset(""air_temperature"")

ds.air.resample(time=""YS"").map(lambda grp: grp.mean(""time"").to_dataset())
```


### Relevant log output

```Python
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [3], in <module>
----> 1 ds.air.resample(time=""YS"").map(lambda grp: grp.mean(""time"").to_dataset())

File ~/Python/myxarray/xarray/core/resample.py:223, in DataArrayResample.map(self, func, shortcut, args, **kwargs)
    180 """"""Apply a function to each array in the group and concatenate them
    181 together into a new array.
    182 
   (...)
    219     The result of splitting, applying and combining this array.
    220 """"""
    221 # TODO: the argument order for Resample doesn't match that for its parent,
    222 # GroupBy
--> 223 combined = super().map(func, shortcut=shortcut, args=args, **kwargs)
    225 # If the aggregation function didn't drop the original resampling
    226 # dimension, then we need to do so before we can rename the proxy
    227 # dimension we used.
    228 if self._dim in combined.coords:

File ~/Python/myxarray/xarray/core/groupby.py:835, in DataArrayGroupByBase.map(self, func, shortcut, args, **kwargs)
    833 grouped = self._iter_grouped_shortcut() if shortcut else self._iter_grouped()
    834 applied = (maybe_wrap_array(arr, func(arr, *args, **kwargs)) for arr in grouped)
--> 835 return self._combine(applied, shortcut=shortcut)

File ~/Python/myxarray/xarray/core/groupby.py:869, in DataArrayGroupByBase._combine(self, applied, shortcut)
    867     index, index_vars = create_default_index_implicit(coord)
    868     indexes = {k: index for k in index_vars}
--> 869     combined = combined._overwrite_indexes(indexes, coords=index_vars)
    870 combined = self._maybe_restore_empty_groups(combined)
    871 combined = self._maybe_unstack(combined)

TypeError: _overwrite_indexes() got an unexpected keyword argument 'coords'
```


### Anything else we need to know?

I guess the same solution as #6386 could be used!

### Environment

<details>

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) 
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.16.13-arch1-1
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 2022.3.1.dev16+g3ead17ea
pandas: 1.4.0
numpy: 1.20.3
scipy: 1.7.1
netCDF4: 1.5.7
pydap: None
h5netcdf: 0.11.0
h5py: 3.4.0
Nio: None
zarr: 2.10.0
cftime: 1.5.0
nc_time_axis: 1.3.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.08.0
distributed: 2021.08.0
matplotlib: 3.4.3
cartopy: None
seaborn: None
numbagg: None
fsspec: 2021.07.0
cupy: None
pint: 0.18
sparse: None
setuptools: 57.4.0
pip: 21.2.4
conda: None
pytest: 6.2.5
IPython: 8.0.1
sphinx: 4.1.2

</details>","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6393/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1173980959,I_kwDOAMm_X85F-Ycf,6379,Dataset groupby returning DataArray broken in some cases,20629530,closed,0,,,1,2022-03-18T20:07:37Z,2022-03-20T18:55:26Z,2022-03-20T18:55:26Z,CONTRIBUTOR,,,,"### What happened?

Got a TypeError when resampling a dataset along a dimension, mapping a function to each group. The function returns a DataArray.

Failed with : `TypeError: _overwrite_indexes() got an unexpected keyword argument 'variables' `

### What did you expect to happen?

This worked before the merging of #5692. A DataArray was returned as expected.

### Minimal Complete Verifiable Example

```Python
import xarray as xr

ds = xr.tutorial.open_dataset(""air_temperature"")

ds.resample(time=""YS"").map(lambda grp: grp.air.mean(""time""))
```


### Relevant log output

```Python
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [37], in <module>
----> 1 ds.resample(time=""YS"").map(lambda grp: grp.air.mean(""time""))

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/xarray/core/resample.py:300, in DatasetResample.map(self, func, args, shortcut, **kwargs)
    298 # ignore shortcut if set (for now)
    299 applied = (func(ds, *args, **kwargs) for ds in self._iter_grouped())
--> 300 combined = self._combine(applied)
    302 return combined.rename({self._resample_dim: self._dim})

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/xarray/core/groupby.py:999, in DatasetGroupByBase._combine(self, applied)
    997     index, index_vars = create_default_index_implicit(coord)
    998     indexes = {k: index for k in index_vars}
--> 999     combined = combined._overwrite_indexes(indexes, variables=index_vars)
   1000 combined = self._maybe_restore_empty_groups(combined)
   1001 combined = self._maybe_unstack(combined)

TypeError: _overwrite_indexes() got an unexpected keyword argument 'variables'
```

### Anything else we need to know?

In the docstring of `DatasetGroupBy.map` it is not made clear that the passed function should return a dataset, but the opposite is also not said. This worked before and I think the issues comes from #5692, which introduced different signatures for `DataArray._overwrite_indexes` (which is called in my case) and `Dataset._overwrite_indexes` (which is expected by the new `_combine`).

If the function passed to `Dataset.resample(...).map` should only return `Dataset`s then I believe a more explicit error is needed, as well as some notice in the docs and a breaking change entry in the changelog. If `DataArray`s should be accepted, then we have a regression here.

I may have time to help on this.

### Environment

<details>

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) 
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.16.13-arch1-1
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 2022.3.1.dev16+g3ead17ea
pandas: 1.4.0
numpy: 1.20.3
scipy: 1.7.1
netCDF4: 1.5.7
pydap: None
h5netcdf: 0.11.0
h5py: 3.4.0
Nio: None
zarr: 2.10.0
cftime: 1.5.0
nc_time_axis: 1.3.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.08.0
distributed: 2021.08.0
matplotlib: 3.4.3
cartopy: None
seaborn: None
numbagg: None
fsspec: 2021.07.0
cupy: None
pint: 0.18
sparse: None
setuptools: 57.4.0
pip: 21.2.4
conda: None
pytest: 6.2.5
IPython: 8.0.1
sphinx: 4.1.2

</details>","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6379/reactions"", ""total_count"": 2, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 1}",,completed,13221727,issue
1173997225,I_kwDOAMm_X85F-cap,6380,Attributes of concatenation coordinate are dropped,20629530,closed,0,,,1,2022-03-18T20:31:17Z,2022-03-20T18:53:46Z,2022-03-20T18:53:46Z,CONTRIBUTOR,,,,"### What happened?

When concatenating two objects with `xr.concat` along a new dimension given through a `DataArray`, the attributes of this given coordinate are lost in the concatenation.

### What did you expect to happen?

I expected the concatenation coordinate to be identical to the 1D DataArray I gave to `concat`.

### Minimal Complete Verifiable Example

```Python
import xarray as xr

ds = xr.tutorial.open_dataset(""air_temperature"")

concat_dim = xr.DataArray([1, 2], dims=(""condim"",), attrs={""an_attr"": ""yep""}, name=""condim"")

out = xr.concat([ds, ds], concat_dim)
out.condim.attrs
```

Before #5692, I get:
```
{'an_attr': 'yep'}
```
with the current master, I get:
```
{}
```

### Anything else we need to know?

I'm not 100% sure, but I think the change is due to `xr.core.concat._calc_concat_dim_coord` being replaced by `xr.core.concat.__calc_concat_dim_index`. The former didn't touch the concatenation coordinate, while the latter casts it as an index, thus dropping the attributes in the process.

If the solution is to add a check in `xr.concat`, I may have time to implement something simple.

### Environment

<details>

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) 
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.16.13-arch1-1
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 2022.3.1.dev16+g3ead17ea
pandas: 1.4.0
numpy: 1.20.3
scipy: 1.7.1
netCDF4: 1.5.7
pydap: None
h5netcdf: 0.11.0
h5py: 3.4.0
Nio: None
zarr: 2.10.0
cftime: 1.5.0
nc_time_axis: 1.3.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.08.0
distributed: 2021.08.0
matplotlib: 3.4.3
cartopy: None
seaborn: None
numbagg: None
fsspec: 2021.07.0
cupy: None
pint: 0.18
sparse: None
setuptools: 57.4.0
pip: 21.2.4
conda: None
pytest: 6.2.5
IPython: 8.0.1
sphinx: 4.1.2

</details>","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6380/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
870312451,MDExOlB1bGxSZXF1ZXN0NjI1NTMwMDQ2,5233,Calendar utilities,20629530,closed,0,,,16,2021-04-28T20:01:33Z,2021-12-30T22:54:49Z,2021-12-30T22:54:11Z,CONTRIBUTOR,,0,pydata/xarray/pulls/5233,"<!-- Feel free to remove check-list items aren't relevant to your change -->

- [x] Closes #5155 
- [x] Tests added
- [x] Passes `pre-commit run --all-files`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [x] New functions/methods are listed in `api.rst`

So:

- Added `coding.cftime_offsets.date_range` and `coding.cftime_offsets.date_range_like`
  The first simply swtiches between `pd.date_range` and `xarray.cftime_range` according to the arguments. The second infers start, end and freq from an existing datetime array and returns a similar range in another calendar.

- Added `coding/calendar_ops.py` with `convert_calendar` and `interp_calendar`
  Didn't know where to put them, so there they are.

- Added `DataArray.dt.calendar`.
  When the datetime objects are backed by numpy, it always return `""proleptic_gregorian""`.

I'm not sure where to expose the function. Should the range-generators be accessible directly like `xr.date_range`?

The `convert_calendar` and `interp_calendar` could be implemented as methods of `DataArray` and   `Dataset`,  should I do that?
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5233/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
857947050,MDU6SXNzdWU4NTc5NDcwNTA=,5155,Calendar utilities,20629530,closed,0,,,9,2021-04-14T14:18:48Z,2021-12-30T22:54:11Z,2021-12-30T22:54:11Z,CONTRIBUTOR,,,,"**Is your feature request related to a problem? Please describe.**
Handling cftime and numpy time coordinates can sometimes be exhausting. Here I am thinking of the following common problems:

1. Querying the calendar type from a time coordinate.
2. Converting a _dataset_ from a calendar type to another.
3. Generating a time coordinate in the correct calendar. 

**Describe the solution you'd like**

1. `ds.time.dt.calendar` would be magic.
2.  `xr.convert_calendar(ds, ""new_cal"")` could be nice?
3. `xr.date_range(start, stop, calendar=cal)`, same as pandas' (see context below).

**Describe alternatives you've considered**
We have implemented all this in (xclim)[https://xclim.readthedocs.io/en/stable/api.html#calendar-handling-utilities] (and more). But it seems to make sense that some of the simplest things there could move to xarray? We had this discussion in xarray-contrib/cf-xarray#193  and suggestion was made to see what fits here before implementing this there.

**Additional context**
At xclim, to differentiate numpy datetime64 from cftime types, we call the former ""default"". This way a time coordinate using cftime's ""proleptic_gregorian"" calendar is distinct from one using numpy's datetime64.

1. is easy ([xclim function](https://xclim.readthedocs.io/en/stable/api.html#xclim.core.calendar.get_calendar)). If the datatype is numpy return ""default"", if cftime, look into the first non-null value and get the calendar.
2. [xclim function](https://xclim.readthedocs.io/en/stable/api.html#xclim.core.calendar.convert_calendar) The calendar type of each time element is transformed to the new calendar. Our way is to _drop_ any dates that do not exist in the new calendar (like Feb 29th when going to ""noleap""). In the other direction, there is an option to either fill with some fill value of simply _not_ include them. It can't be a DataArray method, but could be a Dataset one, or simply a top-level function.  Related to #5107.

We also have an [`interp_calendar`](https://xclim.readthedocs.io/en/stable/api.html#xclim.core.calendar.interp_calendar) function that reinterps data on a yearly basis. This is a bit narrower, because it only makes sense on daily data (or coarser).

3. With the definition of a ""default"" calendar, [`date_range`](https://xclim.readthedocs.io/en/stable/api.html#xclim.core.calendar.date_range) and `date_range_like` simply chose between `pd.date_range` and `xr.cftime_range` according to the target calendar.


What do you think? I have time to move whatever code makes sense to move.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5155/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
895842334,MDU6SXNzdWU4OTU4NDIzMzQ=,5346,Fast-track unstack doesn't work with dask,20629530,closed,0,,,6,2021-05-19T20:14:26Z,2021-05-26T07:07:17Z,2021-05-26T07:07:17Z,CONTRIBUTOR,,,,"<!-- Please include a self-contained copy-pastable example that generates the issue if possible.

Please be concise with code posted. See guidelines below on how to provide a good bug report:

- Craft Minimal Bug Reports: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports
- Minimal Complete Verifiable Examples: https://stackoverflow.com/help/mcve

Bug reports that follow these guidelines are easier to diagnose, and so are often handled much more quickly.
-->

**What happened**:
Using `unstack` on data with the dask backend fails with a dask error.

**What you expected to happen**:
No failure, as with xarray 0.18.0 and earlier.

**Minimal Complete Verifiable Example**:

```python
import pandas as pd
import xarray as xr

da = xr.DataArray([1] * 4, dims=('x',), coords={'x': [1, 2, 3, 4]})
dac = da.chunk()

ind = pd.MultiIndex.from_arrays(([0, 0, 1, 1], [0, 1, 0, 1]), names=(""y"", ""z""))
dac.assign_coords(x=ind).unstack(""x"")"")
```
Fails with:
```python
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-4-3c317738ec05> in <module>
      3 
      4 ind = pd.MultiIndex.from_arrays(([0, 0, 1, 1], [0, 1, 0, 1]), names=(""y"", ""z""))
----> 5 dac.assign_coords(x=ind).unstack(""x"")

~/Python/myxarray/xarray/core/dataarray.py in unstack(self, dim, fill_value, sparse)
   2133         DataArray.stack
   2134         """"""
-> 2135         ds = self._to_temp_dataset().unstack(dim, fill_value, sparse)
   2136         return self._from_temp_dataset(ds)
   2137 

~/Python/myxarray/xarray/core/dataset.py in unstack(self, dim, fill_value, sparse)
   4038             ):
   4039                 # Fast unstacking path:
-> 4040                 result = result._unstack_once(dim, fill_value)
   4041             else:
   4042                 # Slower unstacking path, examples of array types that

~/Python/myxarray/xarray/core/dataset.py in _unstack_once(self, dim, fill_value)
   3914                         fill_value_ = fill_value
   3915 
-> 3916                     variables[name] = var._unstack_once(
   3917                         index=index, dim=dim, fill_value=fill_value_
   3918                     )

~/Python/myxarray/xarray/core/variable.py in _unstack_once(self, index, dim, fill_value)
   1605         # sparse doesn't support item assigment,
   1606         # https://github.com/pydata/sparse/issues/114
-> 1607         data[(..., *indexer)] = reordered
   1608 
   1609         return self._replace(dims=new_dims, data=data)

~/.conda/envs/xxx/lib/python3.8/site-packages/dask/array/core.py in __setitem__(self, key, value)
   1693 
   1694         out = ""setitem-"" + tokenize(self, key, value)
-> 1695         dsk = setitem_array(out, self, key, value)
   1696 
   1697         graph = HighLevelGraph.from_collections(out, dsk, dependencies=[self])

~/.conda/envs/xxx/lib/python3.8/site-packages/dask/array/slicing.py in setitem_array(out_name, array, indices, value)
   1787 
   1788     # Reformat input indices
-> 1789     indices, indices_shape, reverse = parse_assignment_indices(indices, array_shape)
   1790 
   1791     # Empty slices can only be assigned size 1 values

~/.conda/envs/xxx/lib/python3.8/site-packages/dask/array/slicing.py in parse_assignment_indices(indices, shape)
   1476             n_lists += 1
   1477             if n_lists > 1:
-> 1478                 raise NotImplementedError(
   1479                     ""dask is currently limited to at most one ""
   1480                     ""dimension's assignment index being a ""

NotImplementedError: dask is currently limited to at most one dimension's assignment index being a 1-d array of integers or booleans. Got: (Ellipsis, array([0, 0, 1, 1], dtype=int8), array([0, 1, 0, 1], dtype=int8))
```
The example works when I go back to xarray 0.18.0.

**Anything else we need to know?**:
I saw no tests in ""test_daraarray.py"" and ""test_dataset.py"" for unstack+dask, but they might be elsewhere?
If #5315 was successful, maybe there is something specific in my example and config that is causing the error? @max-sixty @Illviljan 

Proposed test, for ""test_dataset.py"", adapted copy of `test_unstack`:
```python
    @requires_dask
    def test_unstack_dask(self):
        index = pd.MultiIndex.from_product([[0, 1], [""a"", ""b""]], names=[""x"", ""y""])
        ds = Dataset({""b"": (""z"", [0, 1, 2, 3]), ""z"": index}).chunk()
        expected = Dataset(
            {""b"": ((""x"", ""y""), [[0, 1], [2, 3]]), ""x"": [0, 1], ""y"": [""a"", ""b""]}
        )
        for dim in [""z"", [""z""], None]:
            actual = ds.unstack(dim).load()
            assert_identical(actual, expected)
```

**Environment**:

<details><summary>Output of <tt>xr.show_versions()</tt></summary>

<!-- Paste the output here xr.show_versions() here -->

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27) 
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.11.16-arch1-1
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.18.2.dev2+g6d2a7301
pandas: 1.2.4
numpy: 1.20.2
scipy: 1.6.3
netCDF4: 1.5.6
pydap: installed
h5netcdf: 0.11.0
h5py: 3.2.1
Nio: None
zarr: 2.8.1
cftime: 1.4.1
nc_time_axis: 1.2.0
PseudoNetCDF: installed
rasterio: 1.2.2
cfgrib: 0.9.9.0
iris: 2.4.0
bottleneck: 1.3.2
dask: 2021.05.0
distributed: 2021.05.0
matplotlib: 3.4.1
cartopy: 0.19.0
seaborn: 0.11.1
numbagg: installed
pint: 0.17
setuptools: 49.6.0.post20210108
pip: 21.1
conda: None
pytest: 6.2.3
IPython: 7.22.0
sphinx: 3.5.4


</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5346/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
893692903,MDU6SXNzdWU4OTM2OTI5MDM=,5326,map_blocks doesn't handle tranposed arrays,20629530,closed,0,,,7,2021-05-17T20:34:58Z,2021-05-18T14:14:37Z,2021-05-18T14:14:37Z,CONTRIBUTOR,,,,"<!-- Please include a self-contained copy-pastable example that generates the issue if possible.

Please be concise with code posted. See guidelines below on how to provide a good bug report:

- Craft Minimal Bug Reports: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports
- Minimal Complete Verifiable Examples: https://stackoverflow.com/help/mcve

Bug reports that follow these guidelines are easier to diagnose, and so are often handled much more quickly.
-->

**What happened**:

I was using `map_blocks` for a complex function which returns an array with a different dimension order than the input. Because of the complexity of the wrapped func, I need to generate a `template` first.

When calling `map_blocks` and loading the result, it passes all checks in `map_blocks` but `Variable` fails when assigning the new data.

**What you expected to happen**:
I expected no failure. Either the result  would have transposed dimensions, or it would have been transposed back to fit with `template`.

**Minimal Complete Verifiable Example**:

```python
import xarray as xr

da = xr.DataArray([[0, 1, 2], [3, 4, 5]], dims=('x', 'y'))

def func(d):
    return d.transpose()

dac = da.chunk()
dac.map_blocks(func, template=dac).load()
```

Traceback:
```python
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-1-0da1b18a36a8> in <module>
      7 
      8 dac = da.chunk()
----> 9 dac.map_blocks(func, template=dac).load()

~/.conda/envs/xclim/lib/python3.8/site-packages/xarray/core/dataarray.py in load(self, **kwargs)
    871         dask.compute
    872         """"""
--> 873         ds = self._to_temp_dataset().load(**kwargs)
    874         new = self._from_temp_dataset(ds)
    875         self._variable = new._variable

~/.conda/envs/xclim/lib/python3.8/site-packages/xarray/core/dataset.py in load(self, **kwargs)
    799 
    800             for k, data in zip(lazy_data, evaluated_data):
--> 801                 self.variables[k].data = data
    802 
    803         # load everything else sequentially

~/.conda/envs/xclim/lib/python3.8/site-packages/xarray/core/variable.py in data(self, data)
    378         data = as_compatible_data(data)
    379         if data.shape != self.shape:
--> 380             raise ValueError(
    381                 f""replacement data must match the Variable's shape. ""
    382                 f""replacement data has shape {data.shape}; Variable has shape {self.shape}""

ValueError: replacement data must match the Variable's shape. replacement data has shape (3, 2); Variable has shape (2, 3)
```
If `func` is made to return `d` (no transpose), the code works.

I actually not sure which behaviour would be the best : a result with transposed dimensions to fit with the wrapped func or to tranpose the result to fit with the template. The latter seems much easier to implement by editing `core/parallel.py` and add the transposition at the end of `_wrapper()` in `map_blocks()`.

**Environment**:

<details><summary>Output of <tt>xr.show_versions()</tt></summary>

<!-- Paste the output here xr.show_versions() here -->

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27) 
[GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.11.16-arch1-1
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.17.1.dev99+gc58e2aeb.d20210430
pandas: 1.2.4
numpy: 1.20.2
scipy: 1.6.3
netCDF4: 1.5.6
pydap: installed
h5netcdf: 0.11.0
h5py: 3.2.1
Nio: None
zarr: 2.8.1
cftime: 1.4.1
nc_time_axis: 1.2.0
PseudoNetCDF: installed
rasterio: 1.2.2
cfgrib: 0.9.9.0
iris: 2.4.0
bottleneck: 1.3.2
dask: 2021.04.0
distributed: 2021.04.1
matplotlib: 3.4.1
cartopy: 0.19.0
seaborn: 0.11.1
numbagg: installed
pint: 0.17
setuptools: 49.6.0.post20210108
pip: 21.1
conda: None
pytest: 6.2.3
IPython: 7.22.0
sphinx: 3.5.4

</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5326/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
824917345,MDU6SXNzdWU4MjQ5MTczNDU=,5010,DataArrays inside apply_ufunc with dask=parallelized,20629530,closed,0,,,3,2021-03-08T20:19:41Z,2021-03-08T20:37:15Z,2021-03-08T20:35:01Z,CONTRIBUTOR,,,,"<!-- Please do a quick search of existing issues to make sure that this has not been asked before. -->

**Is your feature request related to a problem? Please describe.**
Currently, when using apply_ufunc with `dask=parallelized` the wrapped function receives numpy arrays upon computation.  
Some xarray operations generate enormous amount of chunks (best example : `da.groupby('time.dayofyear')`, so any complex script using dask ends up with huge task graphs. Dask's scheduler becomes overloaded,  sometimes even hangs, sometimes uses way more RAM than its workers.

**Describe the solution you'd like**
I'd want to profit from both the tools of xarray and the power of dask parallelization. I'd like to be able to do something like this:

```python3
def func(da):
     """"""Example of an operation not (easily) possible with numpy.""""""
     return da.groupby('time').mean()

xr.apply_ufunc(
    da,
    func,
    input_core_dims=[['time']],
    pass_xr=True,
    dask='parallelized'
)
```
I'd like the wrapped func to receive DataArrays resembling the inputs (named dims, coords and all), but only with the subset of that dask chunk.  Doing this, the whole function gets parallelized : dask only sees 1 task and I can code using xarray. Depending on the implementation, it might be less efficient than `dask=allowed` for small dataset, but I think this could be beneficial for long and complex computations on large datasets.

**Describe alternatives you've considered**
The alternative is to reduce the size of the datasets (looping on other dimensions), but that defeats the purpose of dask.

Another alternative I am currently testing, is to add a layer between apply_ufunc and the `func`. That layer reconstruct a DataArray and deconstructs it before returning the result, so xarray/dask only passing by. If this works and is elegant enough, I can maybe suggest an implementation within xarray.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5010/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
625123449,MDU6SXNzdWU2MjUxMjM0NDk=,4097,CFTime offsets missing for milli- and micro-seconds,20629530,closed,0,,,0,2020-05-26T19:13:37Z,2021-02-10T21:44:26Z,2021-02-10T21:44:25Z,CONTRIBUTOR,,,,"<!-- A short summary of the issue, if appropriate -->
The smallest cftime offset defined in `xarray.coding.cftime_offsets.py` is ""second"" (S), but the precision of cftime objects goes down to the millisecond  (L) and microsecond (U). They should be easily added.

PR #4033 adds a `xr.infer_freq` that supports the two, but they are currently untested as `xr.cftime_range` cannot generate an index.

#### MCVE Code Sample
<!-- In order for the maintainers to efficiently understand and prioritize issues, we ask you post a ""Minimal, Complete and Verifiable Example"" (MCVE): http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports -->

```python
xr.cftime_range(""2000-01-01"", periods=3, freq='10L')
```

#### Expected Output
```
CFTimeIndex([2000-01-01 00:00:00, 2000-01-01 00:00:00.010000,
             2000-01-01 00:00:00.020000],
            dtype='object')
```

#### Problem Description
<!-- this should explain why the current behavior is a problem and why the expected output is a better solution -->
An error gets raised : `ValueError: Invalid frequency string provided `.

#### Versions

<details><summary>Output of <tt>xr.show_versions()</tt></summary>

<!-- Paste the output here xr.show_versions() here -->
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 08:20:52) 
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.6.13-arch1-1
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: fr_CA.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4

xarray: 0.15.2.dev9+g6378a711.d20200505
pandas: 1.0.3
numpy: 1.18.4
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.1.1.2
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.16.0
distributed: 2.16.0
matplotlib: 3.2.1
cartopy: None
seaborn: None
numbagg: None
pint: 0.11
setuptools: 46.1.3.post20200325
pip: 20.0.2
conda: None
pytest: 5.4.1
IPython: 7.13.0
sphinx: 3.0.2

</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4097/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
617704483,MDU6SXNzdWU2MTc3MDQ0ODM=,4058,Interpolating with a common chunked dim fails,20629530,closed,0,,,2,2020-05-13T19:38:48Z,2020-09-24T16:50:11Z,2020-09-24T16:50:10Z,CONTRIBUTOR,,,,"<!-- A short summary of the issue, if appropriate -->
Interpolating a dataarray with another one fails if one of them is a dask array and they share a chunked dimension. Even if the interpolation is independent of that dimension.

#### MCVE Code Sample
<!-- In order for the maintainers to efficiently understand and prioritize issues, we ask you post a ""Minimal, Complete and Verifiable Example"" (MCVE): http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports -->

```python
import xarray as xr
import numpy as np

g = xr.DataArray(np.zeros((10, 10)), dims=('x', 'c'), coords={k: np.arange(10) for k in ['x', 'c']})
b = xr.DataArray([5, 6.6, 8.8], dims=('new',)).expand_dims(c=g.c)
gc = g.chunk({'c': 1})
gc.interp(x=b)
```

#### Expected Output
An array with coords ""new"" and ""c"", with values of `g` interpolated along `x` at positions in `b`, for each `c`. As there is no interpolation _along_ `c`, I would expect the fact that it is chunked to be irrelevant.

#### Problem Description
<!-- this should explain why the current behavior is a problem and why the expected output is a better solution -->
Raises: `NotImplementedError: Chunking along the dimension to be interpolated (1) is not yet supported.;`

I didn't see any issue about this, so I thought it ought to be noted as a needed enhancement.

#### Versions

<details><summary>Output of <tt>xr.show_versions()</tt></summary>

<!-- Paste the output here xr.show_versions() here -->
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.2 | packaged by conda-forge | (default, Apr 16 2020, 18:04:51) 
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.2.2.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.15.2.dev42+g0cd14a5
pandas: 1.0.3
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.1.1.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.3
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.14.0
distributed: 2.14.0
matplotlib: 3.2.1
cartopy: 0.17.0
seaborn: None
numbagg: None
pint: 0.11
setuptools: 46.1.3.post20200325
pip: 20.0.2
conda: None
pytest: 5.4.1
IPython: 7.13.0
sphinx: None

</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4058/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
650044968,MDExOlB1bGxSZXF1ZXN0NDQzNjEwOTI2,4193,Fix polyfit fail on deficient rank,20629530,closed,0,,,5,2020-07-02T16:00:21Z,2020-08-20T14:20:43Z,2020-08-20T08:34:45Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4193,"<!-- Feel free to remove check-list items aren't relevant to your change -->

 - [x] Closes #4190
 - [x] Tests added
 - [x] Passes `isort -rc . && black . && mypy . && flake8`
 - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
 - [ ] New functions/methods are listed in `api.rst`

Fixes #4190. In cases where the input matrix had a deficient rank (matrix rank != order) because of the number of NaN values, polyfit would fail, simply because numpy's lstsq returned an empty array for the residuals (instead of a size 1 array). This fixes the problem by catching the case and returning `np.nan` instead.

The other point in the issue was that `RankWarning` is also not raised in that case. That was due to the fact that `da.polyfit` was computing the rank from the coordinate (Vandermonde) matrix, instead of the masked data. Thus, is a given line has too many NaN values, its deficient rank was not detected. I added a test and warning at all places where a rank is computed (5 different lines). Also, to match np.polyfit behaviour of no warning when `full=True`, I changed the warning filters using a context manager, ignoring the `RankWarning` in that case. Overall, it feels a bi ugly because of the duplicated code and it will print the warning for every line of an array that has a deficient rank, which can be a lot... ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4193/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
635542241,MDExOlB1bGxSZXF1ZXN0NDMxODg5NjQ0,4135,Correct dask handling for 1D idxmax/min on ND data,20629530,closed,0,,,1,2020-06-09T15:36:09Z,2020-06-25T16:09:59Z,2020-06-25T03:59:52Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4135,"<!-- Feel free to remove check-list items aren't relevant to your change -->

 - [x] Closes #4123
 - [x] Tests added
 - [x] Passes `isort -rc . && black . && mypy . && flake8`
 - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

Based on comments on dask/dask#3096, I fixed the dask indexing error that occurred when `idxmax/idxmin` were called on ND data (where N > 2). Added tests are very simplistic, I believe the 1D and 2D tests already cover most cases, I just wanted to test that is was indeed working on ND data, assuming that non-dask data was already treated properly.

I believe this doesn't conflict with #3936.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4135/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
631681216,MDU6SXNzdWU2MzE2ODEyMTY=,4123,idxmax/idxmin not working with dask arrays of more than 2 dims.,20629530,closed,0,,,0,2020-06-05T15:19:41Z,2020-06-25T03:59:52Z,2020-06-25T03:59:51Z,CONTRIBUTOR,,,,"<!-- A short summary of the issue, if appropriate -->

In opposition to `argmin/argmax`, `idxmax/idxmin` fails on DataArrays of more than 2 dimensions, when the data is stored in dask arrays.

#### MCVE Code Sample
<!-- In order for the maintainers to efficiently understand and prioritize issues, we ask you post a ""Minimal, Complete and Verifiable Example"" (MCVE): http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports -->

```python
# Your code here
import xarray as xr
ds = xr.tutorial.open_dataset('air_temperature').resample(time='D').mean()
dsc = ds.chunk({'time':-1, 'lat': 5, 'lon': 5})
dsc.air.argmax('time').values  # Works (I added .values to be sure all computation is done)
dsc.air.idxmin('time') # Fails
```

#### Expected Output
Something like:
```
<xarray.DataArray 'time' (lat: 25, lon: 53)>
dask.array<where, shape=(25, 53), dtype=datetime64[ns], chunksize=(5, 5), chunktype=numpy.ndarray>
Coordinates:
  * lon      (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
  * lat      (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0
```

#### Problem Description
<!-- this should explain why the current behavior is a problem and why the expected output is a better solution -->
Throws an error:
```
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-11-0b9bf50bc3ab> in <module>
      3 dsc = ds.chunk({'time':-1, 'lat': 5, 'lon': 5})
      4 dsc.air.argmax('time').values
----> 5 dsc.air.idxmin('time')

~/Python/myxarray/xarray/core/dataarray.py in idxmin(self, dim, skipna, fill_value, keep_attrs)
   3626           * y        (y) int64 -1 0 1
   3627         """"""
-> 3628         return computation._calc_idxminmax(
   3629             array=self,
   3630             func=lambda x, *args, **kwargs: x.argmin(*args, **kwargs),

~/Python/myxarray/xarray/core/computation.py in _calc_idxminmax(array, func, dim, skipna, fill_value, keep_attrs)
   1564         chunks = dict(zip(array.dims, array.chunks))
   1565         dask_coord = dask.array.from_array(array[dim].data, chunks=chunks[dim])
-> 1566         res = indx.copy(data=dask_coord[(indx.data,)])
   1567         # we need to attach back the dim name
   1568         res.name = dim

~/.conda/envs/xarray-xclim-dev/lib/python3.8/site-packages/dask/array/core.py in __getitem__(self, index)
   1539 
   1540         if any(isinstance(i, Array) and i.dtype.kind in ""iu"" for i in index2):
-> 1541             self, index2 = slice_with_int_dask_array(self, index2)
   1542         if any(isinstance(i, Array) and i.dtype == bool for i in index2):
   1543             self, index2 = slice_with_bool_dask_array(self, index2)

~/.conda/envs/xarray-xclim-dev/lib/python3.8/site-packages/dask/array/slicing.py in slice_with_int_dask_array(x, index)
    934                 out_index.append(slice(None))
    935             else:
--> 936                 raise NotImplementedError(
    937                     ""Slicing with dask.array of ints only permitted when ""
    938                     ""the indexer has zero or one dimensions""

NotImplementedError: Slicing with dask.array of ints only permitted when the indexer has zero or one dimensions
```

I saw  #3922 and thought this PR was aiming to make this work, so I'm a bit confused.

(I tested with dask 2.17.2 also and it still fails)

#### Versions

<details><summary>Output of <tt>xr.show_versions()</tt></summary>

<!-- Paste the output here xr.show_versions() here -->
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 08:20:52) 
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.6.15-arch1-1
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: fr_CA.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4

xarray: 0.15.2.dev9+g6378a711.d20200505
pandas: 1.0.3
numpy: 1.18.4
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.1.1.2
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.16.0
distributed: 2.17.0
matplotlib: 3.2.1
cartopy: None
seaborn: None
numbagg: None
pint: 0.12
setuptools: 46.1.3.post20200325
pip: 20.0.2
conda: None
pytest: 5.4.1
IPython: 7.13.0
sphinx: 3.0.2

</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4123/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
607616849,MDU6SXNzdWU2MDc2MTY4NDk=,4009,Incoherencies between docs in open_mfdataset and combine_by_coords and its behaviour.,20629530,closed,0,,,2,2020-04-27T14:55:33Z,2020-06-24T18:22:19Z,2020-06-24T18:22:19Z,CONTRIBUTOR,,,,"<!-- A short summary of the issue, if appropriate -->

PR #3877 adds nice control over the attrs of the ouput, but there are some incoherencies in the docs and the behaviour that break previously fine code.

#### MCVE Code Sample
<!-- In order for the maintainers to efficiently understand and prioritize issues, we ask you post a ""Minimal, Complete and Verifiable Example"" (MCVE): http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports -->

```python
import xarray as xr
out = xr.open_mfdataset('/files/with/*_conflicting_attrs.nc', combine='by_coords')
```
#### Expected Output
`out` having the attributes from the first file in the sorted glob list.

#### Problem Description
<!-- this should explain why the current behavior is a problem and why the expected output is a better solution -->

Fails with a `MergeError` .

In the doc of  `open_mfdataset` it is said:
```
    attrs_file : str or pathlib.Path, optional
        Path of the file used to read global attributes from.
        By default global attributes are read from the first file provided,
        with wildcard matches sorted by filename.
```
But in the code, `open_mfdataset` calls `combine_by_coords` without specifying its `combine_attrs` argument, which defaults to 'no_conflicts', instead of the expected 'override' or 'drop'. The attributes are anyway managed by `open_mfdataset` further down, but in the case of conflicts the code never reaches that point.

Also, in the doc of `combine_by_coords` the wrong default is specified:
```
    combine_attrs : {'drop', 'identical', 'no_conflicts', 'override'},
                    default 'drop'
        String indicating how to combine attrs of the objects being merged:

        - 'drop': empty attrs on returned Dataset.
        - 'identical': all attrs must be the same on every object.
        - 'no_conflicts': attrs from all objects are combined, any that have
          the same name must also have the same value.
        - 'override': skip comparing and copy attrs from the first dataset to
          the result.
```

I think we expect either `combine_by_coords` to have 'drop' as the default or `open_mfdataset` to pass `combine_attrs='drop'`.

#### Versions

<details><summary>Output of <tt>xr.show_versions()</tt></summary>

<!-- Paste the output here xr.show_versions() here -->
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.2 | packaged by conda-forge | (default, Apr 24 2020, 08:20:52) 
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.6.7-arch1-1
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: fr_CA.UTF-8
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.15.2.dev29+g7eeba59f
pandas: 1.0.3
numpy: 1.18.1
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.1.1.2
nc_time_axis: 1.2.0
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.14.0
distributed: 2.14.0
matplotlib: 3.2.1
cartopy: None
seaborn: None
numbagg: None
setuptools: 46.1.3.post20200325
pip: 20.0.2
conda: None
pytest: 5.4.1
IPython: 7.13.0
sphinx: 3.0.2

</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4009/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
625942676,MDExOlB1bGxSZXF1ZXN0NDI0MDQ4Mzg3,4099,Allow non-unique and non-monotonic coordinates in get_clean_interp_index and polyfit,20629530,closed,0,,,0,2020-05-27T18:48:58Z,2020-06-05T15:46:00Z,2020-06-05T15:46:00Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4099,"<!-- Feel free to remove check-list items aren't relevant to your change -->

 - [ ] Closes #xxxx
 - [x] Tests added
 - [x] Passes `isort -rc . && black . && mypy . && flake8`
 - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

 Pull #3733 added `da.polyfit` and `xr.polyval` and is using `xr.core.missing.get_clean_interp_index` in order to get the fitting coordinate. However, this method is stricter than what polyfit needs: as in `numpy.polyfit`, non-unique and non-monotonic indexes are acceptable. This PR adds a `strict` keyword argument to `get_clean_interp_index` so we can skip the uniqueness and monotony tests.

`ds.polyfit` and  `xr.polyval` were modified to use that keyword. I only added tests for  `get_clean_interp_index`, could add more for `polyfit` if requested.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4099/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
612846594,MDExOlB1bGxSZXF1ZXN0NDEzNzEzODg2,4033,xr.infer_freq,20629530,closed,0,,,3,2020-05-05T19:39:05Z,2020-05-30T18:11:36Z,2020-05-30T18:08:27Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4033,"<!-- Feel free to remove check-list items aren't relevant to your change -->
 - [x] Tests added
 - [x] Passes `isort -rc . && black . && mypy . && flake8`
 - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

This PR adds a `xr.infer_freq`  method to copy pandas `infer_freq` but on `CFTimeIndex` objects. I tried to subclass pandas `_FrequencyInferer` and to only override as little as possible.

Two things are problematic right now and I would like to get feedback on how to implement them if this PR gets the dev's approval.

1) `pd.DatetimeIndex.asi8` returns integers representing _nanoseconds_ since 1970-1-1, while `xr.CFTimeIndex.asi8` returns _microseconds_. In order not to break the API, I patched the  `_CFTimeFrequencyInferer` to store 1000x the values. Not sure if this is the best, but it works.

2) As of now, `xr.infer_freq` will fail on weekly indexes. This is because pandas is using `datetime.weekday()` at some point but cftime objects do not implement that (they use `dayofwk` instead).  I'm not sure what to do?   Cftime could implement it to completly mirror python's datetime or pandas could use `dayofwk` since it's available on the `TimeStamp` objects.

Another option, cleaner but longer, would be to reimplement `_FrequencyInferer` from scratch. I may have time for this, cause I really think  a `xr.infer_freq` method would be useful.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4033/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
557627188,MDExOlB1bGxSZXF1ZXN0MzY5MTg0Mjk0,3733,Implementation of polyfit and polyval,20629530,closed,0,,,9,2020-01-30T16:58:51Z,2020-03-26T00:22:17Z,2020-03-25T17:17:45Z,CONTRIBUTOR,,0,pydata/xarray/pulls/3733," - [x] Closes #3349
 - [x] Tests added
 - [x] Passes `isort -rc . && black . && mypy . && flake8`
 - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

Following discussions in #3349, I suggest here an implementation of `polyfit` and `polyval` for xarray. However, this is still work in progress, a lot of testing is missing, all docstrings are missing. But, mainly, I have questions on how to properly conduct this.

My implementation mostly duplicates the code of `np.polyfit`, but making use of `dask.array.linalg.lstsq` and `dask.array.apply_along_axis` for dask arrays. The same method as in `xscale.signal.fitting.polyfit`, but I add NaN-awareness in a 1-D manner. 
The version with numpy is also slightly different of `np.polyfit` because of the NaN skipping, but I wanted the function to replicate its behaviour. It returns a variable number of DataArrays, depending on the keyword arguments (coefficients, [ residuals, matrix rank, singular values ] / [covariance matrix]).
Thus giving a medium-length function that has a lot of duplicated code from `numpy.polyfit`.  I thought of simply using a `xr.apply_ufunc`, but that makes chunking along the fitted dimension forbidden and difficult to return the ancillary results (residuals, rank, covariance matrix...).

Questions: 
1 ) Are the functions where they should go?
2 ) Should xarray's implementation really replicate the behaviour of numpy's? A lot of extra code could be removed if we'd say we only want to compute and return the residuals and the coefficients. All the other variables are a few lines of code away for the user that really wants them, and they don't need the power of xarray and dask anyway.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3733/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull