id,node_id,number,state,locked,title,user,body,created_at,updated_at,closed_at,merged_at,merge_commit_sha,assignee,milestone,draft,head,base,author_association,auto_merge,repo,url,merged_by
369184294,MDExOlB1bGxSZXF1ZXN0MzY5MTg0Mjk0,3733,closed,0,Implementation of polyfit and polyval,20629530," - [x] Closes #3349 - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API Following discussions in #3349, I suggest here an implementation of `polyfit` and `polyval` for xarray. However, this is still work in progress: a lot of testing is missing and all docstrings are missing. But, mainly, I have questions on how to properly conduct this. My implementation mostly duplicates the code of `np.polyfit`, but makes use of `dask.array.linalg.lstsq` and `dask.array.apply_along_axis` for dask arrays. It uses the same method as `xscale.signal.fitting.polyfit`, but I add NaN-awareness in a 1-D manner. The numpy version is also slightly different from `np.polyfit` because of the NaN skipping, but I wanted the function to replicate its behaviour. It returns a variable number of DataArrays, depending on the keyword arguments (coefficients, [ residuals, matrix rank, singular values ] / [covariance matrix]). This gives a medium-length function with a lot of code duplicated from `numpy.polyfit`. I thought of simply using `xr.apply_ufunc`, but that forbids chunking along the fitted dimension and makes it difficult to return the ancillary results (residuals, rank, covariance matrix...). Questions: 1) Are the functions where they should go? 2) Should xarray's implementation really replicate the behaviour of numpy's? A lot of extra code could be removed if we decided to only compute and return the residuals and the coefficients. All the other variables are a few lines of code away for the user who really wants them, and they don't need the power of xarray and dask anyway.",2020-01-30T16:58:51Z,2020-03-26T00:22:17Z,2020-03-25T17:17:45Z,2020-03-25T17:17:45Z,ec215daecec642db94102dc24156448f8440f52d,,,0,7eeba59ff487d5bc51809da4ae824e7283b5b2aa,009aa66620b3437cf0de675013fa7d1ff231963c,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/3733,
413713886,MDExOlB1bGxSZXF1ZXN0NDEzNzEzODg2,4033,closed,0,xr.infer_freq,20629530," - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API This PR adds an `xr.infer_freq` function that mirrors pandas' `infer_freq` but works on `CFTimeIndex` objects. I tried to subclass pandas' `_FrequencyInferer` and to override as little as possible. Two things are problematic right now and I would like feedback on how to implement them if this PR gets the devs' approval. 1) `pd.DatetimeIndex.asi8` returns integers representing _nanoseconds_ since 1970-1-1, while `xr.CFTimeIndex.asi8` returns _microseconds_. In order not to break the API, I patched the `_CFTimeFrequencyInferer` to store 1000x the values. Not sure if this is the best, but it works. 2) As of now, `xr.infer_freq` will fail on weekly indexes. This is because pandas uses `datetime.weekday()` at some point but cftime objects do not implement that (they use `dayofwk` instead). I'm not sure what to do: cftime could implement it to completely mirror Python's datetime, or pandas could use `dayofwk` since it's available on `Timestamp` objects. 
Another option, cleaner but longer, would be to reimplement `_FrequencyInferer` from scratch. I may have time for this, because I really think an `xr.infer_freq` method would be useful.",2020-05-05T19:39:05Z,2020-05-30T18:11:36Z,2020-05-30T18:08:27Z,2020-05-30T18:08:27Z,fd9e620a84389170138cc014ee5a0213718beb78,,,0,9a553edae8b2b4f52e5044d89b0f0354d51b003c,d1f7cb8fd95d588d3f7a7e90916c25747b90ad5a,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/4033,
424048387,MDExOlB1bGxSZXF1ZXN0NDI0MDQ4Mzg3,4099,closed,0,Allow non-unique and non-monotonic coordinates in get_clean_interp_index and polyfit,20629530," - [ ] Closes #xxxx - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API Pull #3733 added `da.polyfit` and `xr.polyval`, using `xr.core.missing.get_clean_interp_index` to get the fitting coordinate. However, this method is stricter than what polyfit needs: as in `numpy.polyfit`, non-unique and non-monotonic indexes are acceptable. This PR adds a `strict` keyword argument to `get_clean_interp_index` so we can skip the uniqueness and monotonicity tests. `ds.polyfit` and `xr.polyval` were modified to use that keyword. I only added tests for `get_clean_interp_index`, but could add more for `polyfit` if requested.",2020-05-27T18:48:58Z,2020-06-05T15:46:00Z,2020-06-05T15:46:00Z,2020-06-05T15:46:00Z,09df5ca4036d84620373fa4bccd11d1f1d4bec28,,,0,fedfbf5ccdf52cac82ac0c072ae8882d630a2f51,e5cc19cd8f8a69e0743f230f5bf51b7778a0ff96,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/4099,
431889644,MDExOlB1bGxSZXF1ZXN0NDMxODg5NjQ0,4135,closed,0,Correct dask handling for 1D idxmax/min on ND data,20629530," - [x] Closes #4123 - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API Based on comments on dask/dask#3096, I fixed the dask indexing error that occurred when `idxmax/idxmin` were called on ND data (where N > 2). The added tests are very simplistic: I believe the 1D and 2D tests already cover most cases, so I just wanted to check that it was indeed working on ND data, assuming that non-dask data was already treated properly. I believe this doesn't conflict with #3936.",2020-06-09T15:36:09Z,2020-06-25T16:09:59Z,2020-06-25T03:59:52Z,2020-06-25T03:59:51Z,f4638afe009fde5f53de1a1b80cc71f62593c463,,,0,76e82e90948aae14f170c595dc2ee61fdf1770cf,fb5fe79a2881055065cc2c0ed3f49f5448afdf32,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/4135,
443610926,MDExOlB1bGxSZXF1ZXN0NDQzNjEwOTI2,4193,closed,0,Fix polyfit fail on deficient rank,20629530," - [x] Closes #4190 - [x] Tests added - [x] Passes `isort -rc . && black . && mypy . && flake8` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` Fixes #4190. In cases where the input matrix had a deficient rank (matrix rank != order) because of the number of NaN values, polyfit would fail, simply because numpy's lstsq returned an empty array for the residuals (instead of a size-1 array). This fixes the problem by catching that case and returning `np.nan` instead. The other point in the issue was that `RankWarning` is also not raised in that case. That was because `da.polyfit` was computing the rank from the coordinate (Vandermonde) matrix instead of the masked data. 
Thus, if a given line had too many NaN values, its deficient rank was not detected. I added a test and a warning at all places where a rank is computed (5 different lines). Also, to match `np.polyfit`'s behaviour of not warning when `full=True`, I changed the warning filters using a context manager, ignoring the `RankWarning` in that case. Overall, it feels a bit ugly because of the duplicated code, and it will print the warning for every line of an array that has a deficient rank, which can be a lot... ",2020-07-02T16:00:21Z,2020-08-20T14:20:43Z,2020-08-20T08:34:45Z,2020-08-20T08:34:45Z,efabe74b1ce8f0666b93658ebb48104aa37b26ac,,,0,04be2e0fa1f96762798761f08aca7c37d7d8c67d,26547d19d477cc77461c09b3aadd55f7eb8b4dbf,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/4193,
625530046,MDExOlB1bGxSZXF1ZXN0NjI1NTMwMDQ2,5233,closed,0,Calendar utilities,20629530," - [x] Closes #5155 - [x] Tests added - [x] Passes `pre-commit run --all-files` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [x] New functions/methods are listed in `api.rst` So: - Added `coding.cftime_offsets.date_range` and `coding.cftime_offsets.date_range_like`. The first simply switches between `pd.date_range` and `xarray.cftime_range` according to the arguments. The second infers start, end and freq from an existing datetime array and returns a similar range in another calendar. - Added `coding/calendar_ops.py` with `convert_calendar` and `interp_calendar`. Didn't know where to put them, so there they are. - Added `DataArray.dt.calendar`. When the datetime objects are backed by numpy, it always returns `""proleptic_gregorian""`. I'm not sure where to expose these functions. Should the range-generators be accessible directly like `xr.date_range`? `convert_calendar` and `interp_calendar` could be implemented as methods of `DataArray` and `Dataset`; should I do that? ",2021-04-28T20:01:33Z,2021-12-30T22:54:49Z,2021-12-30T22:54:11Z,2021-12-30T22:54:11Z,b14e2d8400da5c036f1ebb5486939f7f587b9f27,,,0,5aa747079ce32c51645ca245b1423cbacaf0cb1b,2694046c748a51125de6d460073635f1d789958e,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/5233,
657205536,MDExOlB1bGxSZXF1ZXN0NjU3MjA1NTM2,5402,open,0,`dt.to_pytimedelta` to allow arithmetic with cftime objects,20629530," - [ ] Closes #xxxx - [x] Tests added - [x] Passes `pre-commit run --all-files` - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` When playing with cftime objects, a problem I encountered many times is that I can subtract two arrays but then can't add the result back to another. Subtracting two cftime datetime arrays results in an array of `np.timedelta64`, and when trying to add that back to another cftime array, we get a `UFuncTypeError` because the two arrays have incompatible dtypes: ' - [ ] Closes #xxxx - [x] Tests added - [x] Passes `pre-commit run --all-files` - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` Simply add an `encodings` argument to `save_mfdataset`. As with the other arguments, it expects a list of dictionaries, with encoding information to be passed to `to_netcdf` for each dataset. 
Added a minimal test, simply to check that the argument is taken into account.",2021-09-08T21:24:13Z,2022-10-06T21:44:18Z,,,d86b32087d7108dc866e34569653033973160827,,,0,23acbb84683f3dab9f593ee63a0323433b2b3638,d1e4164f3961d7bbb3eb79037e96cae14f7182f8,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/5781,
1673012286,PR_kwDOAMm_X85juCQ-,8603,closed,0,Convert 360_day calendars by choosing random dates to drop or add,20629530," - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` Small PR to add a new ""method"" to convert to and from 360_day calendars. The current two methods (chosen with the `align_on` keyword) will always remove or add the same day-of-year for all years of the same length. This new option will randomly choose the days, one for each fifth of the year (a 72-day period). It emulates the method of the LOCA datasets (see [web page](https://loca.ucsd.edu/loca-calendar/) and [article](https://journals.ametsoc.org/view/journals/hydr/15/6/jhm-d-14-0082_1.xml)). February 29th is always removed/added when the source/target is a leap year. I copied the implementation from xclim (which I wrote), [see code here](https://github.com/Ouranosinc/xclim/blob/fb29b8a8e400c7d8aaf4e1233a6b37a300126257/xclim/core/calendar.py#L112-L134).",2024-01-10T19:13:31Z,2024-04-16T14:53:42Z,2024-04-16T14:53:42Z,2024-04-16T14:53:42Z,239309f881ba0d7e02280147bc443e6e286e6a63,,,0,b581e1f700382207c3bd0fd03860f44f33b29b79,b004af5174a4b0e32519df792a4f625d5548a9f0,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/8603,
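The PR bodies above describe `polyfit`/`polyval`, `xr.infer_freq` and the calendar utilities in prose only; below is a minimal usage sketch, assuming a recent xarray release where these features are exposed as `xr.date_range`, `xr.infer_freq`, `DataArray.polyfit`, `xr.polyval`, `DataArray.convert_calendar` and `DataArray.dt.calendar` (exact signatures may differ between versions):

```python
import numpy as np
import xarray as xr

# Monthly time axis in the standard calendar (numpy datetime64 backed).
time = xr.date_range("2000-01-01", periods=120, freq="MS")
da = xr.DataArray(np.random.rand(120), dims="time", coords={"time": time})

# Frequency inference; also works on CFTimeIndex-backed coordinates (#4033).
print(xr.infer_freq(da.time))  # "MS"

# Least-squares polynomial fit along "time", then evaluate the fitted line (#3733).
fit = da.polyfit(dim="time", deg=1)
trend = xr.polyval(da.time, fit.polyfit_coefficients)

# Convert the series to a noleap (365_day) calendar; the cftime-backed result
# reports its calendar through the .dt accessor (#5233).
da_noleap = da.convert_calendar("noleap")
print(da.dt.calendar, "->", da_noleap.dt.calendar)
```

When converting to or from a 360_day calendar, `convert_calendar` additionally needs the `align_on` argument whose options are discussed in #8603.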