**PR #8904: Handle extra indexes for zarr region writes**
*pydata/xarray · open · CONTRIBUTOR · created 2024-04-02 · updated 2024-04-03 · 8 comments · 1 reaction (+1)*

- [x] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

Small follow-up to #8877. If we're going to drop the indexes anyway for region writes, we may as well not raise if they are still in the dataset. This makes the user experience of region writes simpler:

```python
ds = xr.tutorial.open_dataset("air_temperature")
ds.to_zarr("test.zarr")

region = {"time": slice(0, 10)}

# This fails unless we remember to ds.drop_vars(["lat", "lon"]) first
ds.isel(**region).to_zarr("test.zarr", region=region)
```

I find this annoying because I often have a dataset with a bunch of unrelated indexes and have to remember which ones to drop, or use some verbose `set` logic. I thought #8877 might have already done this, but not quite. By simply reordering the point at which we drop the indexes, we can now skip this step. We still raise if data vars are passed that don't overlap with the region.

cc @dcherian
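For context, here is a sketch of the kind of verbose `set` logic the description alludes to, i.e. what callers had to do before this change. The selection rule (drop indexed coordinates whose dimensions don't intersect the region) is an illustrative assumption, not code from the PR:

```python
import xarray as xr

ds = xr.tutorial.open_dataset("air_temperature")
ds.to_zarr("test.zarr", mode="w")  # region writes target an existing store

region = {"time": slice(0, 10)}
subset = ds.isel(**region)

# Coordinates whose dimensions fall entirely outside the region
# ("lat" and "lon" here) previously had to be dropped by hand.
to_drop = [
    name
    for name, coord in subset.coords.items()
    if coord.dims and not set(coord.dims) & set(region)
]
subset.drop_vars(to_drop).to_zarr("test.zarr", region=region)
```

With this PR, the `drop_vars` step becomes unnecessary; the extra indexes are dropped internally rather than triggering an error.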
**PR #6978: fix passing of curvefit kwargs**
*pydata/xarray · open · CONTRIBUTOR · created 2022-09-01 · updated 2022-10-11 · 5 comments*

- [x] Closes #6891
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

**PR #5933: Reimplement `.polyfit()` with `apply_ufunc`**
*pydata/xarray · open · CONTRIBUTOR · created 2021-11-03 · updated 2022-10-06 · 6 comments · 2 reactions (+1, heart)*

- [x] Closes #4554
- [x] Closes #5629
- [x] Closes #5644
- [ ] Tests added
- [x] Passes `pre-commit run --all-files`
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

Reimplement `polyfit` using `apply_ufunc` rather than `dask.array.linalg.lstsq`. This should solve a number of issues with memory usage and chunking that were reported against the current version of `polyfit`. The main downside is that variables chunked along the fitting dimension cannot be handled with this approach.

There is a bunch of fiddly code here for handling the differing outputs of `np.polyfit` depending on the values of the `full` and `cov` args. Depending on the performance implications, we could simplify some of it by keeping these outputs inside `apply_ufunc` and dropping them later. Much of this parsing would still be required, though, because the only way to get the covariances is to set `cov=True, full=False`.

A few minor departures from the previous implementation:

1. The `rank` and `singular_values` diagnostic variables returned by `np.polyfit` are now returned on a pointwise basis, since these can change depending on skipped NaNs. `np.polyfit` also returns the `rcond` used for each fit, which I've included here.
2. As mentioned above, this breaks fitting done along a chunked dimension. To avoid a regression, we could set `allow_rechunk=True` and warn about the memory implications.
3. Changed the default to `skipna=True`, since the previous behavior seemed to be a limitation of the computational method.
4. For consistency with the previous version, I included a `transpose` operation to put `degree` as the first dimension. This is arbitrary, though, and actually the opposite of the ordering `curvefit` returns; we could match `curvefit`, but that would be a breaking change for `polyfit`.

No new tests have been added, since the previous suite was fairly comprehensive. It would be great to get some performance reports on real-world data, such as the climate-model detrending application in #5629.
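To make the approach concrete, here is a minimal sketch of the technique the description names, not the PR's actual implementation: wrapping a NaN-skipping `np.polyfit` in `apply_ufunc` so the fit is vectorized over every non-core dimension. The helper names `_polyfit_1d` and `polyfit_via_apply_ufunc` are hypothetical, and the sketch returns only the coefficients, none of the `full`/`cov` diagnostics discussed above:

```python
import numpy as np
import xarray as xr


def _polyfit_1d(y, x, deg):
    # Hypothetical helper: fit one 1-D series, skipping NaNs pointwise
    # (the skipna=True default proposed above). np.polyfit returns
    # coefficients ordered from the highest degree down.
    mask = np.isfinite(y)
    if mask.sum() <= deg:
        return np.full(deg + 1, np.nan)
    return np.polyfit(x[mask], y[mask], deg)


def polyfit_via_apply_ufunc(da, dim, deg):
    # Assumes a numeric coordinate along `dim`.
    coeffs = xr.apply_ufunc(
        _polyfit_1d,
        da,
        da[dim],
        input_core_dims=[[dim], [dim]],
        output_core_dims=[["degree"]],
        kwargs={"deg": deg},
        vectorize=True,  # loop the 1-D fit over all other dimensions
        dask="parallelized",  # works only when `dim` itself is not chunked
        dask_gufunc_kwargs={"output_sizes": {"degree": deg + 1}},
        output_dtypes=[float],
    )
    # Match the previous implementation: `degree` as the first dimension,
    # labeled from highest to lowest power.
    coeffs = coeffs.assign_coords(degree=np.arange(deg, -1, -1))
    return coeffs.transpose("degree", ...)
```

Note how the chunking limitation falls out of the structure: `input_core_dims` moves `dim` into the core dimensions, which `dask="parallelized"` requires to be unchunked unless `allow_rechunk=True` is passed via `dask_gufunc_kwargs`, exactly the trade-off point 2 above describes.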