id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
2220689594,PR_kwDOAMm_X85rcmw1,8904,Handle extra indexes for zarr region writes,39069044,open,0,,,8,2024-04-02T14:34:00Z,2024-04-03T19:20:37Z,,CONTRIBUTOR,,0,pydata/xarray/pulls/8904,"
- [x] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

Small follow-up to #8877. If we're going to drop the indices anyway for region writes, we may as well not raise if they are still in the dataset. This makes the user experience of region writes simpler:

```python
ds = xr.tutorial.open_dataset(""air_temperature"")
ds.to_zarr(""test.zarr"")

region = {""time"": slice(0, 10)}

# This fails unless we remember to ds.drop_vars([""lat"", ""lon""])
ds.isel(**region).to_zarr(""test.zarr"", region=region)
```

I find this annoying because I often have a dataset with a bunch of unrelated indexes and have to remember which ones to drop, or use some verbose `set` logic. I thought #8877 might have already done this, but not quite. By just reordering the point at which we drop indices, we can now skip this. We still raise if data vars are passed that don't overlap with the region.

cc @dcherian
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8904/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
2171912634,PR_kwDOAMm_X85o3Ify,8809,Pass variable name to `encode_zarr_variable`,39069044,closed,0,,,6,2024-03-06T16:21:53Z,2024-04-03T14:26:49Z,2024-04-03T14:26:48Z,CONTRIBUTOR,,0,pydata/xarray/pulls/8809,"
- [x] Closes https://github.com/xarray-contrib/xeofs/issues/148
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

The change from https://github.com/pydata/xarray/pull/8672 mostly fixed the issue of serializing a reset multiindex in the backends, but an additional niche issue turned up in xeofs that was still causing serialization to fail on the zarr backend. The issue is that zarr is the only backend that uses a custom version of `encode_cf_variable` called `encode_zarr_variable`, and the way this gets called means we don't pass through the `name` of the variable before running `ensure_not_multiindex`. As a minimal fix, this PR just passes `name` through as an additional arg to the general `encode_variable` function.

See @benbovy's [comment](https://github.com/pydata/xarray/pull/8672#issuecomment-1929837384) that maybe we should actually unwrap the level coordinate in `reset_index` and clean up the checks in `ensure_not_multiindex`, but I wasn't able to get that working easily.
The exact workflow this turned up in involves DataTree and looks like this:

```python
import numpy as np
import xarray as xr
from datatree import DataTree

# ND DataArray that gets stacked along a multiindex
da = xr.DataArray(np.ones((3, 3)), coords={""dim1"": [1, 2, 3], ""dim2"": [4, 5, 6]})
da = da.stack(feature=[""dim1"", ""dim2""])

# Extract just the stacked coordinates for saving in a dataset
ds = xr.Dataset(data_vars={""feature"": da.feature})

# Reset the multiindex, which should make things serializable
ds = ds.reset_index(""feature"")

dt1 = DataTree()
dt2 = DataTree(name=""feature"", data=ds)
dt1[""foo""] = dt2

# Somehow in this step, dt1.foo.feature.dim1.variable becomes an IndexVariable again
print(type(dt1.foo.feature.dim1.variable))

# Works
dt1.to_netcdf(""test.nc"", mode=""w"")

# Fails
dt1.to_zarr(""test.zarr"", mode=""w"")
```

But we can reproduce it in xarray with the test added here.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8809/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1985969769,PR_kwDOAMm_X85fDaBX,8434,Automatic region detection and transpose for `to_zarr()`,39069044,closed,0,,,15,2023-11-09T16:15:08Z,2023-11-14T18:34:50Z,2023-11-14T18:34:50Z,CONTRIBUTOR,,0,pydata/xarray/pulls/8434,"
- [x] Closes #7702, #8421
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

A quick pass at implementing these two improvements for zarr region writes:

1. Allow passing `region={dim: ""auto""}`, which opens the existing zarr store and identifies the correct slice to write to, using a variation of the approach suggested by @DahnJ [here](https://github.com/pydata/xarray/issues/7702#issuecomment-1669747481). We also check for non-matching coordinates and non-contiguous indices.
2. Automatically transpose dimensions if they otherwise match the existing store but are out of order.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8434/reactions"", ""total_count"": 3, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 3, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1483235066,PR_kwDOAMm_X85Eti0b,7364,Handle numpy-only attrs in `xr.where`,39069044,closed,0,,,1,2022-12-08T00:52:43Z,2022-12-10T21:52:49Z,2022-12-10T21:52:37Z,CONTRIBUTOR,,0,pydata/xarray/pulls/7364,"
- [x] Closes #7362
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7364/reactions"", ""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1424732975,PR_kwDOAMm_X85Bnoaj,7229,"Fix coordinate attr handling in `xr.where(..., keep_attrs=True)`",39069044,closed,0,,,5,2022-10-26T21:45:01Z,2022-11-30T23:35:29Z,2022-11-30T23:35:29Z,CONTRIBUTOR,,0,pydata/xarray/pulls/7229,"
- [x] Closes #7220
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

Reverts the `getattr` method used in `xr.where(..., keep_attrs=True)` from #6461, but keeps handling for scalar inputs. Adds some test cases to ensure consistent attribute handling.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7229/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 1198058137,PR_kwDOAMm_X8416DPB,6461,"Fix `xr.where(..., keep_attrs=True)` bug",39069044,closed,0,,,4,2022-04-09T03:02:40Z,2022-10-25T22:40:15Z,2022-04-12T02:12:39Z,CONTRIBUTOR,,0,pydata/xarray/pulls/6461," - [x] Closes #6444 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` Fixes a bug introduced by #4687 where passing a non-xarray object to `x` in `xr.where(cond, x, y, keep_attrs=True)` caused a failure. The `keep_attrs` callable passed to `merge_attrs()` tries to access the attributes of `x` which do not exist in this case. This fix just checks to make sure `x` has attributes, and if not will pass through `keep_attrs=True`.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6461/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 1359368857,PR_kwDOAMm_X84-PSvu,6978,fix passing of curvefit kwargs,39069044,open,0,,,5,2022-09-01T20:26:01Z,2022-10-11T18:50:45Z,,CONTRIBUTOR,,0,pydata/xarray/pulls/6978," - [x] Closes #6891 - [x] Tests added - [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6978/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 1043746973,PR_kwDOAMm_X84uC1vs,5933,Reimplement `.polyfit()` with `apply_ufunc`,39069044,open,0,,,6,2021-11-03T15:29:58Z,2022-10-06T21:42:09Z,,CONTRIBUTOR,,0,pydata/xarray/pulls/5933,"- [x] Closes #4554 - [x] Closes #5629 - [x] Closes #5644 - [ ] Tests added - [x] Passes `pre-commit run --all-files` - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` Reimplement `polyfit` using `apply_ufunc` rather than `dask.array.linalg.lstsq`. This should solve a number of issues with memory usage and chunking that were reported on the current version of `polyfit`. The main downside is that variables chunked along the fitting dimension cannot be handled with this approach. There is a bunch of fiddly code here for handling the differing outputs from `np.polyfit` depending on the values of the `full` and `cov` args. Depending on the performance implications, we could simplify some by keeping these in `apply_ufunc` and dropping later. Much of this parsing would still be required though, because the only way to get the covariances is to set `cov=True, full=False`. A few minor departures from the previous implementation: 1. The `rank` and `singular_values` diagnostic variables returned by `np.polyfit` are now returned on a pointwise basis, since these can change depending on skipped nans. `np.polyfit` also returns the `rcond` used for each fit which I've included here. 2. As mentioned above, this breaks fitting done along a chunked dimension. To avoid regression, we could set `allow_rechunk=True` and warn about memory implications. 3. Changed default `skipna=True`, since the previous behavior seemed to be a limitation of the computational method. 4. For consistency with the previous version, I included a `transpose` operation to put `degree` as the first dimension. 
This is arbitrary though, and actually the opposite of the ordering `curvefit` returns, so we could match `curvefit`, but that would be a breaking change for `polyfit`.

No new tests have been added, since the previous suite was fairly comprehensive. It would be great to get some performance reports on real-world data, such as the climate model detrending application in #5629.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5933/reactions"", ""total_count"": 2, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1381297782,PR_kwDOAMm_X84_XseG,7063,Better dtype preservation for rolling mean on dask array,39069044,closed,0,,,1,2022-09-21T17:59:07Z,2022-09-22T22:06:08Z,2022-09-22T22:06:08Z,CONTRIBUTOR,,0,pydata/xarray/pulls/7063,"
- [x] Closes #7062
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

This just tests to make sure we at least get the same dtype whether we have a numpy or dask array.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7063/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1380016376,PR_kwDOAMm_X84_TlHf,7060,More informative error for non-existent zarr store,39069044,closed,0,,,2,2022-09-20T21:27:35Z,2022-09-20T22:38:45Z,2022-09-20T22:38:45Z,CONTRIBUTOR,,0,pydata/xarray/pulls/7060,"
- [x] Closes #6484
- [x] Tests added

I've often been tripped up by the stack trace noted in #6484. This PR changes two things:

1. Handles the zarr `GroupNotFoundError` with a more informative `FileNotFoundError`, displaying the path where we didn't find a zarr store.
2. Moves the consolidated metadata warning to after the step of successfully opening the zarr store with non-consolidated metadata. This way the warning isn't shown if we are actually trying to open a non-existent zarr store, in which case we only get the error above and no warning.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7060/reactions"", ""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
797302408,MDExOlB1bGxSZXF1ZXN0NTY0MzM0ODQ1,4849,Basic curvefit implementation,39069044,closed,0,,,12,2021-01-30T01:28:16Z,2021-03-31T16:55:53Z,2021-03-31T16:55:53Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4849,"
- [x] Closes #4300
- [x] Tests added
- [x] Passes `pre-commit run --all-files`
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [x] New functions/methods are listed in `api.rst`

This is a simple implementation of a more general curve-fitting API, as discussed in #4300, using the existing scipy `curve_fit` functionality wrapped with `apply_ufunc`. It works for arbitrary user-supplied 1D functions that ingest numpy arrays. Formatting and nomenclature of the outputs were largely copied from `.polyfit`, but could probably be improved.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4849/reactions"", ""total_count"": 5, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull