pull_requests: 1759283186
id | node_id | number | state | locked | title | user | body | created_at | updated_at | closed_at | merged_at | merge_commit_sha | assignee | milestone | draft | head | base | author_association | auto_merge | repo | url | merged_by |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1759283186 | PR_kwDOAMm_X85o3Ify | 8809 | closed | 0 | Pass variable name to `encode_zarr_variable` | 39069044 | <!-- Feel free to remove check-list items that aren't relevant to your change -->
- [x] Closes https://github.com/xarray-contrib/xeofs/issues/148
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

The change from https://github.com/pydata/xarray/pull/8672 mostly fixed the issue of serializing a reset multiindex in the backends, but an additional niche issue turned up in xeofs that still caused serialization to fail on the zarr backend. The issue is that zarr is the only backend that uses a custom version of `encode_cf_variable`, called `encode_zarr_variable`, and the way it gets called does not pass the variable's `name` through before `ensure_not_multiindex` runs. As a minimal fix, this PR just passes `name` through as an additional argument to the general `encode_variable` function. See @benbovy's [comment](https://github.com/pydata/xarray/pull/8672#issuecomment-1929837384) suggesting that maybe we should actually unwrap the level coordinate in `reset_index` and clean up the checks in `ensure_not_multiindex`, but I wasn't able to get that working easily.
The exact workflow this turned up in involves DataTree and looks like this:

```python
import numpy as np
import xarray as xr
from datatree import DataTree

# ND DataArray that gets stacked along a multiindex
da = xr.DataArray(np.ones((3, 3)), coords={"dim1": [1, 2, 3], "dim2": [4, 5, 6]})
da = da.stack(feature=["dim1", "dim2"])

# Extract just the stacked coordinates for saving in a dataset
ds = xr.Dataset(data_vars={"feature": da.feature})

# Reset the multiindex, which should make things serializable
ds = ds.reset_index("feature")

dt1 = DataTree()
dt2 = DataTree(name="feature", data=ds)
dt1["foo"] = dt2

# Somehow in this step, dt1.foo.feature.dim1.variable becomes an IndexVariable again
print(type(dt1.foo.feature.dim1.variable))

# Works
dt1.to_netcdf("test.nc", mode="w")

# Fails
dt1.to_zarr("test.zarr", mode="w")
```

But we can reproduce in xarray with the test added here. | 2024-03-06T16:21:53Z | 2024-04-03T14:26:49Z | 2024-04-03T14:26:48Z | a54b0e6cf2911f7f1672266dffcf73494063d1a4 | 0 | 0fd34adefa43f2dc77ae39b875ef658d613b36f6 | 473b87f19e164e508566baf7c8750ac4cb5b50f7 | CONTRIBUTOR | 13221727 | https://github.com/pydata/xarray/pull/8809 |
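
To illustrate why passing `name` matters, here is a toy model of the check in plain Python. It is not xarray's code: `ToyVariable`, `toy_ensure_not_multiindex`, and `toy_encode_variable` are hypothetical stand-ins, and the fallback of guessing the name from the first dimension is an assumption about how the check behaves when no name is supplied.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ToyVariable:
    """Hypothetical stand-in for an index variable backed by a multiindex."""

    dims: Tuple[str, ...]
    backed_by_multiindex: bool = True

    @property
    def fallback_name(self) -> str:
        # Assumption: with no explicit name, the name is guessed from the first dimension.
        return self.dims[0]


def toy_ensure_not_multiindex(var: ToyVariable, name: Optional[str] = None) -> None:
    """Reject only the multiindex dimension coordinate itself, not its levels."""
    if not var.backed_by_multiindex:
        return
    if name is None:
        name = var.fallback_name
    if var.dims == (name,):
        raise NotImplementedError(f"variable {name!r} is a MultiIndex and cannot be serialized")


def toy_encode_variable(var: ToyVariable, name: Optional[str] = None) -> ToyVariable:
    """Toy encoder: run the multiindex check, then return the variable unchanged."""
    toy_ensure_not_multiindex(var, name=name)
    return var


# Level coordinate "dim1", which lives along the "feature" dimension.
level = ToyVariable(dims=("feature",))

# Without the name, the check guesses "feature" and wrongly rejects the level.
try:
    toy_encode_variable(level)
    failed_without_name = False
except NotImplementedError:
    failed_without_name = True

# With the real name passed through, the level is accepted.
encoded = toy_encode_variable(level, name="dim1")
```

In this sketch the level coordinate is rejected only when its name has to be guessed, which mirrors the symptom described above: the zarr path fails while other backends, which do pass the name, succeed.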