
pull_requests: 1759283186


id: 1759283186
node_id: PR_kwDOAMm_X85o3Ify
number: 8809
state: closed
locked: 0
title: Pass variable name to `encode_zarr_variable`
user: 39069044
created_at: 2024-03-06T16:21:53Z
updated_at: 2024-04-03T14:26:49Z
closed_at: 2024-04-03T14:26:48Z
merged_at:
merge_commit_sha: a54b0e6cf2911f7f1672266dffcf73494063d1a4
assignee:
milestone:
draft: 0
head: 0fd34adefa43f2dc77ae39b875ef658d613b36f6
base: 473b87f19e164e508566baf7c8750ac4cb5b50f7
author_association: CONTRIBUTOR
auto_merge:
repo: 13221727
url: https://github.com/pydata/xarray/pull/8809
merged_by:
body:

- [x] Closes https://github.com/xarray-contrib/xeofs/issues/148
- [x] Tests added
- [x] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

The change from https://github.com/pydata/xarray/pull/8672 mostly fixed the issue of serializing a reset multiindex in the backends, but an additional niche issue turned up in xeofs that still caused serialization to fail on the zarr backend.

The issue is that zarr is the only backend that uses a custom version of `encode_cf_variable`, called `encode_zarr_variable`, and in the way it gets called we don't pass the `name` of the variable through before running `ensure_not_multiindex`.

As a minimal fix, this PR just passes `name` through as an additional arg to the general `encode_variable` function. See @benbovy's [comment](https://github.com/pydata/xarray/pull/8672#issuecomment-1929837384) suggesting that we should perhaps unwrap the level coordinate in `reset_index` and clean up the checks in `ensure_not_multiindex` instead, but I wasn't able to get that working easily.

The exact workflow this turned up in involves DataTree and looks like this:

```python
import numpy as np
import xarray as xr
from datatree import DataTree

# ND DataArray that gets stacked along a multiindex
da = xr.DataArray(np.ones((3, 3)), coords={"dim1": [1, 2, 3], "dim2": [4, 5, 6]})
da = da.stack(feature=["dim1", "dim2"])

# Extract just the stacked coordinates for saving in a dataset
ds = xr.Dataset(data_vars={"feature": da.feature})

# Reset the multiindex, which should make things serializable
ds = ds.reset_index("feature")

dt1 = DataTree()
dt2 = DataTree(name="feature", data=ds)
dt1["foo"] = dt2

# Somehow in this step, dt1.foo.feature.dim1.variable becomes an IndexVariable again
print(type(dt1.foo.feature.dim1.variable))

# Works
dt1.to_netcdf("test.nc", mode="w")

# Fails
dt1.to_zarr("test.zarr", mode="w")
```

The failure can also be reproduced in xarray alone with the test added here.
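
To illustrate why dropping `name` matters, here is a toy model in plain Python. It is only a sketch of the mechanism described in the PR body, not xarray's code: `ToyIndexVariable`, `toy_ensure_not_multiindex`, and `toy_encode_variable` are hypothetical names, and the fallback from a missing name to the variable's own dimension name is an assumption about how the real check behaves.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ToyIndexVariable:
    """Hypothetical stand-in for a 1-D index variable still backed by a multiindex."""

    dims: Tuple[str, ...]
    backed_by_multiindex: bool = True

    @property
    def name(self) -> str:
        # Assumption: an index variable reports its single dimension as its name.
        return self.dims[0]


def toy_ensure_not_multiindex(var: ToyIndexVariable, name: Optional[str] = None) -> None:
    """Reject only the multiindex dimension coordinate itself, not its levels."""
    if not var.backed_by_multiindex:
        return
    if name is None:
        # Without a caller-supplied name, fall back to the variable's own name.
        name = var.name
    if var.dims == (name,):
        raise NotImplementedError(f"variable {name!r} is a multiindex and cannot be serialized")


def toy_encode_variable(var: ToyIndexVariable, name: Optional[str] = None, pass_name: bool = True):
    """Toy encoder; pass_name=False mimics a backend that forgets to forward the name."""
    toy_ensure_not_multiindex(var, name=name if pass_name else None)
    return var


# A reset level coordinate "dim1" that still lives on the "feature" dimension.
level = ToyIndexVariable(dims=("feature",))

# Forwarding the name: dims ("feature",) != ("dim1",), so the check passes.
toy_encode_variable(level, name="dim1", pass_name=True)

# Dropping the name: the fallback name is "feature", so the check raises.
try:
    toy_encode_variable(level, name="dim1", pass_name=False)
except NotImplementedError as err:
    print("encoding failed:", err)
```

In this model the level coordinate is wrongly rejected only when the encoder loses the variable's real name, which is the situation the PR describes for the zarr code path.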
