pull_requests: 1048884296
id | node_id | number | state | locked | title | user | body | created_at | updated_at | closed_at | merged_at | merge_commit_sha | assignee | milestone | draft | head | base | author_association | auto_merge | repo | url | merged_by |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1048884296 | PR_kwDOAMm_X84-hLRI | 7004 | open | 0 | Rework PandasMultiIndex.sel internals | 4160723 | <!-- Feel free to remove check-list items that aren't relevant to your change --> - [x] Closes #6838 - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` This PR hopefully improves how labels provided for multi-index level coordinates are handled in `.sel()`. More specifically, slices are handled in a cleaner way and array-like labels are now allowed. `PandasMultiIndex.sel()` relies on the underlying `pandas.MultiIndex` methods like this: - use ``get_loc`` when every level is given a scalar label (no slice, no array) - always drops the index and returns scalar coordinates for each multi-index level - use ``get_loc_level`` when only a subset of levels is given labels and all of those labels are scalars - may collapse one or more levels of the multi-index (dropped levels result in scalar coordinates) - if only one level remains: renames the dimension and the corresponding dimension coordinate - use ``get_locs`` for all other cases. - always keeps the multi-index and its coordinates (even if only one item or one level is selected) This yields predictable behavior: as soon as one of the provided labels is a slice or array-like, the multi-index and all its level coordinates are kept in the result. 
Some cases illustrated below (I compare this PR with an older release due to the errors reported in #6838): ```python import xarray as xr import pandas as pd midx = pd.MultiIndex.from_product([list("abc"), range(4)], names=("one", "two")) ds = xr.Dataset(coords={"x": midx}) # <xarray.Dataset> # Dimensions: (x: 12) # Coordinates: # * x (x) object MultiIndex # * one (x) object 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b' 'c' 'c' 'c' 'c' # * two (x) int64 0 1 2 3 0 1 2 3 0 1 2 3 # Data variables: # *empty* ``` ```python ds.sel(one="a", two=0) # this PR # # <xarray.Dataset> # Dimensions: () # Coordinates: # x object ('a', 0) # one <U1 'a' # two int64 0 # Data variables: # *empty* # # v2022.3.0 # # <xarray.Dataset> # Dimensions: () # Coordinates: # x object ('a', 0) # Data variables: # *empty* # ``` ```python ds.sel(one="a") # this PR: # # <xarray.Dataset> # Dimensions: (two: 4) # Coordinates: # * two (two) int64 0 1 2 3 # one <U1 'a' # Data variables: # *empty* # # v2022.3.0 # # <xarray.Dataset> # Dimensions: (two: 4) # Coordinates: # * two (two) int64 0 1 2 3 # Data variables: # *empty* # ``` ```python ds.sel(one=slice("a", "b")) # this PR # # <xarray.Dataset> # Dimensions: (x: 8) # Coordinates: # * x (x) object MultiIndex # * one (x) object 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b' # * two (x) int64 0 1 2 3 0 1 2 3 # Data variables: # *empty* # # v2022.3.0 # # <xarray.Dataset> # Dimensions: (two: 8) # Coordinates: # * two (two) int64 0 1 2 3 0 1 2 3 # Data variables: # *empty* # ``` ```python ds.sel(one="a", two=slice(1, 1)) # this PR # # <xarray.Dataset> # Dimensions: (x: 1) # Coordinates: # * x (x) object MultiIndex # * one (x) object 'a' # * two (x) int64 1 # Data variables: # *empty* # # v2022.3.0 # # <xarray.Dataset> # Dimensions: (x: 1) # Coordinates: # * x (x) MultiIndex # - one (x) object 'a' # - two (x) int64 1 # Data variables: # *empty* # ``` ```python ds.sel(one=["b", "c"], two=[0, 2]) # this PR # # <xarray.Dataset> # Dimensions: (x: 4) # Coordinates: # * x (x) object 
MultiIndex # * one (x) object 'b' 'b' 'c' 'c' # * two (x) int64 0 2 0 2 # Data variables: # *empty* # # v2022.3.0 # # ValueError: Vectorized selection is not available along coordinate 'one' (multi-index level) # ``` | 2022-09-07T14:57:29Z | 2022-09-22T20:38:41Z | 0a4b1aafbe66a857de627cf180eba8713ca9a85d | 0 | 00baaddefae0a189874ca64d9f4be4d2d83cc744 | 5bec4662a7dd4330eca6412c477ca3f238323ed2 | MEMBER | 13221727 | https://github.com/pydata/xarray/pull/7004 |
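The dispatch rules listed in the PR description can be sketched in plain Python. This is only an illustration of the decision logic, not the code in this PR: the helper names `is_scalar_label` and `choose_lookup` are hypothetical, and the scalar test is a simplification (anything that is not a slice and not a non-string iterable counts as scalar).

```python
from typing import Any, Mapping, Sequence


def is_scalar_label(label: Any) -> bool:
    """Hypothetical helper: True unless the label is a slice or array-like."""
    if isinstance(label, slice):
        return False
    if isinstance(label, (str, bytes)):
        # Strings are iterable but are treated as single labels here.
        return True
    return not hasattr(label, "__iter__")


def choose_lookup(labels: Mapping[str, Any], level_names: Sequence[str]) -> str:
    """Hypothetical helper: name the pandas.MultiIndex method the rules select."""
    unknown = set(labels) - set(level_names)
    if unknown:
        raise KeyError(f"not multi-index levels: {sorted(unknown)}")
    if not labels:
        raise ValueError("at least one level label is required")

    if all(is_scalar_label(value) for value in labels.values()):
        if set(labels) == set(level_names):
            # Every level has a scalar label: the index is dropped.
            return "get_loc"
        # Scalar labels on a subset of levels: some levels may collapse.
        return "get_loc_level"
    # Any slice or array-like label: the multi-index is always kept.
    return "get_locs"


levels = ("one", "two")
print(choose_lookup({"one": "a", "two": 0}, levels))
print(choose_lookup({"one": "a"}, levels))
print(choose_lookup({"one": "a", "two": slice(1, 1)}, levels))
print(choose_lookup({"one": ["b", "c"], "two": [0, 2]}, levels))
```

The four calls mirror the `ds.sel(...)` examples in the description: only the first two take a scalar-only path, which is why the remaining cases keep `x`, `one` and `two` as indexed coordinates in the result.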