pull_requests: 1048884296

id node_id number state locked title user body created_at updated_at closed_at merged_at merge_commit_sha assignee milestone draft head base author_association auto_merge repo url merged_by
1048884296 PR_kwDOAMm_X84-hLRI 7004 open 0 Rework PandasMultiIndex.sel internals 4160723

<!-- Feel free to remove check-list items that aren't relevant to your change -->

- [x] Closes #6838
- [ ] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

This PR aims to improve how labels provided for multi-index level coordinates are handled in `.sel()`. More specifically, slices are handled in a cleaner way, and array-like labels are now allowed.

`PandasMultiIndex.sel()` relies on the underlying `pandas.MultiIndex` methods as follows:

- use ``get_loc`` when every level is given a scalar label (no slice, no array)
  - always drops the index and returns scalar coordinates for each multi-index level
- use ``get_loc_level`` when only a subset of levels is given, with scalar labels only
  - may collapse one or more levels of the multi-index (dropped levels result in scalar coordinates)
  - if only one level remains: renames the dimension and the corresponding dimension coordinate
- use ``get_locs`` for all other cases
  - always keeps the multi-index and its coordinates (even if only one item or one level is selected)

This yields predictable behavior: as soon as one of the provided labels is a slice or array-like, the multi-index and all its level coordinates are kept in the result.
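For reference, the three `pandas.MultiIndex` methods named above can be tried directly on a small index. This is only a minimal sketch of the underlying pandas calls, not the code added by this PR, and the variable names are illustrative.

```python
import pandas as pd

midx = pd.MultiIndex.from_product([list("abc"), range(4)], names=("one", "two"))

# Scalar label for every level: get_loc returns one integer position
# (the index here is unique and sorted).
loc = midx.get_loc(("a", 0))

# Scalar label for a subset of levels: get_loc_level returns an indexer
# together with the index that remains once the selected level is dropped.
indexer, remaining = midx.get_loc_level("a", level="one")
selected = midx[indexer]

# Slices or array-like labels: get_locs returns an array of integer
# positions, so the full multi-index can be kept for the selected items.
locs_arrays = midx.get_locs([["b", "c"], [0, 2]])
locs_slice = midx.get_locs([slice("a", "b")])

print(loc, remaining.tolist(), locs_arrays, len(locs_slice))
```

The first two calls naturally drop one or more levels, while `get_locs` only yields positions, which is consistent with the multi-index being kept whenever a slice or array-like label is involved.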
Some cases illustrated below (I compare this PR with an older release due to the errors reported in #6838):

```python
import xarray as xr
import pandas as pd

midx = pd.MultiIndex.from_product([list("abc"), range(4)], names=("one", "two"))
ds = xr.Dataset(coords={"x": midx})

# <xarray.Dataset>
# Dimensions:  (x: 12)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b' 'c' 'c' 'c' 'c'
#   * two      (x) int64 0 1 2 3 0 1 2 3 0 1 2 3
# Data variables:
#     *empty*
```

```python
ds.sel(one="a", two=0)

# this PR
#
# <xarray.Dataset>
# Dimensions:  ()
# Coordinates:
#     x        object ('a', 0)
#     one      <U1 'a'
#     two      int64 0
# Data variables:
#     *empty*
#
# v2022.3.0
#
# <xarray.Dataset>
# Dimensions:  ()
# Coordinates:
#     x        object ('a', 0)
# Data variables:
#     *empty*
#
```

```python
ds.sel(one="a")

# this PR:
#
# <xarray.Dataset>
# Dimensions:  (two: 4)
# Coordinates:
#   * two      (two) int64 0 1 2 3
#     one      <U1 'a'
# Data variables:
#     *empty*
#
# v2022.3.0
#
# <xarray.Dataset>
# Dimensions:  (two: 4)
# Coordinates:
#   * two      (two) int64 0 1 2 3
# Data variables:
#     *empty*
#
```

```python
ds.sel(one=slice("a", "b"))

# this PR
#
# <xarray.Dataset>
# Dimensions:  (x: 8)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a' 'a' 'a' 'a' 'b' 'b' 'b' 'b'
#   * two      (x) int64 0 1 2 3 0 1 2 3
# Data variables:
#     *empty*
#
# v2022.3.0
#
# <xarray.Dataset>
# Dimensions:  (two: 8)
# Coordinates:
#   * two      (two) int64 0 1 2 3 0 1 2 3
# Data variables:
#     *empty*
#
```

```python
ds.sel(one="a", two=slice(1, 1))

# this PR
#
# <xarray.Dataset>
# Dimensions:  (x: 1)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'a'
#   * two      (x) int64 1
# Data variables:
#     *empty*
#
# v2022.3.0
#
# <xarray.Dataset>
# Dimensions:  (x: 1)
# Coordinates:
#   * x        (x) MultiIndex
#   - one      (x) object 'a'
#   - two      (x) int64 1
# Data variables:
#     *empty*
#
```

```python
ds.sel(one=["b", "c"], two=[0, 2])

# this PR
#
# <xarray.Dataset>
# Dimensions:  (x: 4)
# Coordinates:
#   * x        (x) object MultiIndex
#   * one      (x) object 'b' 'b' 'c' 'c'
#   * two      (x) int64 0 2 0 2
# Data variables:
#     *empty*
#
# v2022.3.0
#
# ValueError: Vectorized selection is not available along coordinate 'one' (multi-index level)
#
```

2022-09-07T14:57:29Z 2022-09-22T20:38:41Z     0a4b1aafbe66a857de627cf180eba8713ca9a85d     0 00baaddefae0a189874ca64d9f4be4d2d83cc744 5bec4662a7dd4330eca6412c477ca3f238323ed2 MEMBER   13221727 https://github.com/pydata/xarray/pull/7004
