issue_comments: 1248518310


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/7031#issuecomment-1248518310 https://api.github.com/repos/pydata/xarray/issues/7031 1248518310 IC_kwDOAMm_X85KauCm 35968931 2022-09-15T19:24:26Z 2022-09-15T19:51:46Z MEMBER

Okay, I think this design could work for slicing across boundaries:

```python
from typing import Any

import numpy as np

from xarray.core.indexes import PandasIndex, IndexSelResult, _query_slice
from xarray.core.indexing import _expand_slice


class PeriodicBoundaryIndex(PandasIndex):
    """
    An index representing any 1D periodic numberline.

    Implementation subclasses a normal xarray PandasIndex object but intercepts indexer queries.
    """

    period: float
    _min: float
    _max: float

    __slots__ = ("index", "dim", "coord_dtype", "period", "_max", "_min")

    def __init__(self, *args, period=360, **kwargs):
        super().__init__(*args, **kwargs)
        self.period = period
        self._min = self.index.min()
        self._max = self.index.max()

    @classmethod
    def from_variables(cls, variables, options):
        # Build a plain index first, then apply the "period" option if one was given
        obj = super().from_variables(variables, options={})
        obj.period = options.get("period", obj.period)
        return obj

    def _wrap_periodically(self, label_value: float) -> float:
        """Remaps an individual point label back to another inside the range."""
        # Anchor the modulo at the lower bound so this also holds when
        # (self._max - self._min) is smaller than the period
        return self._min + (label_value - self._min) % self.period

    def _split_slice_across_boundary(self, label: slice) -> np.ndarray:
        """
        Splits a slice into two slices, one either side of the boundary,
        finds the corresponding indices, concatenates them,
        and returns them ready to be passed to .isel().
        """
        first_slice = slice(label.start, self._max, label.step)
        second_slice = slice(self._min, self._wrap_periodically(label.stop), label.step)

        # Convert each label-based slice into a positional slice on the index
        first_as_index_slice = _query_slice(self.index, first_slice)
        second_as_index_slice = _query_slice(self.index, second_slice)

        # Expand the positional slices into explicit integer positions
        first_as_indices = _expand_slice(first_as_index_slice, self.index.size)
        second_as_indices = _expand_slice(second_as_index_slice, self.index.size)

        wrapped_indices = np.concatenate([first_as_indices, second_as_indices])
        return wrapped_indices

    def sel(
        self, labels: dict[Any, Any], method=None, tolerance=None
    ) -> IndexSelResult:
        """Remaps labels outside of the index's range back to integer indices inside the range."""

        assert len(labels) == 1
        coord_name, label = next(iter(labels.items()))

        if isinstance(label, slice):
            # TODO enumerate all the possible cases
            if self._min < label.start < self._max and self._min < label.stop < self._max:
                # simple case of slice not crossing boundary
                wrapped_label = slice(
                    self._wrap_periodically(label.start),
                    self._wrap_periodically(label.stop),
                    label.step,
                )
                return super().sel({coord_name: wrapped_label})
            elif self._min < label.start < self._max and label.start < self._max < label.stop:
                # nasty case of slice crossing boundary
                wrapped_indices = self._split_slice_across_boundary(label)
                return IndexSelResult({self.dim: wrapped_indices})
            else:
                # TODO there are many other cases to handle...
                raise NotImplementedError()
        else:
            # just a scalar
            wrapped_label = self._wrap_periodically(label)
            return super().sel({coord_name: wrapped_label}, method=method, tolerance=tolerance)

    def __repr__(self) -> str:
        return f"PeriodicBoundaryIndex(period={self.period})"
```
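The remapping arithmetic in `_wrap_periodically` can be checked in isolation. Below is a minimal standalone sketch in plain Python (no xarray). The function name `wrap_periodically` and its default bounds are illustrative assumptions, not part of the class above, and the modulo is anchored at the lower bound so the result always lands in `[lower, lower + period)`.

```python
def wrap_periodically(label: float, lower: float = -180.0, period: float = 360.0) -> float:
    """Map any label onto the half-open interval [lower, lower + period).

    Hypothetical helper for illustration only; the bounds are assumed defaults.
    """
    # Python's % yields a non-negative result for a positive modulus,
    # so labels below `lower` are handled by the same expression.
    return lower + (label - lower) % period


print(wrap_periodically(210.0))   # label past the upper boundary -> -150.0
print(wrap_periodically(-190.0))  # label below the lower boundary -> 170.0
print(wrap_periodically(60.0))    # label already in range is unchanged -> 60.0
```

Note that with this convention the upper boundary itself (180 here) maps onto the lower one (-180), since both labels name the same point on the circle.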

```python
world.sel(lon=slice(60, 120), method="nearest")

<xarray.DataArray (lon: 4)>
array([-0.71424378, -0.87270922, -0.9701637 , -0.99979417])
Coordinates:
  * lon      (lon) float64 60.0 80.0 100.0 120.0
```

This works even for slices that cross the dateline:

```python
world.sel(lon=slice(160, 210), method="nearest")

<xarray.DataArray (lon: 4)>
array([-0.85218366, -0.68526211,  0.68526211,  0.85218366])
Coordinates:
  * lon      (lon) float64 160.0 180.0 -180.0 -160.0
```

This isn't general yet; there are lots of edge cases it would fail on. But I think it shows that, as long as each case is handled, we can always use this approach to remap labels back to index values that do lie within the range. What do people think?
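One possible way to cover more of the slice cases at once is to wrap both endpoints first and then compare them: if the wrapped start is no larger than the wrapped stop, the slice does not cross the boundary; otherwise it does, and the positions are the tail of the index followed by its head. The following is only a sketch of that idea in plain Python, using `bisect` on a sorted list instead of xarray internals. The function name `periodic_slice_positions` and the 20-degree grid are assumptions made for illustration, and it assumes `start <= stop` with no step.

```python
from bisect import bisect_left, bisect_right


def periodic_slice_positions(coords, start, stop, period=360.0):
    """Return integer positions for labels in [start, stop] on a periodic axis.

    Hypothetical helper: `coords` must be sorted ascending and span at most
    one period. The result is a list suitable for positional indexing.
    """
    if stop < start:
        raise ValueError("this sketch assumes start <= stop")
    if stop - start >= period:
        # The request covers the whole circle
        return list(range(len(coords)))

    lower = coords[0]
    w_start = lower + (start - lower) % period
    w_stop = lower + (stop - lower) % period

    if w_start <= w_stop:
        # No boundary crossing: one contiguous run of positions
        return list(range(bisect_left(coords, w_start), bisect_right(coords, w_stop)))

    # Boundary crossing: tail of the axis, then its head
    first = range(bisect_left(coords, w_start), len(coords))
    second = range(0, bisect_right(coords, w_stop))
    return list(first) + list(second)


coords = [-180.0 + 20.0 * i for i in range(19)]  # -180, -160, ..., 180
positions = periodic_slice_positions(coords, 160, 210)
print([coords[i] for i in positions])  # [160.0, 180.0, -180.0, -160.0]
```

Because this grid contains both -180 and 180, a crossing slice returns both copies of the boundary point, which matches the dateline output shown earlier. Whether that duplication is desirable is a separate design question.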

EDIT:

> One could split it into two calls to isel and concatenate the result. Not sure if that's possible with the given interface.

I believe what I've done here is the closest thing to that which the given interface allows.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1372035441