issue_comments: 1247731386

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/7031#issuecomment-1247731386 https://api.github.com/repos/pydata/xarray/issues/7031 1247731386 IC_kwDOAMm_X85KXt66 4160723 2022-09-15T08:03:50Z 2022-09-15T08:03:50Z MEMBER

Great @TomNicholas!

To avoid copying the body of PandasIndex.sel, couldn't you "just" do something like this?

```python
from typing import Any

from xarray.core.indexes import PandasIndex
from xarray.core.indexing import IndexSelResult


class PeriodicBoundaryIndex(PandasIndex):
    """
    An index representing any 1D periodic numberline.

    Implementation subclasses a normal xarray PandasIndex object but intercepts indexer queries.
    """

    period: float

    def __init__(self, *args, period=360, **kwargs):
        super().__init__(*args, **kwargs)
        self.period = period

    @classmethod
    def from_variables(cls, variables, options):
        # Build a plain index first, then attach the period taken from the options.
        obj = super().from_variables(variables, options={})
        obj.period = options.get("period", obj.period)
        return obj

    def _wrap_periodically(self, label_value):
        # Map the label into [index.min(), index.min() + period).
        return self.index.min() + (label_value - self.index.min()) % self.period

    def sel(
        self, labels: dict[Any, Any], method=None, tolerance=None
    ) -> IndexSelResult:
        """Remaps labels outside of the index's range back to labels inside the range."""

        assert len(labels) == 1
        coord_name, label = next(iter(labels.items()))

        if isinstance(label, slice):
            wrapped_label = slice(
                self._wrap_periodically(label.start),
                self._wrap_periodically(label.stop),
            )
        else:
            wrapped_label = self._wrap_periodically(label)

        # Note: method and tolerance are not forwarded to PandasIndex.sel here.
        return super().sel({coord_name: wrapped_label})
```

Note: I also added period as an option, which is supported in #6971 but not yet well documented. Another way to pass options is via coordinate attributes, like in this FunctionalIndex example.

It should work in most cases I think:

```python
import numpy as np
import xarray as xr

lon_coord = xr.DataArray(data=np.linspace(-180, 180, 19), dims="lon")
da = xr.DataArray(data=np.random.randn(19), dims="lon", coords={"lon": lon_coord})

# note the period set here
world = da.drop_indexes("lon").set_xindex("lon", index_cls=PeriodicBoundaryIndex, period=360)
```

```python
>>> world.sel(lon=200, method="nearest")
<xarray.DataArray ()>
array(-0.86583185)
Coordinates:
    lon      float64 -160.0

>>> world.sel(lon=[200, 200], method="nearest")
<xarray.DataArray (lon: 2)>
array([-0.86583185, -0.86583185])
Coordinates:
  * lon      (lon) float64 -160.0 -160.0

>>> world.sel(lon=slice(180, 200), method="nearest")
<xarray.DataArray (lon: 2)>
array([-1.59829997, -0.86583185])
Coordinates:
  * lon      (lon) float64 -180.0 -160.0
```

There are likely more things to do for slices, as you point out. I also don't think it's possible to pass two slices to isel. I'm not sure how this could be handled, but probably the easiest option is to raise for slices that cross the periodic boundary, like world.sel(lon=slice(170, 190)).
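A rough sketch of what raising for such slices could look like, in plain Python and independent of xarray. The helper names `wrap` and `wrap_slice` are hypothetical, and the sketch assumes both slice bounds are given (no `None`) and that the coordinate starts at `lower`:

```python
def wrap(value, lower, period):
    """Map value into the half-open interval [lower, lower + period)."""
    return lower + (value - lower) % period


def wrap_slice(label, lower, period):
    """Wrap both bounds of a label slice; raise if it crosses the boundary."""
    start = wrap(label.start, lower, period)
    stop = wrap(label.stop, lower, period)
    if start > stop:
        # After wrapping, the selection would need two disjoint pieces.
        raise ValueError(
            f"slice({label.start}, {label.stop}) crosses the periodic boundary"
        )
    return slice(start, stop)


print(wrap_slice(slice(180, 200), lower=-180, period=360))
```

With these assumptions, `slice(180, 200)` wraps to `slice(-180, -160)`, while `slice(170, 190)` wraps to a start of 170 and a stop of -170 and therefore raises. A slice covering exactly one full period collapses to a single point in this sketch and would need separate handling.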

If we really need more flexibility in sel without copying the whole body of PandasIndex.sel, we could indeed refactor PandasIndex to allow more customization in subclasses. We must be careful, though, as it may be harder to make changes without possibly breaking 3rd-party stuff.

Or like you suggest we could define some _pre_process / _post_process hooks. It's not obvious where to call those hooks, though. Before or after converting from/to Variable or DataArray? Before or after checking for slices? array or scalar? The ideal place may change from one index to another.
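To make the question of hook placement a bit more concrete, here is a toy sketch that does not use xarray at all. `ToyIndex`, `ToyPeriodicIndex` and the `_pre_process` hook are all hypothetical, and this version picks one arbitrary answer: the hook runs after the slice check, once per scalar bound.

```python
import bisect


class ToyIndex:
    """Stand-in for an index over sorted 1D labels (not xarray's PandasIndex)."""

    def __init__(self, values):
        self.values = sorted(values)

    def _pre_process(self, label):
        # Hypothetical hook: subclasses may remap a scalar label before lookup.
        return label

    def sel(self, label):
        # The hook is applied after the slice check, once per scalar bound.
        if isinstance(label, slice):
            start = self._pre_process(label.start)
            stop = self._pre_process(label.stop)
            lo = bisect.bisect_left(self.values, start)
            hi = bisect.bisect_right(self.values, stop)
            return slice(lo, hi)
        return self.values.index(self._pre_process(label))


class ToyPeriodicIndex(ToyIndex):
    def __init__(self, values, period):
        super().__init__(values)
        self.period = period

    def _pre_process(self, label):
        # Wrap the label into [first value, first value + period).
        lower = self.values[0]
        return lower + (label - lower) % self.period


idx = ToyPeriodicIndex(range(-180, 180, 20), period=360)
print(idx.sel(200), idx.sel(slice(180, 200)))
```

An index that needs to see the whole slice at once (for example to detect a boundary crossing) would want the hook before the slice check instead, which is the kind of per-index difference that makes a single hook location hard to choose.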

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1372035441