issues: 1812301185
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1812301185 | I_kwDOAMm_X85sBYWB | 8005 | Design for IntervalIndex | 2448579 | open | 0 | 5 | 2023-07-19T16:30:50Z | 2023-09-09T06:30:20Z | MEMBER | Is your feature request related to a problem?
We should add a wrapper for
The CF design
CF "encoding" for intervals is to use bounds variables. There is an attribute
```python
import numpy as np
import xarray as xr

# cell edges for four unit-width cells centred on 1, 2, 3, 4
left = np.arange(0.5, 3.6, 1)
right = np.arange(1.5, 4.6, 1)
bounds = np.stack([left, right])

ds = xr.Dataset(
    {"data": ("x", [1, 2, 3, 4])},
    coords={
        "x": ("x", [1, 2, 3, 4], {"bounds": "x_bounds"}),
        "x_bounds": (("bnds", "x"), bounds),
    },
)
ds
```
A fundamental problem with our current data model is that we lose
We would also like to use the "bounds" to enable interval-based indexing.
Pandas IntervalIndex
All the indexing is easy to implement by wrapping pandas.IntervalIndex, but there is one limitation.
Fundamental Question
To me, a core question is whether
Describe the solution you'd like
I've prototyped (2) (approach 1 in this notebook), following @benbovy's suggestion:
```python
import numpy as np
import pandas as pd
import xarray as xr
from xarray.indexes import PandasIndex


class XarrayIntervalIndex(PandasIndex):
    def __init__(self, index, dim, coord_dtype):
        assert isinstance(index, pd.IntervalIndex)

        # for PandasIndex
        self.index = index
        self.dim = dim
        self.coord_dtype = coord_dtype

    @classmethod
    def from_variables(cls, variables, options):
        assert len(variables) == 1
        (dim,) = tuple(variables)
        bounds = options["bounds"]
        assert isinstance(bounds, (xr.DataArray, xr.Variable))

        # the bounds axis is whichever dimension is not the indexed one
        (axis,) = bounds.get_axis_num(set(bounds.dims) - {dim})
        left, right = np.split(bounds.data, 2, axis=axis)
        index = pd.IntervalIndex.from_arrays(left.squeeze(), right.squeeze())
        coord_dtype = bounds.dtype

        return cls(index, dim, coord_dtype)

    def create_variables(self, variables):
        from xarray.core.indexing import PandasIndexingAdapter

        newvars = {self.dim: xr.Variable(self.dim, PandasIndexingAdapter(self.index))}
        return newvars

    def __repr__(self):
        string = f"Xarray{self.index!r}"
        return string

    def to_pandas_index(self):
        return self.index

    @property
    def mid(self):
        # interval midpoints
        return PandasIndex(self.index.mid, self.dim, self.coord_dtype)

    @property
    def left(self):
        # left interval edges
        return PandasIndex(self.index.left, self.dim, self.coord_dtype)

    @property
    def right(self):
        # right interval edges
        return PandasIndex(self.index.right, self.dim, self.coord_dtype)
```
Describe alternatives you've considered
I've tried some approaches in this notebook |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8005/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
13221727 | issue |
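
To illustrate the interval-based indexing that the issue body says is easy to get by wrapping pandas.IntervalIndex, here is a minimal sketch using only pandas and NumPy. It reuses the `left`/`right` edge arrays from the CF example. The choice of `closed="right"` and the variable names `intervals`, `position`, `positions`, and `midpoints` are assumptions made for illustration and are not taken from the issue.

```python
import numpy as np
import pandas as pd

# Same cell edges as the CF example: four unit-width cells centred on 1, 2, 3, 4.
left = np.arange(0.5, 3.6, 1)
right = np.arange(1.5, 4.6, 1)
bounds = np.stack([left, right])  # shape (2, 4), laid out as ("bnds", "x")

# Assumption: intervals are closed on the right, which is the pandas default.
intervals = pd.IntervalIndex.from_arrays(bounds[0], bounds[1], closed="right")

# Scalar lookup: integer position of the interval that contains 2.2.
position = intervals.get_loc(2.2)

# Vectorised lookup: -1 marks values that fall outside every interval.
positions = intervals.get_indexer([0.7, 3.5, 10.0])

# Midpoints recover the original "x" coordinate values.
midpoints = intervals.mid
```

With right-closed intervals, a value that sits exactly on an edge (such as 3.5 here) belongs to the interval ending at that edge, so the choice of `closed` matters for any lookup at a cell boundary.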