
issue_comments: 1080738079


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/6392#issuecomment-1080738079 https://api.github.com/repos/pydata/xarray/issues/6392 1080738079 IC_kwDOAMm_X85AasEf 4160723 2022-03-28T14:38:13Z 2022-03-28T14:38:13Z MEMBER

> What's the rationale for deprecating this? I think my experience with users of xarray is mostly those coming from pandas; for them interop is quite important.

Yes, I agree that interoperability with pandas is important. Providing pandas (multi-)indexes via coords is convenient and has worked pretty well so far because (1) indexes and dimension coordinates were not clearly distinct concepts and (2) multi-index levels were not "real" coordinates. However, this is not the case anymore.

Now that indexes are really distinct from coordinates, I'd rather expect the following behavior for a pandas multi-index:

```python
import numpy as np
import pandas as pd
import xarray as xr

pd_idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar"))

# converting a pandas multi-index to a numpy array returns the level values as tuples
np.array(pd_idx)
# array([('a', 1), ('a', 2), ('b', 1), ('b', 2)], dtype=object)

# simply passing the index as a coordinate would treat it as an array-like, i.e., like numpy does
# (expected behavior, not what xarray currently returns)
xr.Dataset(coords={"x": pd_idx})
# <xarray.Dataset>
# Dimensions:  (x: 4)
# Coordinates:
#   * x        (x) object ('a', 1) ('a', 2) ('b', 1) ('b', 2)
# Data variables:
#     *empty*
```

In this specific case, I'd favor consistency with how NumPy handles pandas indexes over more convenient interoperability with pandas. The array of tuple elements is not very useful, though. There should be ways to create xarray objects with pandas indexes, but I think it's better if we eventually pass them via indexes instead of via coords, or via both indexes and coords, even if that's slightly less convenient.
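To make the "pass it explicitly" idea concrete, here is a minimal sketch. It assumes a recent xarray release (2023.08 or later) that provides `xr.Coordinates.from_pandas_multiindex`, which did not exist when this comment was written; it is one possible shape of such an API, not necessarily what I had in mind here.

```python
import pandas as pd
import xarray as xr

pd_idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar"))

# Assumption: xarray >= 2023.08, where Coordinates.from_pandas_multiindex exists.
# It builds the dimension coordinate "x" plus one coordinate per level,
# all backed by a single multi-index.
coords = xr.Coordinates.from_pandas_multiindex(pd_idx, "x")
ds = xr.Dataset(coords=coords)

print(sorted(ds.coords))       # names of the dimension and level coordinates
print(ds.indexes["x"].names)   # level names of the underlying pandas index
```

The multi-index is wrapped explicitly before it reaches `coords`, so a bare `pd.MultiIndex` passed as a coordinate could still be treated as a plain array-like.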

More generally, I don't know how the ecosystem will evolve in the future (how many custom xarray indexes will there be?). I wonder to what extent xarray's API should support special cases for pandas (multi-)indexes compared to other kinds of indexes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1175329407