issue_comments: 1260618693


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/6392#issuecomment-1260618693 https://api.github.com/repos/pydata/xarray/issues/6392 1260618693 IC_kwDOAMm_X85LI4PF 4160723 2022-09-28T09:13:00Z 2022-09-28T12:52:01Z MEMBER

> How would we handle creating xarray objects from pandas objects where they have a multiindex?

For pandas.Series / pandas.DataFrame objects, DataArray.from_series() / Dataset.from_dataframe() already expand multi-index levels as dimensions.
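
To illustrate that existing behavior, here is a small sketch. The sample values and the name "temperature" are made up for the example; only `DataArray.from_series()` and `Dataset.from_dataframe()` come from the discussion above.

```python
import pandas as pd
import xarray as xr

# A Series indexed by a two-level MultiIndex (sample data, for illustration only).
pd_idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar"))
series = pd.Series([10, 20, 30, 40], index=pd_idx, name="temperature")

# Each multi-index level becomes its own dimension.
da = xr.DataArray.from_series(series)
print(da.dims)   # ('foo', 'bar')
print(da.shape)  # (2, 2)

# The same expansion happens for a DataFrame.
ds = xr.Dataset.from_dataframe(series.to_frame())
print(set(ds.dims) == {"foo", "bar"})  # True
```

Note that this path never keeps the multi-index itself: the levels are unstacked into separate dimensions, which is why a plain pandas.MultiIndex needs the different treatment discussed below.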

For a pandas.MultiIndex, we could do the following, but it is a bit tedious:

```python
import pandas as pd
import xarray as xr
from xarray.indexes import PandasMultiIndex

pd_idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar"))
idx = PandasMultiIndex(pd_idx, "x")

indexes = {"x": idx, "foo": idx, "bar": idx}
coords = idx.create_variables()

ds = xr.Dataset(coords=coords, indexes=indexes)
```

For more convenience, we could add a class method to PandasMultiIndex, e.g.,

```python
# this calls PandasMultiIndex.__init__() and PandasMultiIndex.create_variables() internally
indexes, coords = PandasMultiIndex.from_pandas_index(pd_idx, "x")

ds = xr.Dataset(coords=coords, indexes=indexes)
```
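
As a rough sketch of what such a class method could do internally, the two calls can be wrapped in a standalone helper. The function name `multiindex_to_indexes_and_coords` is hypothetical and only stands in for the proposed method; it relies solely on the `PandasMultiIndex` constructor and `create_variables()` already used above.

```python
import pandas as pd
from xarray.indexes import PandasMultiIndex


def multiindex_to_indexes_and_coords(pd_idx, dim):
    """Hypothetical helper: build the (indexes, coords) pair for a pandas.MultiIndex."""
    idx = PandasMultiIndex(pd_idx, dim)
    # One coordinate variable for the dimension itself plus one per level.
    coords = idx.create_variables()
    # Every coordinate created from the multi-index shares the same index object.
    indexes = {name: idx for name in coords}
    return indexes, coords


pd_idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar"))
indexes, coords = multiindex_to_indexes_and_coords(pd_idx, "x")
print(sorted(indexes))
print(sorted(coords))
```

Deriving the `indexes` keys from the created variables avoids spelling out the dimension and level names by hand, which is the tedious part of the manual version.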

Instead of raw indexes and coords dictionaries, we could return an instance of the Indexes class (the type also returned by Dataset.xindexes), which encapsulates the coordinate variables:

```python
xmidx = PandasMultiIndex.from_pandas_index(pd_idx, "x")

ds = xr.Dataset(coords=xmidx.variables, indexes=xmidx)
```

For even more convenience, I think it might be reasonable to support special handling of Indexes instances given in Dataset / DataArray constructors and in .update(), i.e.,

```python
# both cases below will implicitly add the coordinates found in xmidx
# (if there's no conflict with other coordinates)
ds = xr.Dataset(indexes=xmidx)

ds2 = xr.Dataset()
ds2.update(xmidx)
```

The same approach could be used for pandas.IntervalIndex (as discussed in #4579).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1175329407