github: issue_comments: 6 rows where issue = 1175329407 sorted by updated

6 rows where issue = 1175329407 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	issue
1290454937	https://github.com/pydata/xarray/issues/6392#issuecomment-1290454937	https://api.github.com/repos/pydata/xarray/issues/6392	IC_kwDOAMm_X85M6seZ	benbovy 4160723	2022-10-25T12:19:52Z	2022-10-25T12:19:52Z	MEMBER	I'm thinking of only accepting one or more instances of `Indexes` as the `indexes` argument in the Dataset and DataArray constructors. The only exception is when `fastpath=True`, in which case a mapping can be given directly. It is much easier to handle: just check that the keys returned by `Indexes.variables` do not conflict with the coordinate names in the `coords` argument. It is slightly safer: it requires the user to explicitly create an `Indexes` object, so there is less chance of accidentally providing coordinate variables and index objects that do not relate to each other (we could probably add some safeguards in the `Indexes` class itself). It is more convenient: an Xarray `Index` may provide a factory method that returns an instance of `Indexes` that we just need to pass as `indexes`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes to the Dataset and DataArray constructors 1175329407
1260618693	https://github.com/pydata/xarray/issues/6392#issuecomment-1260618693	https://api.github.com/repos/pydata/xarray/issues/6392	IC_kwDOAMm_X85LI4PF	benbovy 4160723	2022-09-28T09:13:00Z	2022-09-28T12:52:01Z	MEMBER	How would we handle creating xarray objects from pandas objects where they have a multiindex? For `pandas.Series` / `pandas.DataFrame` objects, `DataArray.from_series()` / `Dataset.from_dataframe()` already expand multi-index levels as dimensions. For a `pandas.MultiIndex`, we could do like below but it is a bit tedious: ```python import pandas as pd import xarray as xr from xarray.indexes import PandasMultiIndex pd_idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar")) idx = PandasMultiIndex(pd_idx, "x") indexes = {"x": idx, "foo": idx, "bar": idx} coords = idx.create_variables() ds = xr.Dataset(coords=coords, indexes=indexes) ``` For more convenience, we could add a class method to `PandasMultiIndex`, e.g., ```python this calls PandasMultiIndex.__init__() and PandasMultiIndex.create_variables() internally indexes, coords = PandasMultiIndex.from_pandas_index(pd_idx, "x") ds = xr.Dataset(coords=coords, indexes=indexes) ``` Instead of `indexes, coords` raw dictionaries, we could return an instance of the Indexes class (also returned by `Dataset.xindexes`), which encapsulates the coordinate variables: ```python xmidx = PandasMultiIndex.from_pandas_index(pd_idx, "x") ds = xr.Dataset(coords=xmidx.variables, indexes=xmidx) ``` For even more convenience, I think it might be reasonable to support special handling of `Indexes` instances given in Dataset / DataArray constructors and in `.update()`, i.e., ```python both cases below will implicitly add the coordinates found in `xmidx` (if there's no conflict with other coordinates) ds = xr.Dataset(indexes=xmidx) ds2 = xr.Dataset() ds2.update(xmidx) ``` The same approach could be used for `pandas.IntervalIndex` (as discussed in #4579).	
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes to the Dataset and DataArray constructors 1175329407
1082497324	https://github.com/pydata/xarray/issues/6392#issuecomment-1082497324	https://api.github.com/repos/pydata/xarray/issues/6392	IC_kwDOAMm_X85AhZks	max-sixty 5635139	2022-03-30T00:32:48Z	2022-03-30T00:32:48Z	MEMBER	Thanks for the thoughtful reply @benbovy (This is a level down and you can make a decision later, so fine if you prefer to push the discussion.) How would we handle creating xarray objects from pandas objects where they have a multiindex? To what extent do you think this is the "standard case" and we could default to it? `python idx = xr.PandasMultiIndex(pd_idx, "x") indexes = {"x": idx, "foo": idx, "bar": idx}`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes to the Dataset and DataArray constructors 1175329407
1080738079	https://github.com/pydata/xarray/issues/6392#issuecomment-1080738079	https://api.github.com/repos/pydata/xarray/issues/6392	IC_kwDOAMm_X85AasEf	benbovy 4160723	2022-03-28T14:38:13Z	2022-03-28T14:38:13Z	MEMBER	What's the rationale for deprecating this? I think my experience with users of xarray is mostly those coming from pandas; for them interop is quite important. Yes I agree that interoperability with pandas is important. Providing pandas (multi-)indexes via `coords` is convenient and worked pretty well so far because (1) indexes and dimension coordinates were not clearly distinct concepts and (2) multi-index levels were not "real" coordinates. However, this is not the case anymore. Now that indexes are really distinct from coordinates, I'd rather expect the following behavior for the case of pandas multi-index: ```python pd_idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar")) converting a pandas multi-index to a numpy array returns level values as tuples np.array(pd_idx) array([('a', 1), ('a', 2), ('b', 1), ('b', 2)], dtype=object) simply passing the index as a coordinate would treat it as an array-like, i.e., like numpy does xr.Dataset(coords={"x": pd_idx}) <xarray.Dataset> Dimensions: (x: 4) Coordinates: * x (x) object ('a', 1) ('a', 2) ('b', 1) ('b', 2) Data variables: empty ``` In this specific case, I'd favor consistency with how Numpy handles Pandas indexes over more convenient interoperability with Pandas. The array of tuple elements is not very useful, though. There should be ways to create Xarray objects with Pandas indexes, but I think it's better if we eventually pass them via `indexes` instead of via `coords`, or via both `indexes` and `coords` even if that's slightly less convenient. More generally, I don't know how the ecosystem will evolve in the future (how many custom Xarray indexes?). 
I wonder to what extent Xarray's API should support special cases for Pandas (multi-)indexes compared to other kinds of indexes.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes to the Dataset and DataArray constructors 1175329407
1080007416	https://github.com/pydata/xarray/issues/6392#issuecomment-1080007416	https://api.github.com/repos/pydata/xarray/issues/6392	IC_kwDOAMm_X85AX5r4	max-sixty 5635139	2022-03-27T19:54:44Z	2022-03-27T19:54:44Z	MEMBER	I realize there's a lot here and I've been out of this thread for a bit, so please forgive any naive questions! I would suggest deprecating this behavior in favor of a more explicit (although more verbose) way to pass an existing pandas multi-index: What's the rationale for deprecating this? I think my experience with users of xarray is mostly those coming from pandas; for them interop is quite important. If there's a canonical way of transforming the index, it would be friendlier to do that automatically. ```python import pandas as pd import xarray as xr pd_idx = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=("foo", "bar")) idx = pd_idx ds = xr.Dataset(coords={"x": idx}) ``` i.e. ``` ds = xr.Dataset(coords=coords) ValueError: missing index(es) for coordinate(s): 'x', 'foo', 'bar' or create unindexed coordinates 'foo' and 'bar' and a 'x' coordinate with a single pandas index ``` I would have expected the latter, both for `coords=coords` and for `coords=pd_idx` (again, with the disclaimer that I may be missing crucial parts of the puzzle here). Should we silently reorder the coordinates and/or indexes when the levels are not passed in the right order? It seems odd requiring mapping elements be passed in a given order. 👍	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes to the Dataset and DataArray constructors 1175329407
1079981685	https://github.com/pydata/xarray/issues/6392#issuecomment-1079981685	https://api.github.com/repos/pydata/xarray/issues/6392	IC_kwDOAMm_X85AXzZ1	keewis 14808389	2022-03-27T17:39:59Z	2022-03-27T17:39:59Z	MEMBER	I wonder if it would help to have a custom type that, unlike `tuple`, is invalid for coordinates / data variables, but allows reducing the redundancy? E.g. `python indexes = {xr.combined("lat", "lon"): idx, xr.combined("z", "x", "y"): multi_index}` This would be immediately normalized to: `python indexes = {"lat": idx, "lon": idx, "z": multi_index, "x": multi_index, "y": multi_index}`	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Pass indexes to the Dataset and DataArray constructors 1175329407
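
The key normalization suggested in the last comment can be sketched in plain Python. This is only an illustration under stated assumptions: `Combined`, `combined()` and `normalize_indexes()` are hypothetical names invented here (no `xr.combined` exists in xarray as far as the thread shows), and the index objects are placeholders.

```python
from dataclasses import dataclass
from typing import Any, Hashable, Mapping, Tuple


@dataclass(frozen=True)
class Combined:
    """Hypothetical key type: a group of coordinate names sharing one index.

    It is deliberately not a tuple subclass, so it cannot be mistaken for a
    tuple used elsewhere as a coordinate or data variable value.
    """

    names: Tuple[Hashable, ...]


def combined(*names: Hashable) -> Combined:
    """Hypothetical helper mirroring the `xr.combined(...)` idea."""
    return Combined(names)


def normalize_indexes(indexes: Mapping[Any, Any]) -> dict:
    """Expand every Combined key into one entry per coordinate name."""
    normalized: dict = {}
    for key, index in indexes.items():
        names = key.names if isinstance(key, Combined) else (key,)
        for name in names:
            if name in normalized:
                raise ValueError(f"coordinate {name!r} has more than one index")
            normalized[name] = index
    return normalized


# Placeholder objects standing in for real index instances.
idx, multi_index = object(), object()
indexes = normalize_indexes(
    {combined("lat", "lon"): idx, combined("z", "x", "y"): multi_index}
)
print(list(indexes))  # ['lat', 'lon', 'z', 'x', 'y']
```

Every name in a group maps to the same index object, which matches the expanded form `{"lat": idx, "lon": idx, ...}` given in the comment. Raising on a repeated name is a design choice of this sketch, not something the thread specifies.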


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
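
The view shown on this page (rows where `issue = 1175329407`, sorted by `updated_at` descending) can be reproduced against this schema. The following is a minimal sketch using Python's standard `sqlite3` module with an in-memory database. It assumes a reduced column set and omits the `REFERENCES` clauses, and it loads only three of the six rows, using ids and timestamps taken from the table above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the schema above (assumption: only the columns the
# query needs, foreign-key references left out).
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [issue] INTEGER
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    """
)

# Three of the six comments listed above: (id, user, updated_at, issue).
conn.executemany(
    "INSERT INTO issue_comments (id, user, updated_at, issue) VALUES (?, ?, ?, ?)",
    [
        (1079981685, 14808389, "2022-03-27T17:39:59Z", 1175329407),
        (1290454937, 4160723, "2022-10-25T12:19:52Z", 1175329407),
        (1260618693, 4160723, "2022-09-28T12:52:01Z", 1175329407),
    ],
)

# ISO 8601 timestamps stored as TEXT sort chronologically as plain strings.
rows = conn.execute(
    "SELECT id, updated_at FROM issue_comments "
    "WHERE issue = ? ORDER BY updated_at DESC",
    (1175329407,),
).fetchall()

for comment_id, updated_at in rows:
    print(comment_id, updated_at)
```

The `idx_issue_comments_issue` index is what lets the `WHERE issue = ?` filter avoid a full table scan on a large comments table.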
