html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/7214#issuecomment-1295283938,https://api.github.com/repos/pydata/xarray/issues/7214,1295283938,IC_kwDOAMm_X85NNHbi,1217238,2022-10-28T17:49:10Z,2022-10-28T17:49:10Z,MEMBER,"> > Explicitly providing indexes is an advanced user feature.
>
> Agreed. However, `xr.Dataset(coords={""x"": pandas_midx})` is something that presumably a lot of users rely on (it is used extensively in Xarray's tests) and that we should really deprecate IMO. If we don't provide a convenient alternative, I expect many of those users will complain.
I agree -- we should support this for backwards compatibility (even if we deprecate it).
> > it's easier to explicitly manipulate indexes in the form of a dict
>
> While generally I also prefer handling plain `dict` objects over custom dict-like objects, here I don't see much reasons of manipulating Xarray index objects independently of their coordinate variables. `Indexes` allows keeping them tied together, and it is already returned by `.xindexes`.
>
> EDIT -- For more context: initially an `Indexes` object was almost equivalent to a `Frozen(obj._indexes)`. In #5692 I tried hard and struggled to keep dealing with separate dicts of indexes and indexed variables, but in the end it made things much easier to encapsulate the variables in `Indexes`, which is also used internally in different places.
OK, this totally makes sense.
I don't love that it is possible to express invalid states in Xarray's data model. This motivated the creation of `assert_internal_invariants` and currently mostly is a concern for Xarray's own developers, but when we exposes the `indexes` argument, it will be easier for users to make the same sort of errors.
I wonder if we should consider the broader refactor of merging the `Indexes` and `Coordinates` objects, and expose the constructor as a public API. For clarity, I'll call it `CoordinatesAndIndexes` for now, but it could likely reuse the public name of `Coordinates`.
This would have a number of benefits:
1. It's impossible to provide inconsistent `coords` and `indexes`, because there is no separate `indexes` argument.
2. Likewise, it is impossible to create inconsistent coordinates and indexes on an existing Xarray object.
3. All the logic for verifying consistent coords and indexes can go in one place, shared between Dataset/DataArray. (Yes, it would be annoying to refactor `Dataset` to merge in variables from `CoordinatesAndIndexes` rather than the current separate `Dataset._variables`)
4. The public API also becomes clearer: if users want default indexes, they can pass a dict of variables into `coords`. If they want to copy indexes from another object, they can pass in a `CoordinatesAndIndexes` object (either from another Xarray object or constructed directly).","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1422543378
https://github.com/pydata/xarray/pull/7214#issuecomment-1293909730,https://api.github.com/repos/pydata/xarray/issues/7214,1293909730,IC_kwDOAMm_X85NH37i,1217238,2022-10-27T18:28:40Z,2022-10-27T18:28:40Z,MEMBER,"> I'm thinking of only accepting one or more instances of [Indexes](https://github.com/pydata/xarray/blob/e678a1d7884a3c24dba22d41b2eef5d7fe5258e7/xarray/core/indexes.py#L1030) as indexes argument in the Dataset and DataArray constructors
I would lean against this, only because it's easier to explicitly manipulate indexes in the form of a `dict` than an `xarray.Indexes` object.
Explicitly providing indexes is an advanced user feature. I think it's OK to require users to do a bit more work in this case and to not necessarily do consistency checks (beyond verifying that the coordinate variables exist).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1422543378