issue_comments: 334229444

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/1603#issuecomment-334229444 https://api.github.com/repos/pydata/xarray/issues/1603 334229444 MDEyOklzc3VlQ29tbWVudDMzNDIyOTQ0NA== 1217238 2017-10-04T17:27:44Z 2017-10-04T17:27:44Z MEMBER
  1. Use cases of the independent Index and dims: Would it be the general case that dimension and index are independent? (Or is that the case only for MultiIndex and KDTree?)

We would still assign default indexes (using a normal pandas.Index) when you assign a 1D coordinate with matching name and dimension. But in general, yes, it seems like you should be able to make an index even for variables that aren't dimensions, including for a 1D variable whose name does not match a dimension. The rule would be that any coordinates can be part of an index.
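As a rough illustration of the default rule described here, the sketch below builds a pandas.Index for every 1D coordinate whose name matches its dimension, then adds an index for a non-dimension coordinate explicitly. The `default_indexes` helper and the `(dims, values)` tuple layout are hypothetical and are not xarray API.

```python
import pandas as pd


def default_indexes(coords):
    """Hypothetical helper: coords maps name -> (dims, values)."""
    indexes = {}
    for name, (dims, values) in coords.items():
        # Default rule: only a 1D coordinate named after its own dimension
        # gets an index automatically.
        if dims == (name,):
            indexes[name] = pd.Index(values, name=name)
    return indexes


coords = {
    "x": (("x",), [10, 20, 30]),         # dimension coordinate
    "label": (("x",), ["a", "b", "c"]),  # 1D, but name does not match the dimension
}

defaults = default_indexes(coords)

# Under the proposal, "label" could still be given an index explicitly.
indexes = dict(defaults)
indexes["label"] = pd.Index(coords["label"][1], name="label")
```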

Another aspect to consider is how to handle alignment when you have indexes along non-dimension coordinates. Probably the most elegant rule would again be to check all indexed variables for exact matches.
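A minimal sketch of what an exact-match check could look like, using `pandas.Index.equals`. The function name `indexes_match_exactly` is hypothetical, and comparing only the names present on both objects is an assumption; the rule for indexes that exist on one side only is not settled here.

```python
import pandas as pd


def indexes_match_exactly(left, right):
    """Hypothetical check: left and right map coordinate name -> pandas.Index."""
    # Assumption: only indexes present on both objects are compared.
    shared = left.keys() & right.keys()
    # Each shared index must hold identical labels in identical order.
    return all(left[name].equals(right[name]) for name in shared)


a = {"x": pd.Index([1, 2, 3]), "station": pd.Index(["a", "b", "c"])}
b = {"x": pd.Index([1, 2, 3]), "station": pd.Index(["a", "b", "d"])}

same = indexes_match_exactly(a, a)
different = indexes_match_exactly(a, b)
```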

Directly assigning indexes rather than using this default or set_index() would be an advanced feature, not recommended for everyday use. The main use case is routines which create a new xarray object based on an existing one, and want to re-use old indexes.

For performance reasons, we probably do not want to actually check the values of manually assigned indexes, although we should verify that the shape matches. (We would have a clear disclaimer that if you manually assign an index with mismatched values the behavior is not well defined.)
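The trade-off above could be sketched roughly as follows: a cheap length check, and deliberately no comparison of values. `assign_index` is a hypothetical name used only for illustration.

```python
import pandas as pd


def assign_index(coord_values, index):
    """Hypothetical: accept a manually supplied index after a cheap shape check."""
    if len(index) != len(coord_values):
        raise ValueError(
            f"index has length {len(index)}, expected {len(coord_values)}"
        )
    # Deliberately no check that the index labels equal coord_values:
    # a mismatch is the caller's responsibility.
    return index


values = [10, 20, 30]

# Same length but different values: accepted without complaint.
accepted = assign_index(values, pd.Index([1, 2, 3]))

# Wrong length: rejected.
try:
    assign_index(values, pd.Index([1, 2]))
    rejected = False
except ValueError:
    rejected = True
```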

In principle, this data model would allow for two mostly equivalent indexing schemes: MultiIndex[time, space] vs two indexes Index[time] and Index[space]. We would need to figure out how to propagate and compare indexes like this. (I suppose if the coordinate values match, the result could have the union of all indexes from input arguments.)
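To make the two schemes concrete, here is a small pandas-only sketch showing that a MultiIndex over (time, space) and two separate indexes carry the same coordinate values. The sample data is made up for illustration.

```python
import pandas as pd

time = [0, 0, 1, 1]
space = ["a", "b", "a", "b"]

# Scheme 1: a single MultiIndex over both coordinates.
multi = pd.MultiIndex.from_arrays([time, space], names=["time", "space"])

# Scheme 2: two independent indexes.
time_index = pd.Index(time, name="time")
space_index = pd.Index(space, name="space")

# The level values of the MultiIndex match the standalone indexes.
same_time = multi.get_level_values("time").equals(time_index)
same_space = multi.get_level_values("space").equals(space_index)
```

Because the coordinate values agree, a comparison based on values alone cannot tell the two schemes apart, which is why a propagation rule would need to be defined.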

  2. MultiIndex implementation: In the MultiIndex case, will an xarray object store a MultiIndex object and also the level variables as Variable objects (there will be some duplicates)?

Yes, this is a little unfortunate. We could potentially make a custom wrapper for use in IndexVariable._data on the level variables that lazily computes values from the MultiIndex (similar to our LazilyIndexedArray class), but I'm not certain yet that this is necessary.
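One possible shape for such a wrapper, as a sketch only: the class name `LazyLevelValues` is hypothetical, and this is not how xarray's LazilyIndexedArray is implemented. It stores a reference to the MultiIndex and materializes a level's values only when they are requested.

```python
import numpy as np
import pandas as pd


class LazyLevelValues:
    """Hypothetical array-like that derives one level's values on demand."""

    def __init__(self, index, level):
        self._index = index
        self._level = level

    @property
    def shape(self):
        return (len(self._index),)

    def _values(self):
        # Computed from the MultiIndex each time; nothing is stored here.
        return np.asarray(self._index.get_level_values(self._level))

    def __getitem__(self, key):
        return self._values()[key]

    def __array__(self, dtype=None, copy=None):
        values = self._values()
        return values if dtype is None else values.astype(dtype)


mi = pd.MultiIndex.from_product([[0, 1], ["a", "b"]], names=["time", "space"])
lazy_time = LazyLevelValues(mi, "time")
materialized = np.asarray(lazy_time)
```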

If indexes[dim] returns multiple Variables, which realizes a MultiIndex-like structure without pd.MultiIndex, indexes would be very different from dim, because a single dimension can have multiple indexes.

Every entry in indexes should be a single pandas.Index or subclass, including MultiIndex (possibly eventually allowing for index-like objects such as something based on a KDTree).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  262642978