

issue_comments: 444204957


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/1603#issuecomment-444204957 https://api.github.com/repos/pydata/xarray/issues/1603 444204957 MDEyOklzc3VlQ29tbWVudDQ0NDIwNDk1Nw== 1217238 2018-12-04T18:25:33Z 2018-12-04T18:25:33Z MEMBER

Sorry for maybe asking this again, but I'm a bit confused now: is there any good reason for supporting "multiple single indexes" along the same dimension?

After all, perhaps a better default would be to set indexes (pandas.Index) only for 1-d coordinates matching dimension names, as is the case now.

If you want a different behavior, then you need to use .set_index(), which would raise an error if it resulted in multiple single indexes along a dimension. We could also add a new indexes argument to the Dataset / DataArray constructors to save some typing (and to avoid creating an in-memory pandas.Index for very long coordinates if an out-of-core alternative is later supported).
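As a rough illustration of the default described above, the sketch below builds a small Dataset and checks which coordinates get an index. The variable and coordinate names (temperature, station, station_name) are made up for this example, and the proposed indexes constructor argument is not shown because it is only a suggestion in this thread.

```python
import xarray as xr

# Hypothetical example data: three stations with one measurement each.
ds = xr.Dataset(
    {"temperature": ("station", [11.0, 14.5, 9.2])},
    coords={
        # 1-d coordinate whose name matches the dimension: indexed by default.
        "station": [0, 1, 2],
        # 1-d coordinate along the same dimension under a different name:
        # not indexed by default.
        "station_name": ("station", ["alpha", "beta", "gamma"]),
    },
)

default_index_names = list(ds.indexes)

# Opting in to a different index is an explicit step.
reindexed = ds.set_index(station="station_name")
beta_temperature = float(reindexed["temperature"].sel(station="beta"))
```

Here only station is indexed until set_index() is called, after which label-based selection along station uses the former station_name values.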

I discussed this a little bit above in https://github.com/pydata/xarray/issues/1603#issuecomment-442661526, under "MultiIndex as part of the data schema".

I agree that the default behavior should still be to create automatic indexes only for 1-d coordinates matching dimension names. But we will still have (rare?) cases where "multiple single indexes" could arise from combining arguments with different indexes.

For example, suppose the station dimension has an index for station_name in one dataset and city in another. Should the result be:
- A MultiIndex with levels station_name and city? This would be most useful for future operations.
- Two individual indexes for station_name and city? This would be the cheapest result to construct.
- An error? This is arguably too strict, because there are no conflicts in either of the indexes.
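The three options can be sketched with plain pandas objects. This is only an illustration of the trade-off, not how xarray combines indexes: the helper functions and the sample station and city values are hypothetical.

```python
import pandas as pd

# Hypothetical indexes along the same "station" dimension,
# each coming from a different dataset.
station_name = pd.Index(["alpha", "beta", "gamma"], name="station_name")
city = pd.Index(["Oslo", "Lima", "Cairo"], name="city")


def combine_as_multiindex(first, second):
    """Option 1: a single MultiIndex with one level per input index."""
    return pd.MultiIndex.from_arrays(
        [first, second], names=[first.name, second.name]
    )


def combine_as_separate(first, second):
    """Option 2: keep both indexes side by side (cheapest to construct)."""
    return {first.name: first, second.name: second}


def combine_strict(first, second):
    """Option 3: refuse to combine differently named indexes."""
    raise ValueError(
        f"multiple indexes along one dimension: {first.name!r}, {second.name!r}"
    )


multi = combine_as_multiindex(station_name, city)
separate = combine_as_separate(station_name, city)

# With the MultiIndex, a lookup needs a label for every level.
position_from_multi = multi.get_loc(("beta", "Lima"))
# With separate indexes, each one can be queried on its own.
position_from_city = separate["city"].get_loc("Cairo")
```

The MultiIndex ties the two labels together for later selections, while the separate indexes avoid building any combined structure up front.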

I guess the error is probably the best idea.

Where does array([0, 1]) come from? I wouldn't have been surprised if a KeyError had been raised instead. Perhaps this specific case was initially kept for backward compatibility when the "dimensions without indexes" feature was introduced, but that was a long time ago and I'm not sure it is still necessary.

This is indeed the historical genesis, but I agree that this is confusing and we should deprecate/remove it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  262642978