issue_comments: 444132393

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/1603#issuecomment-444132393 https://api.github.com/repos/pydata/xarray/issues/1603 444132393 MDEyOklzc3VlQ29tbWVudDQ0NDEzMjM5Mw== 4160723 2018-12-04T15:06:21Z 2018-12-04T15:19:08Z MEMBER

It occurs to me that in the case of "multiple single indexes" along the same dimension, there is no good way to use them simultaneously for indexing/reindexing.

Sorry for maybe asking this again, but I'm a bit confused now: is there any good reason to support "multiple single indexes" along the same dimension?

After all, perhaps a better default would be to set indexes (pandas.Index) only for 1-d coordinates matching dimension names, as is the case now.

If you want a different behavior, then you need to use .set_index(), which would raise if it results in multiple single indexes along a dimension. We could also add a new indexes argument to the Dataset / DataArray constructors to save some typing (and avoid the creation of in-memory pandas.Index for very long coordinates if an out-of-core alternative is later supported).
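As a small illustration of the current default described above (a sketch only; the array and coordinate names are made up for this example), only the 1-d coordinate that shares its name with a dimension ends up with a pandas.Index:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical example data: one coordinate named after its dimension,
# one 1-d coordinate with a different name.
da = xr.DataArray(
    np.zeros((2, 2)),
    dims=("one", "two"),
    coords={
        "one": ["a", "b"],                # matches the dimension name
        "two_labels": ("two", [10, 20]),  # does not match any dimension name
    },
)

# Names of the coordinates that are backed by an index.
index_names = list(da.indexes)
print(index_names)

# The index itself is a pandas.Index; "two_labels" is a plain coordinate.
has_pandas_index = isinstance(da.indexes["one"], pd.Index)
print(has_pandas_index)
print("two_labels" in da.coords, "two_labels" in da.indexes)
```

Here `two_labels` stays a coordinate without an index unless one is requested explicitly, which is the behavior I'd suggest keeping as the default.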

> da[dim_name] should return all the indexes on that dimension

I think that one big source of confusion so far has been the mixing of coordinates/variables and indexes. These are really two separate concepts, and the indexes refactoring should address that IMHO.

For example, I think that da[some_name] should never return indexes but only coordinates (and/or data variables for Dataset). That would be much simpler.

Take for example

```python
>>> import numpy as np
>>> import xarray as xr
>>> da = xr.DataArray(np.random.rand(2, 2),
...                   dims=('one', 'two'),
...                   coords={'one_labels': ('one', ['a', 'b'])})
>>> da
<xarray.DataArray (one: 2, two: 2)>
array([[ 0.536028,  0.291895],
       [ 0.682108,  0.926003]])
Coordinates:
    one_labels  (one) <U1 'a' 'b'
Dimensions without coordinates: one, two
```

I find it so weird being able to do this:

```python
>>> da['one']
<xarray.DataArray 'one' (one: 2)>
array([0, 1])
Coordinates:
    one_labels  (one) <U1 'a' 'b'
Dimensions without coordinates: one
```

Where does array([0, 1]) come from? I wouldn't have been surprised if a KeyError had been raised instead. Perhaps this specific case was initially kept for backward compatibility when the "dimensions without indexes" feature was introduced, but that was a long time ago and I'm not sure it is still necessary.

It might be a good thing to explicitly require da.set_index('one_labels') to enable indexing/alignment (edit: label indexing/alignment) along dimension one in the example above.
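For reference, a rough sketch of what the explicit step looks like with the existing API. It assumes the keyword form `set_index(one='one_labels')`, which is how `set_index` is spelled today, rather than the positional shorthand written above. It also uses fixed values instead of random ones so the selection is easy to check:

```python
import numpy as np
import xarray as xr

# Same layout as the example above, with deterministic data.
da = xr.DataArray(
    np.arange(4.0).reshape(2, 2),
    dims=("one", "two"),
    coords={"one_labels": ("one", ["a", "b"])},
)

# Explicitly build an index for dimension "one" from the "one_labels" values.
indexed = da.set_index(one="one_labels")

# Label-based selection along "one" now works.
row_b = indexed.sel(one="b")
print(row_b.values)
```

Note that with the current implementation the labels end up under the name `one` after `set_index`, so the coordinate and the index are still tied together by name, which is part of the mixing I mentioned.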

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  262642978