
issue_comments: 580289880


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/3731#issuecomment-580289880 https://api.github.com/repos/pydata/xarray/issues/3731 580289880 MDEyOklzc3VlQ29tbWVudDU4MDI4OTg4MA== 35968931 2020-01-30T14:52:27Z 2020-01-30T14:52:27Z MEMBER

Thanks for this, @ivirshup. I'm surprised by this too.

The problem seems to be that the DataArray you've managed to create breaks xarray's own data model! There should be one dim for each axis of the wrapped array, but

```python
import xarray as xr
import numpy as np

sample_idx = xr.IndexVariable("sample_id", ["a", "b", "c"])
# The same coordinate is passed for both axes of the 2D array
da = xr.DataArray(np.eye(3), coords=(sample_idx, sample_idx))
print(da)
```
gives a DataArray object which somehow has only one dim while wrapping a 2D array!
```
<xarray.DataArray (sample_id: 3)>
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
Coordinates:
  * sample_id  (sample_id) <U1 'a' 'b' 'c'
```
Obviously xarray should have thrown you an error before allowing you to create this. It's no wonder the indexing is weird after this point.

I would have expected to get an array with two dims, which you can do by being more explicit:
```python
da2d = xr.DataArray(np.eye(3), dims=['dim0', 'dim1'], coords=(sample_idx, sample_idx))
print(da2d)
```
```
<xarray.DataArray (dim0: 3, dim1: 3)>
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
Coordinates:
  * dim0     (dim0) <U1 'a' 'b' 'c'
  * dim1     (dim1) <U1 'a' 'b' 'c'
```
(The coordinates aren't named how you want yet, which is also a problem, but at least this has a number of dimensions consistent with the data it's wrapping.)

Indexing that object behaves more like you (and I) would expect:
```python
>>> da2d.shape
(3, 3)
>>> da2d[1, :].shape
(3,)
>>> da2d.loc["a", :].shape
(3,)
>>> da2d.loc[:, "a"].shape
(3,)
>>> da2d[:, 1]
<xarray.DataArray (dim0: 3)>
array([0., 1., 0.])
Coordinates:
  * dim0     (dim0) <U1 'a' 'b' 'c'
    dim1     <U1 'b'
```

It also doesn't fit xarray's data model to have two coordinates along different dimensions with the same name as one another. I suggest that you create two separate coords (i.e. sample_idx0 and sample_idx1), and assign them to each dim. Then you should be able to do what you want without weird behaviour.
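A minimal sketch of that suggestion, assuming you are happy to name the dimensions themselves `sample_idx0` and `sample_idx1` (the names and the use of a `coords` dict here are illustrative choices, not the only way to do it):

```python
import numpy as np
import xarray as xr

labels = ["a", "b", "c"]

# One distinctly named dimension per axis, each with its own coordinate.
# Both coordinates carry the same labels but are separate objects in the model.
da = xr.DataArray(
    np.eye(3),
    dims=["sample_idx0", "sample_idx1"],
    coords={"sample_idx0": labels, "sample_idx1": labels},
)

print(da.dims)   # ('sample_idx0', 'sample_idx1')
print(da.shape)  # (3, 3)

# Label-based selection along either axis now returns a 1D result
row = da.loc["a", :]
col = da.sel(sample_idx1="b")
print(row.shape, col.shape)  # (3,) (3,)
```

Because each axis has a unique name, `.sel()` and `.loc` can tell the two axes apart even though they share the same labels.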

(We should also fix DataArray.__init__() so that you can't construct this)
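As a rough illustration of the kind of check I have in mind, here is a standalone sketch. `check_dims_match_data` is a hypothetical helper written for this comment, not xarray's actual code, and the real fix may well look different:

```python
import numpy as np


def check_dims_match_data(data, dims):
    """Hypothetical validation: one uniquely named dim per array axis."""
    data = np.asarray(data)
    dims = tuple(dims)
    if len(dims) != data.ndim:
        raise ValueError(
            f"got {len(dims)} dimension names for data with {data.ndim} axes"
        )
    if len(set(dims)) != len(dims):
        raise ValueError(f"dimension names must be unique, got {dims}")
    return dims


# Two distinct names for a 2D array pass the check
check_dims_match_data(np.eye(3), ["dim0", "dim1"])

# Reusing the same name for both axes is rejected
try:
    check_dims_match_data(np.eye(3), ["sample_id", "sample_id"])
except ValueError as err:
    print(err)
```

The second call mirrors the case above, where the same `sample_id` coordinate is supplied for both axes of a 2D array.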

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  557257598