
issue_comments


7 rows where issue = 557257598 sorted by updated_at descending


Issue: Repeated coordinates leads to unintuitive (broken?) indexing behaviour (pydata/xarray #3731)
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1234643971 https://github.com/pydata/xarray/issues/3731#issuecomment-1234643971 https://api.github.com/repos/pydata/xarray/issues/3731 IC_kwDOAMm_X85JlywD delgadom 3698640 2022-09-01T18:36:30Z 2022-09-01T18:36:30Z CONTRIBUTOR

FWIW I could definitely see use cases for allowing something like this... I have used cumbersome/ugly workarounds to work with variance-covariance matrices etc. So I'm not weighing in on the "this should raise an error" debate. I got briefly excited when I saw it didn't raise an error, until everything started unraveling 🙃

{
    "total_count": 2,
    "+1": 1,
    "-1": 0,
    "laugh": 1,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Repeated coordinates leads to unintuitive (broken?) indexing behaviour 557257598
1234639080 https://github.com/pydata/xarray/issues/3731#issuecomment-1234639080 https://api.github.com/repos/pydata/xarray/issues/3731 IC_kwDOAMm_X85Jlxjo delgadom 3698640 2022-09-01T18:31:08Z 2022-09-01T18:31:08Z CONTRIBUTOR

ooh this is a fun one! came across this issue when we stumbled across a pedantic case writing tests (H/T @brews). I expected this to "fail loudly in the constructor" but it doesn't. note that currently AFAICT you cannot use positional slicing to achieve an intuitive result - the behavior seems more undefined/unpredictable

```python
# setup
import xarray as xr, pandas as pd, numpy as np
da = xr.DataArray(np.arange(8).reshape(2, 2, 2), coords=[[0, 1], [0, 1], ['a', 'b']], dims=["ni", "ni", "shh"])
```

xarray seems to not know it has a problem:

```python
In [4]: da
Out[4]:
<xarray.DataArray (ni: 2, shh: 2)>
array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])
Coordinates:
  * ni       (ni) int64 0 1
  * shh      (shh) <U1 'a' 'b'
```

slicing (somewhat intuitively? slices along both dims):

```python
In [5]: da.sel(ni=0)
Out[5]:
<xarray.DataArray (shh: 2)>
array([0, 1])
Coordinates:
    ni       int64 0
  * shh      (shh) <U1 'a' 'b'
```

however, positional slicing (and any attempts I've made to handle the repeated dims differently) seems to have undefined behavior:

```python
In [6]: da[0, :, :]  # positional slicing along first dim works as expected(?)
Out[6]:
<xarray.DataArray (ni: 2, shh: 2)>
array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])
Coordinates:
  * ni       (ni) int64 0 1
  * shh      (shh) <U1 'a' 'b'

In [7]: da[:, 0, :]  # positional slicing along second dim slices both dims
Out[7]:
<xarray.DataArray (shh: 2)>
array([0, 1])
Coordinates:
    ni       int64 0
  * shh      (shh) <U1 'a' 'b'
```

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Repeated coordinates leads to unintuitive (broken?) indexing behaviour 557257598
580999532 https://github.com/pydata/xarray/issues/3731#issuecomment-580999532 https://api.github.com/repos/pydata/xarray/issues/3731 MDEyOklzc3VlQ29tbWVudDU4MDk5OTUzMg== ivirshup 8238804 2020-02-01T06:22:37Z 2020-02-01T06:22:37Z NONE

This has also come up over in DimensionalData.jl, which I think is going for behavior I like. What I think would happen:

da.isel(dim='ambiguous_dim')

The selection is over all dimensions of that name.

da.mean(dim='ambiguous_dim')

The command is to reduce over dimensions of that name, the reduction should be performed over all dimensions with that name.
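The semantics proposed above can be sketched in plain NumPy. This is only an illustration of the proposal, not xarray's behaviour: the helper names `axes_named`, `isel_over_named` and `mean_over_named` are hypothetical, and `dims` is assumed to be a plain tuple of names in which a name may repeat.

```python
import numpy as np


def axes_named(dims, name):
    """Return the positions of every axis called `name` (hypothetical helper)."""
    axes = tuple(i for i, d in enumerate(dims) if d == name)
    if not axes:
        raise ValueError(f"no dimension named {name!r} in {dims}")
    return axes


def isel_over_named(data, dims, name, index):
    """Apply the same integer index to every axis called `name`."""
    axes_named(dims, name)  # validate that the name exists
    indexer = tuple(index if d == name else slice(None) for d in dims)
    remaining = tuple(d for d in dims if d != name)
    return data[indexer], remaining


def mean_over_named(data, dims, name):
    """Reduce over every axis called `name` at once."""
    axes = axes_named(dims, name)
    remaining = tuple(d for d in dims if d != name)
    return data.mean(axis=axes), remaining


# Same shape and dimension names as the (2, 2, 2) example in this thread.
data = np.arange(8).reshape(2, 2, 2)
dims = ("ni", "ni", "shh")

selected, selected_dims = isel_over_named(data, dims, "ni", 0)
averaged, averaged_dims = mean_over_named(data, dims, "ni")
```

Under these rules, selecting index 0 on `ni` picks `data[0, 0, :]`, and the mean collapses both `ni` axes, leaving only `shh` in each case.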

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Repeated coordinates leads to unintuitive (broken?) indexing behaviour 557257598
580662115 https://github.com/pydata/xarray/issues/3731#issuecomment-580662115 https://api.github.com/repos/pydata/xarray/issues/3731 MDEyOklzc3VlQ29tbWVudDU4MDY2MjExNQ== TomNicholas 35968931 2020-01-31T09:45:52Z 2020-01-31T09:46:44Z MEMBER

> Why not allow multiple dimensions with the same name? They can be disambiguated with positional indexing for when it matters.

I'm not sure it's that simple... What would you suggest the behaviour for da.isel(dim='ambiguous_dim') or da.mean(dim='ambiguous_dim') be?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Repeated coordinates leads to unintuitive (broken?) indexing behaviour 557257598
580534808 https://github.com/pydata/xarray/issues/3731#issuecomment-580534808 https://api.github.com/repos/pydata/xarray/issues/3731 MDEyOklzc3VlQ29tbWVudDU4MDUzNDgwOA== ivirshup 8238804 2020-01-31T01:07:28Z 2020-01-31T01:07:52Z NONE

Why not allow multiple dimensions with the same name? They can be disambiguated with positional indexing for when it matters. I think support for this would be useful for pairwise measures.

Here's a fun example/ current buggy behaviour:

```python
import numpy as np
import xarray as xr
from string import ascii_letters

idx1 = xr.IndexVariable("dim1", [f"dim1-{i}" for i in ascii_letters[:10]])
idx2 = xr.IndexVariable("dim2", [f"dim2-{i}" for i in ascii_letters[:5]])

da1 = xr.DataArray(np.random.random_sample((10, 5)), coords=(idx1, idx2))
da2 = xr.DataArray(np.random.random_sample((5, 10)), coords=(idx2, idx1))

da1 @ da2
# <xarray.DataArray ()>
# array(13.06261098)
```
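One plausible reading of the scalar result, assuming `@` contracts over every dimension name the two operands share: both `dim1` and `dim2` are summed away, which equals the trace of the ordinary matrix product. The NumPy sketch below only illustrates that arithmetic with stand-in arrays. It does not call xarray, and the variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((10, 5))   # stands in for da1, dims (dim1, dim2)
b = rng.random((5, 10))   # stands in for da2, dims (dim2, dim1)

# Contracting over both shared names sums over i (dim1) and j (dim2).
contract_all = np.einsum("ij,ji->", a, b)

# The ordinary matrix product keeps two axes of length 10, which would
# both be called "dim1" if repeated dimension names were allowed.
matrix_product = a @ b
```

The 10 x 10 pairwise result is the case where a repeated dimension name would naturally arise.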

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Repeated coordinates leads to unintuitive (broken?) indexing behaviour 557257598
580308274 https://github.com/pydata/xarray/issues/3731#issuecomment-580308274 https://api.github.com/repos/pydata/xarray/issues/3731 MDEyOklzc3VlQ29tbWVudDU4MDMwODI3NA== dcherian 2448579 2020-01-30T15:31:12Z 2020-01-30T15:31:12Z MEMBER

Dupe of #2226 and #1499. I agree with failing loudly in the constructor.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Repeated coordinates leads to unintuitive (broken?) indexing behaviour 557257598
580289880 https://github.com/pydata/xarray/issues/3731#issuecomment-580289880 https://api.github.com/repos/pydata/xarray/issues/3731 MDEyOklzc3VlQ29tbWVudDU4MDI4OTg4MA== TomNicholas 35968931 2020-01-30T14:52:27Z 2020-01-30T14:52:27Z MEMBER

Thanks for this @ivirshup, I'm surprised at this too.

The problem seems to be that the DataArray you've managed to create breaks xarray's own data model! There should be one dim for each axis of the wrapped array, but

```python
import xarray as xr
import numpy as np

sample_idx = xr.IndexVariable("sample_id", ["a", "b", "c"])
da = xr.DataArray(np.eye(3), coords=(sample_idx, sample_idx))
print(da)
```

gives a dataarray object which somehow has only one dim while wrapping a 2D array!

```
<xarray.DataArray (sample_id: 3)>
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
Coordinates:
  * sample_id  (sample_id) <U1 'a' 'b' 'c'
```

Obviously xarray should have thrown you an error before allowing you to create this. It's no wonder the indexing is weird after this point.

I would have expected to get an array with two dims, which you can do by being more explicit:

```python
da2d = xr.DataArray(np.eye(3), dims=['dim0', 'dim1'], coords=(sample_idx, sample_idx))
print(da2d)
```

```
<xarray.DataArray (dim0: 3, dim1: 3)>
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
Coordinates:
  * dim0     (dim0) <U1 'a' 'b' 'c'
  * dim1     (dim1) <U1 'a' 'b' 'c'
```

(the coordinates aren't named how you want yet, which is also a problem, but at least this has a number of dimensions consistent with the data it's wrapping.)

Indexing that object behaves more like you (and I) would expect:

```python
da2d.shape
# (3, 3)

da2d[1, :].shape
# (3,)

da2d.loc["a", :].shape
# (3,)

da2d.loc[:, "a"].shape
# (3,)

da2d[:, 1]
# <xarray.DataArray (dim0: 3)>
# array([0., 1., 0.])
# Coordinates:
#   * dim0     (dim0) <U1 'a' 'b' 'c'
#     dim1     <U1 'b'
```

It also doesn't fit xarray's data model to have two coordinates along different dimensions with the same name as one another. I suggest that you create two separate coords (i.e. sample_idx0 and sample_idx1), and assign them to each dim. Then you should be able to do what you want without weird behaviour.
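A minimal sketch of that suggestion, assuming the same labels are wanted along both axes. The names `sample_idx0` and `sample_idx1` come from the paragraph above. Passing `coords` as a dict keyed by dimension name is one way to wire them up, not necessarily the only one.

```python
import numpy as np
import xarray as xr

labels = ["a", "b", "c"]

# One distinct dimension (and coordinate) per axis of the wrapped 2D array.
da = xr.DataArray(
    np.eye(3),
    dims=["sample_idx0", "sample_idx1"],
    coords={"sample_idx0": labels, "sample_idx1": labels},
)

row = da.sel(sample_idx0="a")   # drops the first dim, keeps sample_idx1
col = da.sel(sample_idx1="a")   # drops the second dim, keeps sample_idx0
diag_value = float(da.sel(sample_idx0="b", sample_idx1="b"))
```

Each axis can now be addressed unambiguously by name, at the cost of carrying two coordinate names for what is conceptually one set of labels.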

(We should also fix DataArray.__init__() so that you can't construct this)
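As a rough illustration of what "failing loudly" could look like, here is a standalone check on dimension names. It is a sketch, not xarray's actual constructor code, and `check_unique_dims` is a hypothetical helper.

```python
from collections import Counter


def check_unique_dims(dims):
    """Return dims as a tuple, or raise ValueError if any name is repeated."""
    duplicates = sorted(name for name, count in Counter(dims).items() if count > 1)
    if duplicates:
        raise ValueError(f"duplicate dimension names are not allowed: {duplicates}")
    return tuple(dims)


ok_dims = check_unique_dims(["dim0", "dim1"])

try:
    check_unique_dims(["ni", "ni", "shh"])
except ValueError as err:
    message = str(err)
```

A check like this would run before any data is wrapped, so the inconsistent object is never created.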

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Repeated coordinates leads to unintuitive (broken?) indexing behaviour 557257598


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);