issue_comments


8 rows where issue = 494906646 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
584975960 https://github.com/pydata/xarray/issues/3315#issuecomment-584975960 https://api.github.com/repos/pydata/xarray/issues/3315 MDEyOklzc3VlQ29tbWVudDU4NDk3NTk2MA== friedrichknuth 10554254 2020-02-12T01:46:00Z 2020-02-12T01:46:00Z NONE

Few observations after looking at the default flags for concat:

    xr.concat(
        objs,
        dim,
        data_vars='all',
        coords='different',
        compat='equals',
        positions=None,
        fill_value=<NA>,
        join='outer',
    )

The description of compat='equals' indicates combining DataArrays with different names should fail: 'equals': all values and dimensions must be the same. (though I am not entirely sure what is meant by values... I assume this perhaps generically means keys?)

Another option is compat='identical' which is described as: 'identical': all values, dimensions and attributes must be the same. Using this flag will cause the operation to fail, as one would expect from the description...

```python
objs = [xr.DataArray([0], dims='x', name='a'), xr.DataArray([1], dims='x', name='b')]

xr.concat(objs, dim='x', compat='identical')
```

    ValueError: array names not identical

... and is the case for concat on Datasets, as previously shown by @TomNicholas

```
objs = [xr.Dataset({'a': ('x', [0])}), xr.Dataset({'b': ('x', [0])})]

xr.concat(objs, dim='x')
```

    ValueError: 'a' is not present in all datasets.

However, 'identical': all values, dimensions and **attributes** must be the same. doesn't quite seem to be the case for DataArrays, as

```python
objs = [xr.DataArray([0], dims='x', name='a', attrs={'foo': 1}), xr.DataArray([1], dims='x', name='a', attrs={'bar': 2})]

xr.concat(objs, dim='x', compat='identical')
```

succeeds with

    <xarray.DataArray 'a' (x: 2)>
    array([0, 1])
    Dimensions without coordinates: x
    Attributes:
        foo:      1

but again fails on Datasets, as one would expect from the description.

```python
ds1 = xr.Dataset({'a': ('x', [0])})
ds1.attrs['foo'] = 'example attribute'

ds2 = xr.Dataset({'a': ('x', [1])})
ds2.attrs['bar'] = 'example attribute'

objs = [ds1, ds2]
xr.concat(objs, dim='x', compat='identical')
```

    ValueError: Dataset global attributes not equal.

Also had a look at compat='override', which will override an attrs inconsistency but not a naming one when applied to Datasets. Works as expected on DataArrays. It is described as 'override': skip comparing and pick variable from first dataset.

Potential resolutions:

  1. 'identical' should raise an error when attributes are not the same for DataArrays

  2. 'equals' should raise an error when DataArray names are not identical (unless one is None, which works with Datasets and seems fine to be replaced)

  3. 'override' should override naming inconsistencies when combining DataSets.

Final thought: perhaps promoting to Dataset when all requirements are met for a DataArray to be considered as such, might simplify keeping operations and checks consistent?
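To make resolutions 2 and 3 concrete, here is a minimal sketch of the name rule they describe, written in plain Python so it does not depend on any xarray version. The helper `resolve_name` is hypothetical and is not part of xarray; it only models how names might be reconciled: with `'override'` the first name wins, otherwise all non-`None` names must agree.

```python
def resolve_name(names, compat="equals"):
    """Pick the name for a concatenated result (hypothetical helper, not xarray API).

    `names` is the list of input names in order; `None` means "unnamed".
    """
    if compat == "override":
        # 'override': skip comparing and take the name of the first object.
        return names[0]
    # Unnamed inputs are ignored, so a None name can be replaced by a real one.
    named = {name for name in names if name is not None}
    if len(named) > 1:
        raise ValueError(f"array names differ: {sorted(named)}")
    return named.pop() if named else None


print(resolve_name(["a", None]))                  # a
print(resolve_name(["a", "b"], compat="override"))  # a
```

Under this rule, `resolve_name(["a", "b"])` raises a `ValueError` instead of silently keeping `'a'`, which is the behaviour resolution 2 asks for.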

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.combine_nested() fails when passed nested DataSets 494906646
535061773 https://github.com/pydata/xarray/issues/3315#issuecomment-535061773 https://api.github.com/repos/pydata/xarray/issues/3315 MDEyOklzc3VlQ29tbWVudDUzNTA2MTc3Mw== TomNicholas 35968931 2019-09-25T14:53:34Z 2019-09-25T15:00:27Z MEMBER

Really? Okay, so that means that currently we don't treat a named DataArray and a single-variable Dataset as if they are the same. For example I would have expected these two operations to give the same result:

```python
objs = [DataArray([0], dims='x', name='a'), DataArray([0], dims='x', name='b')]
concat(objs, dim='x')
<xarray.DataArray 'a' (x: 2)>
array([0, 0])
Dimensions without coordinates: x
```

```python
objs = [Dataset({'a': ('x', [0])}), Dataset({'b': ('x', [0])})]
concat(objs, dim='x')
```

```
self = Frozen(OrderedDict([('b', <xarray.Variable (x: 1)>
array([0]))])), key = 'a'

    def __getitem__(self, key: K) -> V:
        return self.mapping[key]

E   KeyError: 'a'

xarray/core/utils.py:385: KeyError
```

Is this what we want to do? Surely the first one should also fail, else this is counter-intuitive. I think of a named DataArray and a single-variable Dataset as being the same thing, just a single physical variable? @shoyer am I misunderstanding xarray's data model here?
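The mental model in question can be written down as a toy, assuming a Dataset is nothing more than a mapping from variable name to values. This is plain Python, not xarray, and `to_dataset` / `concat_datasets` here are hypothetical stand-ins: once a named array is promoted to a one-variable mapping, concatenating `'a'` with `'b'` has to fail for the same reason the Dataset case does.

```python
def to_dataset(name, values):
    """Promote a named 1-D array to a one-variable 'dataset' (toy model)."""
    return {name: list(values)}


def concat_datasets(datasets):
    """Concatenate toy datasets variable by variable; all must share the same names."""
    expected = set(datasets[0])
    for ds in datasets[1:]:
        if set(ds) != expected:
            mismatched = sorted(expected ^ set(ds))
            raise ValueError(f"variables not present in all datasets: {mismatched}")
    return {
        name: [value for ds in datasets for value in ds[name]]
        for name in datasets[0]
    }


# Same name: the promoted arrays concatenate cleanly.
print(concat_datasets([to_dataset("a", [0]), to_dataset("a", [0])]))  # {'a': [0, 0]}
```

Calling `concat_datasets([to_dataset("a", [0]), to_dataset("b", [0])])` raises a `ValueError` in this model, which is the consistency being asked about here.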

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.combine_nested() fails when passed nested DataSets 494906646
535052060 https://github.com/pydata/xarray/issues/3315#issuecomment-535052060 https://api.github.com/repos/pydata/xarray/issues/3315 MDEyOklzc3VlQ29tbWVudDUzNTA1MjA2MA== dcherian 2448579 2019-09-25T14:32:48Z 2019-09-25T14:32:48Z MEMBER

concat ignores DataArray.name. I don't know if we should consider it a bug or a feature :)

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
  xr.combine_nested() fails when passed nested DataSets 494906646
535010456 https://github.com/pydata/xarray/issues/3315#issuecomment-535010456 https://api.github.com/repos/pydata/xarray/issues/3315 MDEyOklzc3VlQ29tbWVudDUzNTAxMDQ1Ng== TomNicholas 35968931 2019-09-25T13:00:13Z 2019-09-25T13:00:13Z MEMBER

Okay something has definitely gone wrong here.

My intention with that test was to check that the order of operations doesn't matter, but you're right that the test as written makes no sense. It would probably be a good idea to remove this test and check that property correctly by adding a second assert to the (poorly-named) test_auto_combine_2d:

```python
# Prove it works symmetrically
datasets = [[ds(0), ds(3)], [ds(1), ds(4)], [ds(2), ds(5)]]
result = combine_nested(datasets, concat_dim=["dim2", "dim1"])
assert_equal(result, expected)
```

(This passes fine)
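For readers wondering why that nesting counts as the symmetric case, here is a small plain-Python sketch. It assumes the existing assert in the test uses the grid `[[ds(0), ds(1), ds(2)], [ds(3), ds(4), ds(5)]]` with `concat_dim=["dim1", "dim2"]` (an assumption, the original is not quoted in this thread), and it uses integers as stand-ins for the datasets. `place_tiles` is a hypothetical helper that only tracks where each tile ends up, not the data.

```python
def place_tiles(nested, concat_dim):
    """Map each tile to its (dim1, dim2) grid position.

    The outer list runs along concat_dim[0], the inner lists along concat_dim[1].
    """
    outer, inner = concat_dim
    positions = {}
    for i, row in enumerate(nested):
        for j, tile in enumerate(row):
            index = {outer: i, inner: j}
            positions[tile] = (index["dim1"], index["dim2"])
    return positions


original = [[0, 1, 2], [3, 4, 5]]                   # assumed original nesting
transposed = [list(col) for col in zip(*original)]  # [[0, 3], [1, 4], [2, 5]]

same = place_tiles(original, ["dim1", "dim2"]) == place_tiles(transposed, ["dim2", "dim1"])
print(transposed, same)
```

Transposing the nesting and swapping the order of `concat_dim` puts every tile in the same grid cell, so both calls should produce the same combined result.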

However, that still leaves the question of why is this nonsensical test passing?

I think it's because concat is not failing when it should - that test boils down to calling concat on those DataArrays (called from _combine_1d internally). Surely concat should fail when you ask it to do this, because how can you concatenate two different variables?

```python
da1 = DataArray(name="a", data=[[0]], dims=["x", "y"])
da2 = DataArray(name="b", data=[[1]], dims=["x", "y"])

result = concat([da1, da2], dim="x")
```

However it doesn't fail, instead it gives this!:

```
<xarray.DataArray 'a' (x: 2, y: 1)>
array([[0],
       [1]])
Dimensions without coordinates: x, y
```

Where has `'b'` gone?! This is the reason that `test_concat_name_symmetry` gives such a weird result.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.combine_nested() fails when passed nested DataSets 494906646
532794909 https://github.com/pydata/xarray/issues/3315#issuecomment-532794909 https://api.github.com/repos/pydata/xarray/issues/3315 MDEyOklzc3VlQ29tbWVudDUzMjc5NDkwOQ== TomNicholas 35968931 2019-09-18T17:53:43Z 2019-09-18T17:53:43Z MEMBER

Hmm I can look at this properly at the weekend but in the meantime the logic was motivated by discussion in #2777. If the test doesn't make sense in that context then it's not right.


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.combine_nested() fails when passed nested DataSets 494906646
532780805 https://github.com/pydata/xarray/issues/3315#issuecomment-532780805 https://api.github.com/repos/pydata/xarray/issues/3315 MDEyOklzc3VlQ29tbWVudDUzMjc4MDgwNQ== dcherian 2448579 2019-09-18T17:16:36Z 2019-09-18T17:16:36Z MEMBER

Yes/

https://github.com/pydata/xarray/blob/fddced063b7ecbea6254dc1008bb4db15a5d9304/xarray/tests/test_combine.py#L467-L478

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.combine_nested() fails when passed nested DataSets 494906646
532778982 https://github.com/pydata/xarray/issues/3315#issuecomment-532778982 https://api.github.com/repos/pydata/xarray/issues/3315 MDEyOklzc3VlQ29tbWVudDUzMjc3ODk4Mg== TomNicholas 35968931 2019-09-18T17:11:42Z 2019-09-18T17:11:42Z MEMBER

Sorry when you say expected result are you referring to a particular unit test?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.combine_nested() fails when passed nested DataSets 494906646
532777471 https://github.com/pydata/xarray/issues/3315#issuecomment-532777471 https://api.github.com/repos/pydata/xarray/issues/3315 MDEyOklzc3VlQ29tbWVudDUzMjc3NzQ3MQ== dcherian 2448579 2019-09-18T17:07:39Z 2019-09-18T17:07:39Z MEMBER

This honestly makes no sense to me.

    da1 = xr.DataArray(name="a", data=[[0]], dims=["x", "y"])
    da2 = xr.DataArray(name="b", data=[[1]], dims=["x", "y"])
    da3 = xr.DataArray(name="a", data=[[2]], dims=["x", "y"])
    da4 = xr.DataArray(name="b", data=[[3]], dims=["x", "y"])
    xr.combine_nested([[da1, da2], [da3, da4]], concat_dim=["x", "y"])

These are dataarrays with two different names. Why is this the expected result?

    <xarray.DataArray 'a' (x: 2, y: 2)>
    array([[0, 1],
           [2, 3]])
    Dimensions without coordinates: x, y

That error arises because it's trying to concatenate data_vars a and b but there are datasets that don't have a. If you set those DataArrays to have the same name, this will work.

```
da1 = xr.DataArray(name="a", data=[[0]], dims=["x", "y"])
da2 = xr.DataArray(name="a", data=[[1]], dims=["x", "y"])
da3 = xr.DataArray(name="a", data=[[2]], dims=["x", "y"])
da4 = xr.DataArray(name="a", data=[[3]], dims=["x", "y"])

ds1 = da1.to_dataset()
ds2 = da2.to_dataset()
ds3 = da3.to_dataset()
ds4 = da4.to_dataset()
xr.combine_nested([[ds1, ds2], [ds3, ds4]], concat_dim=["x", "y"])
```

    <xarray.Dataset>
    Dimensions:  (x: 2, y: 2)
    Dimensions without coordinates: x, y
    Data variables:
        a        (x, y) int64 0 1 2 3

ping @TomNicholas

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xr.combine_nested() fails when passed nested DataSets 494906646


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);