html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3315#issuecomment-584975960,https://api.github.com/repos/pydata/xarray/issues/3315,584975960,MDEyOklzc3VlQ29tbWVudDU4NDk3NTk2MA==,10554254,2020-02-12T01:46:00Z,2020-02-12T01:46:00Z,NONE,"Few observations after looking at the default flags for `concat`:

```python
xr.concat(
    objs,
    dim,
    data_vars='all',
    coords='different',
    compat='equals',
    positions=None,
    fill_value=<NA>,
    join='outer',
)
```

The description of `compat='equals'` indicates combining DataArrays with different names should fail: `'equals': all values and dimensions must be the same.` (though I am not entirely sure what is meant by `values`... I assume this perhaps generically means `keys`?)

Another option is `compat='identical'` which is described as: `'identical': all values, dimensions and attributes must be the same.` Using this flag will cause the operation to fail, as one would expect from the description...

```python
objs = [xr.DataArray([0], dims='x', name='a'), xr.DataArray([1], dims='x', name='b')]
xr.concat(objs, dim='x', compat='identical')
```
```python
ValueError: array names not identical
```

... and is the case for `concat` on Datasets, as previously shown by @TomNicholas

```
objs = [xr.Dataset({'a': ('x', [0])}), xr.Dataset({'b': ('x', [0])})]
xr.concat(objs, dim='x')
```
```python
ValueError: 'a' is not present in all datasets.
```

However, `'identical': all values, dimensions and **attributes** must be the same.` doesn't quite seem to be the case for DataArrays, as

```python
objs = [xr.DataArray([0], dims='x', name='a', attrs={'foo':1}), xr.DataArray([1], dims='x', name='a', attrs={'bar':2})]
xr.concat(objs, dim='x', compat='identical')
```
succeeds with
```python
array([0, 1])
Dimensions without coordinates: x
Attributes:
    foo:      1
```
but again fails on Datasets, as one would expect from the description.
```python
ds1 = xr.Dataset({'a': ('x', [0])})
ds1.attrs['foo'] = 'example attribute'

ds2 = xr.Dataset({'a': ('x', [1])})
ds2.attrs['bar'] = 'example attribute'

objs = [ds1,ds2]
xr.concat(objs, dim='x',compat='identical')
```
```python
ValueError: Dataset global attributes not equal.
```

Also had a look at `compat='override'`, which will override an `attrs` inconsistency but not a naming one when applied to Datasets. Works as expected on DataArrays. It is described as `'override': skip comparing and pick variable from first dataset`.

Potential resolutions:

1. `'identical'` should raise an error when attributes are not the same for DataArrays
2. `'equals'` should raise an error when DataArray names are not identical (unless one is None, which works with Datasets and seems fine to be replaced)
3. `'override'` should override naming inconsistencies when combining DataSets.

Final thought: perhaps promoting to Dataset when all requirements are met for a DataArray to be considered as such, might simplify keeping operations and checks consistent?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,494906646
https://github.com/pydata/xarray/issues/3315#issuecomment-535061773,https://api.github.com/repos/pydata/xarray/issues/3315,535061773,MDEyOklzc3VlQ29tbWVudDUzNTA2MTc3Mw==,35968931,2019-09-25T14:53:34Z,2019-09-25T15:00:27Z,MEMBER,"Really? Okay, so that means that currently we don't treat a named DataArray and a single-variable Dataset as if they are the same.
For example I would have expected these two operations to give the same result:

```python
objs = [DataArray([0], dims='x', name='a'), DataArray([0], dims='x', name='b')]
concat(objs, dim='x')
```
```
array([0, 0])
Dimensions without coordinates: x
```

```python
objs = [Dataset({'a': ('x', [0])}), Dataset({'b': ('x', [0])})]
concat(objs, dim='x')
```
```
self = Frozen(OrderedDict([('b', array([0]))])), key = 'a'

    def __getitem__(self, key: K) -> V:
>       return self.mapping[key]
E       KeyError: 'a'

xarray/core/utils.py:385: KeyError
```

Is this what we want to do? Surely the first one should also fail, else this is counter-intuitive. I think of a named DataArray and a single-variable Dataset as being the same thing, just a single physical variable?

@shoyer am I misunderstanding xarray's data model here?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,494906646
https://github.com/pydata/xarray/issues/3315#issuecomment-535052060,https://api.github.com/repos/pydata/xarray/issues/3315,535052060,MDEyOklzc3VlQ29tbWVudDUzNTA1MjA2MA==,2448579,2019-09-25T14:32:48Z,2019-09-25T14:32:48Z,MEMBER,`concat` ignores `DataArray.name`. I don't know if we should consider it a bug or a feature :),"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 1}",,494906646
https://github.com/pydata/xarray/issues/3315#issuecomment-535010456,https://api.github.com/repos/pydata/xarray/issues/3315,535010456,MDEyOklzc3VlQ29tbWVudDUzNTAxMDQ1Ng==,35968931,2019-09-25T13:00:13Z,2019-09-25T13:00:13Z,MEMBER,"Okay something has definitely gone wrong here. My intention with that test was to check that the order of operations doesn't matter, but you're right that the test as written makes no sense.
It would probably be a good idea to remove this test and check that property correctly by adding a second assert to the (poorly-named) `test_auto_combine_2d`:

```python
# Prove it works symmetrically
datasets = [[ds(0), ds(3)], [ds(1), ds(4)], [ds(2), ds(5)]]
result = combine_nested(datasets, concat_dim=[""dim2"", ""dim1""])
assert_equal(result, expected)
```
(This passes fine)

However, that still leaves the question of why is this nonsensical test passing? I think it's because `concat` is not failing when it should - that test boils down to calling `concat` on those DataArrays (called from `_combine_1d` internally). Surely concat should fail when you ask it to do this, because how can you concatenate two different variables?

```python
da1 = DataArray(name=""a"", data=[[0]], dims=[""x"", ""y""])
da2 = DataArray(name=""b"", data=[[1]], dims=[""x"", ""y""])
result = concat([da1, da2], dim=""x"")
```
However it doesn't fail, instead it gives this!:
```
array([[0],
       [1]])
Dimensions without coordinates: x, y
```

Where has `'b'` gone?! This is the reason that `test_concat_name_symmetry` gives such a weird result.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,494906646
https://github.com/pydata/xarray/issues/3315#issuecomment-532794909,https://api.github.com/repos/pydata/xarray/issues/3315,532794909,MDEyOklzc3VlQ29tbWVudDUzMjc5NDkwOQ==,35968931,2019-09-18T17:53:43Z,2019-09-18T17:53:43Z,MEMBER,"Hmm I can look at this properly at the weekend but in the meantime the logic was motivated by discussion in #2777. If the test doesn't make sense in that context then it's not right.

On Wed, 18 Sep 2019, 18:16 Deepak Cherian, wrote:

> Yes/
>
> https://github.com/pydata/xarray/blob/fddced063b7ecbea6254dc1008bb4db15a5d9304/xarray/tests/test_combine.py#L467-L478","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,494906646
https://github.com/pydata/xarray/issues/3315#issuecomment-532780805,https://api.github.com/repos/pydata/xarray/issues/3315,532780805,MDEyOklzc3VlQ29tbWVudDUzMjc4MDgwNQ==,2448579,2019-09-18T17:16:36Z,2019-09-18T17:16:36Z,MEMBER,"Yes/

https://github.com/pydata/xarray/blob/fddced063b7ecbea6254dc1008bb4db15a5d9304/xarray/tests/test_combine.py#L467-L478","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,494906646
https://github.com/pydata/xarray/issues/3315#issuecomment-532778982,https://api.github.com/repos/pydata/xarray/issues/3315,532778982,MDEyOklzc3VlQ29tbWVudDUzMjc3ODk4Mg==,35968931,2019-09-18T17:11:42Z,2019-09-18T17:11:42Z,MEMBER,"Sorry when you say expected result are you referring to a particular unit test?

On Wed, 18 Sep 2019, 18:07 Deepak Cherian, wrote:

> This honestly makes no sense to me.
>
> da1 = xr.DataArray(name=""a"", data=[[0]], dims=[""x"", ""y""])
> da2 = xr.DataArray(name=""b"", data=[[1]], dims=[""x"", ""y""])
> da3 = xr.DataArray(name=""a"", data=[[2]], dims=[""x"", ""y""])
> da4 = xr.DataArray(name=""b"", data=[[3]], dims=[""x"", ""y""])
> xr.combine_nested([[da1, da2], [da3, da4]], concat_dim=[""x"", ""y""])
>
> These are dataarrays with two different names. Why is this the expected
> result?
>
> array([[0, 1],
>        [2, 3]])
> Dimensions without coordinates: x, y
>
> That error arises because it's trying to concatenate data_vars a and b
> but there are datasets that don't have a. If you set those DataArrays to
> have the same name, this will work.
>
> da1 = xr.DataArray(name=""a"", data=[[0]], dims=[""x"", ""y""])
> da2 = xr.DataArray(name=""a"", data=[[1]], dims=[""x"", ""y""])
> da3 = xr.DataArray(name=""a"", data=[[2]], dims=[""x"", ""y""])
> da4 = xr.DataArray(name=""a"", data=[[3]], dims=[""x"", ""y""])
>
> ds1 = da1.to_dataset()
> ds2 = da2.to_dataset()
> ds3 = da3.to_dataset()
> ds4 = da4.to_dataset()
> xr.combine_nested([[ds1, ds2], [ds3, ds4]], concat_dim=[""x"", ""y""])
>
> Dimensions:  (x: 2, y: 2)
> Dimensions without coordinates: x, y
> Data variables:
>     a        (x, y) int64 0 1 2 3
>
> ping @TomNicholas","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,494906646
https://github.com/pydata/xarray/issues/3315#issuecomment-532777471,https://api.github.com/repos/pydata/xarray/issues/3315,532777471,MDEyOklzc3VlQ29tbWVudDUzMjc3NzQ3MQ==,2448579,2019-09-18T17:07:39Z,2019-09-18T17:07:39Z,MEMBER,"This honestly makes no sense to me.

```
da1 = xr.DataArray(name=""a"", data=[[0]], dims=[""x"", ""y""])
da2 = xr.DataArray(name=""b"", data=[[1]], dims=[""x"", ""y""])
da3 = xr.DataArray(name=""a"", data=[[2]], dims=[""x"", ""y""])
da4 = xr.DataArray(name=""b"", data=[[3]], dims=[""x"", ""y""])
xr.combine_nested([[da1, da2], [da3, da4]], concat_dim=[""x"", ""y""])
```

These are dataarrays with two different names. Why is this the expected result?

```
array([[0, 1],
       [2, 3]])
Dimensions without coordinates: x, y
```

That error arises because it's trying to concatenate data_vars `a` and `b` but there are datasets that don't have `a`. If you set those DataArrays to have the same name, this will work.
```
da1 = xr.DataArray(name=""a"", data=[[0]], dims=[""x"", ""y""])
da2 = xr.DataArray(name=""a"", data=[[1]], dims=[""x"", ""y""])
da3 = xr.DataArray(name=""a"", data=[[2]], dims=[""x"", ""y""])
da4 = xr.DataArray(name=""a"", data=[[3]], dims=[""x"", ""y""])

ds1 = da1.to_dataset()
ds2 = da2.to_dataset()
ds3 = da3.to_dataset()
ds4 = da4.to_dataset()
xr.combine_nested([[ds1, ds2], [ds3, ds4]], concat_dim=[""x"", ""y""])
```

```
Dimensions:  (x: 2, y: 2)
Dimensions without coordinates: x, y
Data variables:
    a        (x, y) int64 0 1 2 3
```

ping @TomNicholas","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,494906646