
issue_comments: 584975960


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/3315#issuecomment-584975960 https://api.github.com/repos/pydata/xarray/issues/3315 584975960 MDEyOklzc3VlQ29tbWVudDU4NDk3NTk2MA== 10554254 2020-02-12T01:46:00Z 2020-02-12T01:46:00Z NONE

A few observations after looking at the default flags for concat:

```python
xr.concat(
    objs,
    dim,
    data_vars='all',
    coords='different',
    compat='equals',
    positions=None,
    fill_value=<NA>,
    join='outer',
)
```

The description of compat='equals' suggests that combining DataArrays with different names should fail: 'equals': all values and dimensions must be the same. (Though I am not entirely sure what is meant by "values" here; I assume it perhaps refers generically to keys?)

Another option is compat='identical', which is described as: 'identical': all values, dimensions and attributes must be the same. Using this flag causes the operation to fail, as one would expect from the description:

```python
import xarray as xr

objs = [xr.DataArray([0], dims='x', name='a'), xr.DataArray([1], dims='x', name='b')]

xr.concat(objs, dim='x', compat='identical')
```

```python
ValueError: array names not identical
```

... and the same is true for concat on Datasets, as previously shown by @TomNicholas:

```
objs = [xr.Dataset({'a': ('x', [0])}), xr.Dataset({'b': ('x', [0])})]

xr.concat(objs, dim='x')
```

```python
ValueError: 'a' is not present in all datasets.
```

However, the description 'identical': all values, dimensions and **attributes** must be the same. does not quite seem to hold for DataArrays, as

```python
objs = [
    xr.DataArray([0], dims='x', name='a', attrs={'foo': 1}),
    xr.DataArray([1], dims='x', name='a', attrs={'bar': 2}),
]

xr.concat(objs, dim='x', compat='identical')
```

succeeds with

```python
<xarray.DataArray 'a' (x: 2)>
array([0, 1])
Dimensions without coordinates: x
Attributes:
    foo:      1
```

but again fails on Datasets, as one would expect from the description.

```python
ds1 = xr.Dataset({'a': ('x', [0])})
ds1.attrs['foo'] = 'example attribute'

ds2 = xr.Dataset({'a': ('x', [1])})
ds2.attrs['bar'] = 'example attribute'

objs = [ds1, ds2]
xr.concat(objs, dim='x', compat='identical')
```

```python
ValueError: Dataset global attributes not equal.
```

I also had a look at compat='override', which, when applied to Datasets, overrides an attrs inconsistency but not a naming one. It works as expected on DataArrays. It is described as: 'override': skip comparing and pick variable from first dataset.

Potential resolutions:

  1. 'identical' should raise an error when attributes are not the same for DataArrays

  2. 'equals' should raise an error when DataArray names are not identical (unless one is None, which works with Datasets and seems fine to be replaced)

  3. 'override' should override naming inconsistencies when combining Datasets.
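
The first two resolutions can be sketched without xarray at all. What follows is a minimal, hypothetical illustration: `ToyArray` and `check_compat` are stand-ins invented for this example, not xarray classes or functions, and the rules encode the behaviour proposed in the list above rather than what xr.concat currently does.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyArray:
    """Hypothetical stand-in for a named array; not an xarray class."""
    values: list
    name: Optional[str] = None
    attrs: dict = field(default_factory=dict)


def check_compat(arrays, compat="equals"):
    """Raise ValueError if the arrays violate the proposed rules.

    Assumed rules (taken from the proposals, not from xarray's code):
    - 'equals' and 'identical': names must match, ignoring unnamed arrays
    - 'identical': attrs must match as well
    - 'override': skip every check
    """
    if compat == "override":
        return
    if compat not in ("equals", "identical"):
        raise ValueError(f"unknown compat: {compat!r}")

    # Proposal 2: differing names are an error unless a name is None.
    names = {a.name for a in arrays if a.name is not None}
    if len(names) > 1:
        raise ValueError(f"array names not identical: {sorted(names)}")

    # Proposal 1: 'identical' also compares attributes.
    if compat == "identical":
        first = arrays[0].attrs
        if any(a.attrs != first for a in arrays[1:]):
            raise ValueError("array attributes not identical")


# Same name, different attrs: passes 'equals', rejected by 'identical'.
same_name = [
    ToyArray([0], name="a", attrs={"foo": 1}),
    ToyArray([1], name="a", attrs={"bar": 2}),
]
check_compat(same_name, compat="equals")
try:
    check_compat(same_name, compat="identical")
except ValueError as err:
    print(err)
```

Keeping the name check shared between 'equals' and 'identical' means the stricter mode can only add constraints, which matches how the two descriptions read.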

Final thought: perhaps promoting a DataArray to a Dataset whenever it meets the requirements to be treated as one might make it simpler to keep operations and checks consistent?
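
To make the promotion idea concrete, here is a rough sketch using plain dicts as toy "datasets". The helpers `promote`, `check_same_variables` and `concat_arrays` are hypothetical names made up for this illustration; in xarray itself the promotion step would presumably go through something like `DataArray.to_dataset()`, but that wiring is an assumption and is not shown here.

```python
def promote(name, values):
    """Wrap a named array as a one-variable mapping (a toy 'dataset')."""
    if name is None:
        raise ValueError("cannot promote an unnamed array")
    return {name: list(values)}


def check_same_variables(datasets):
    """The single shared check: every toy dataset holds the same variables."""
    expected = set(datasets[0])
    for ds in datasets[1:]:
        mismatch = expected ^ set(ds)
        if mismatch:
            raise ValueError(f"{sorted(mismatch)} not present in all datasets")


def concat_arrays(named_arrays):
    """Concatenate (name, values) pairs by promoting them first."""
    datasets = [promote(name, values) for name, values in named_arrays]
    check_same_variables(datasets)  # same code path as for real datasets
    name = next(iter(datasets[0]))
    merged = [v for ds in datasets for v in ds[name]]
    return name, merged


print(concat_arrays([("a", [0]), ("a", [1])]))
```

Because arrays go through the same variable-name check as datasets, a name mismatch fails in one place with one message, instead of each container type carrying its own rules.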

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  494906646