
issues: 334833619


  • id: 334833619
  • node_id: MDU6SXNzdWUzMzQ4MzM2MTk=
  • number: 2245
  • title: Attributes of Dataset coordinates are dropped/replaced when adding a DataArray
  • user: 374821
  • state: open
  • locked: 0
  • comments: 4
  • created_at: 2018-06-22T10:47:53Z
  • updated_at: 2022-02-03T16:18:14Z
  • author_association: NONE

Problem description

Attributes of Dataset coordinates are dropped or replaced when adding a DataArray with dimensions or coordinates that already exist in the Dataset. In addition, adding a DataArray can change the order of the Dataset's coordinates.

Expected Behaviour

Attributes of Dataset coordinates should not be altered by adding a DataArray to the Dataset, and the order of existing coordinates should be preserved.

More details and code examples

The following code shows the behaviour when adding new data variables to a Dataset using a tuple, a DataArray (with a dimension but no coordinates), and a Variable.

```python
import numpy as np
import xarray as xr

# Dataset with three coordinates, each carrying its own 'meta' attribute.
ds = xr.Dataset(
    coords={
        'x': ('x', np.arange(10, 20), {'meta': 'foo'}),
        'y': ('y', np.arange(20, 30), {'meta': 'bar'}),
        'z': ('z', np.arange(30, 40), {'meta': 'baz'})})

print(ds, end='\n\n')
ds.info()

print('\n\n====\n')

# Add one data variable per dimension, each in a different way.
ds['a'] = 'x', np.arange(10)                     # tuple
ds['b'] = xr.DataArray(np.arange(10), dims='y')  # DataArray without coordinates
ds['c'] = xr.Variable('z', np.arange(10))        # Variable

print(ds, end='\n\n')
ds.info()
```

Output

```
<xarray.Dataset>
Dimensions:  (x: 10, y: 10, z: 10)
Coordinates:
  * x        (x) int64 10 11 12 13 14 15 16 17 18 19
  * y        (y) int64 20 21 22 23 24 25 26 27 28 29
  * z        (z) int64 30 31 32 33 34 35 36 37 38 39
Data variables:
    *empty*

xarray.Dataset {
dimensions:
    x = 10 ;
    y = 10 ;
    z = 10 ;

variables:
    int64 x(x) ;
        x:meta = foo ;
    int64 y(y) ;
        y:meta = bar ;
    int64 z(z) ;
        z:meta = baz ;

// global attributes:
}

====

<xarray.Dataset>
Dimensions:  (x: 10, y: 10, z: 10)
Coordinates:
  * y        (y) int64 20 21 22 23 24 25 26 27 28 29
  * x        (x) int64 10 11 12 13 14 15 16 17 18 19
  * z        (z) int64 30 31 32 33 34 35 36 37 38 39
Data variables:
    a        (x) int64 0 1 2 3 4 5 6 7 8 9
    b        (y) int64 0 1 2 3 4 5 6 7 8 9
    c        (z) int64 0 1 2 3 4 5 6 7 8 9

xarray.Dataset {
dimensions:
    x = 10 ;
    y = 10 ;
    z = 10 ;

variables:
    int64 y(y) ;
    int64 x(x) ;
        x:meta = foo ;
    int64 z(z) ;
        z:meta = baz ;
    int64 a(x) ;
    int64 b(y) ;
    int64 c(z) ;

// global attributes:
```

The output shows that the attributes and the order of the Dataset's coordinates are preserved (as expected) when adding data variables using a tuple or a Variable. When a DataArray is used instead, the attributes of the related coordinate are dropped (here y loses its meta attribute), and the order of the Dataset's coordinates changes (here from x, y, z to y, x, z).
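Since adding a Variable leaves the coordinate attributes intact, one possible workaround is to assign the DataArray's underlying Variable (`DataArray.variable`) instead of the DataArray itself. The following is only a sketch under that assumption, using a reduced Dataset with a single coordinate. Note that it bypasses any coordinates attached to the DataArray, so values are assigned by position and no alignment on coordinate values takes place.

```python
import numpy as np
import xarray as xr

# Reduced version of the Dataset above: one coordinate with an attribute.
ds = xr.Dataset(coords={'y': ('y', np.arange(20, 30), {'meta': 'bar'})})

# A DataArray that has the dimension 'y' but no coordinate values of its own.
b = xr.DataArray(np.arange(10), dims='y')

# Assign the underlying Variable rather than the DataArray, so that nothing
# coordinate-related from `b` takes part in the assignment.
ds['b'] = b.variable

print(ds['y'].attrs)
```

This is only appropriate when the DataArray is already known to match the Dataset's dimension sizes and ordering.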

When adding DataArrays with coordinates to the Dataset, the attributes of the affected Dataset coordinates are replaced with the attributes of the DataArray's coordinates:

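```python
# Continues from the previous example (uses np, xr and ds defined there).

# 'x' has the same coordinate values as ds.x, but different attributes.
d = xr.DataArray(
    np.arange(10),
    coords=[('x', np.arange(10, 20), {'breakfast': 'eggs'})])

# 'z' has different coordinate values than ds.z, and different attributes.
e = xr.DataArray(
    np.arange(10),
    coords=[('z', np.arange(40, 50), {'breakfast': 'spam'})])

print('d.x =', d.x, end='\n\n')
print('e.z =', e.z, end='\n\n')

ds['d'] = d
ds['e'] = e

print(ds, end='\n\n')
ds.info()
```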

Output

```
d.x = <xarray.DataArray 'x' (x: 10)>
array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
Coordinates:
  * x        (x) int64 10 11 12 13 14 15 16 17 18 19
Attributes:
    breakfast:  eggs

e.z = <xarray.DataArray 'z' (z: 10)>
array([40, 41, 42, 43, 44, 45, 46, 47, 48, 49])
Coordinates:
  * z        (z) int64 40 41 42 43 44 45 46 47 48 49
Attributes:
    breakfast:  spam

<xarray.Dataset>
Dimensions:  (x: 10, y: 10, z: 10)
Coordinates:
  * z        (z) int64 30 31 32 33 34 35 36 37 38 39
  * y        (y) int64 20 21 22 23 24 25 26 27 28 29
  * x        (x) int64 10 11 12 13 14 15 16 17 18 19
Data variables:
    a        (x) int64 0 1 2 3 4 5 6 7 8 9
    b        (y) int64 0 1 2 3 4 5 6 7 8 9
    c        (z) int64 0 1 2 3 4 5 6 7 8 9
    d        (x) int64 0 1 2 3 4 5 6 7 8 9
    e        (z) float64 nan nan nan nan nan nan nan nan nan nan

xarray.Dataset {
dimensions:
    x = 10 ;
    y = 10 ;
    z = 10 ;

variables:
    int64 z(z) ;
        z:breakfast = spam ;
    int64 y(y) ;
    int64 x(x) ;
        x:breakfast = eggs ;
    int64 a(x) ;
    int64 b(y) ;
    int64 c(z) ;
    int64 d(x) ;
    float64 e(z) ;

// global attributes:
```

This even happens for the DataArray e in the example above, which shares the dimension 'z' with the Dataset ds but has different coordinate values. In this case the data and coordinate values are handled as one would expect: the ds.e array is filled with NaNs (because the coordinate values do not match), and the ds.z coordinate values are not replaced by the DataArray's e.z coordinate values. However, the attributes of the Dataset's coordinate (ds.z.attrs) are still replaced by the attributes of the DataArray's coordinate (e.z.attrs).
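Until the behaviour changes, a user-side workaround could be to record the coordinate attributes before the assignment and write them back afterwards. The helper below is a hypothetical sketch (the name `set_preserving_coord_attrs` is not part of xarray), shown on a reduced Dataset with a single coordinate. It only restores attributes and does not restore the original order of the coordinates.

```python
import numpy as np
import xarray as xr


def set_preserving_coord_attrs(ds, name, value):
    """Hypothetical helper: assign ds[name] = value, then restore the
    attributes that the existing coordinates had before the assignment."""
    # Copy the attrs of every coordinate that exists before the assignment.
    saved = {key: dict(ds[key].attrs) for key in ds.coords}
    ds[name] = value
    # Write the saved attrs back; coordinates newly introduced by `value`
    # are not in `saved` and therefore keep whatever attrs they came with.
    for key, attrs in saved.items():
        if key in ds.coords:
            ds[key].attrs = attrs
    return ds


ds = xr.Dataset(coords={'x': ('x', np.arange(10, 20), {'meta': 'foo'})})
d = xr.DataArray(
    np.arange(10),
    coords=[('x', np.arange(10, 20), {'breakfast': 'eggs'})])

set_preserving_coord_attrs(ds, 'd', d)
print(ds['x'].attrs)
```

The helper modifies the Dataset in place, mirroring what a plain `ds[name] = value` assignment does.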

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.17.2-1-ARCH
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.7
pandas: 0.23.0
numpy: 1.14.3
scipy: 1.1.0
netCDF4: 1.4.0
h5netcdf: None
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: 0.17.5
distributed: 1.21.8
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.8.1
setuptools: 39.1.0
pip: 10.0.1
conda: None
pytest: 3.5.1
IPython: 6.4.0
sphinx: 1.7.4
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2245/reactions",
    "total_count": 4,
    "+1": 4,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727 · type: issue
