issues: 1588461863

id: 1588461863
node_id: I_kwDOAMm_X85ergEn
number: 7539
title: Concat doesn't concatenate dimension coordinates along new dims
user: 35968931
state: open
locked: 0
comments: 4
created_at: 2023-02-16T22:32:33Z
updated_at: 2023-02-21T19:07:48Z
author_association: MEMBER

What is your issue?

xr.concat doesn't concatenate dimension coordinates along new dimensions, which leads to pretty unintuitive behavior.

Take this example (motivated by https://github.com/pydata/xarray/discussions/7532#discussioncomment-4988792):

```python
import numpy as np
import xarray as xr

segments = []
for i in range(2):
    # Each segment gets its own sorted, random time values
    time = np.sort(np.random.random(4))
    da = xr.DataArray(
        np.random.randn(4, 2),
        dims=["time", "cols"],
        coords=dict(time=('time', time), cols=["col1", "col2"]),
    )
    segments.append(da)
```

```python
In [86]: segments
Out[86]:
[<xarray.DataArray (time: 4, cols: 2)>
 array([[-0.61199576, -0.9012078 ],
        [-0.54187577,  1.30509994],
        [-3.53720471,  0.97607797],
        [ 0.2593455 ,  0.95920031]])
 Coordinates:
   * time     (time) float64 0.1048 0.168 0.869 0.9432
   * cols     (cols) <U4 'col1' 'col2',
 <xarray.DataArray (time: 4, cols: 2)>
 array([[ 0.90266408, -0.54294821],
        [-1.09087103, -0.17484417],
        [-0.21679558, -0.57377412],
        [ 0.07570151,  0.27433728]])
 Coordinates:
   * time     (time) float64 0.03627 0.09754 0.2434 0.592
   * cols     (cols) <U4 'col1' 'col2']
```

```python
In [85]: xr.concat(segments, dim='new')
Out[85]:
<xarray.DataArray (new: 2, time: 8, cols: 2)>
array([[[        nan,         nan],
        [        nan,         nan],
        [-0.61199576, -0.9012078 ],
        [-0.54187577,  1.30509994],
        [        nan,         nan],
        [        nan,         nan],
        [-3.53720471,  0.97607797],
        [ 0.2593455 ,  0.95920031]],

       [[ 0.90266408, -0.54294821],
        [-1.09087103, -0.17484417],
        [        nan,         nan],
        [        nan,         nan],
        [-0.21679558, -0.57377412],
        [ 0.07570151,  0.27433728],
        [        nan,         nan],
        [        nan,         nan]]])
Coordinates:
  * time     (time) float64 0.03627 0.09754 0.1048 0.168 ... 0.592 0.869 0.9432
  * cols     (cols) <U4 'col1' 'col2'
Dimensions without coordinates: new
```

I would have expected a result of size {new: 2, time: 4, cols: 2}. That would be intuitive, because the default is coords='different': the time coordinates (which have different values) would be concatenated along the new dimension, and the cols coordinate (which has the same values in every segment) would simply be propagated.
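For reference, one way to obtain a {new: 2, time: 4, cols: 2} result is to take the time values off the index before concatenating. This is only a sketch of a workaround under stated assumptions, not a proposal for what xr.concat should do: the helper `make_segment` and the coordinate name `t` are hypothetical, and the time values are copied from the output above so the example is deterministic.

```python
import numpy as np
import xarray as xr


def make_segment(times):
    # Hypothetical helper: one (time: 4, cols: 2) array shaped like the segments above.
    return xr.DataArray(
        np.zeros((4, 2)),
        dims=["time", "cols"],
        coords=dict(time=("time", times), cols=["col1", "col2"]),
    )


segments = [
    make_segment([0.1048, 0.168, 0.869, 0.9432]),
    make_segment([0.03627, 0.09754, 0.2434, 0.592]),
]

# Swap the indexed "time" dimension coordinate for a plain coordinate named "t"
# (an illustrative name), so that concat has no "time" index to align on.
unindexed = [
    da.drop_vars("time").assign_coords(t=("time", da["time"].values))
    for da in segments
]

# "t" differs between segments, so coords="different" concatenates it along "new".
result = xr.concat(unindexed, dim="new", coords="different")
print(dict(result.sizes))
print(dict(result["t"].sizes))
```

The price of this workaround is that the result has no index along time, so label-based selection on time is no longer available.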

Instead, what happened is that xr.concat treats the dimension coordinates as indexes to align, and defaults to an outer join. This auto-alignment behaviour has been discussed at length before; I'm just trying to point out another place in which it's problematic.
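The alignment step can be reproduced on its own with xr.align. The sketch below uses constant data and the time values from the output above (both are choices made for illustration); join is passed explicitly as "outer" to match the default described here.

```python
import numpy as np
import xarray as xr

times = [
    [0.1048, 0.168, 0.869, 0.9432],
    [0.03627, 0.09754, 0.2434, 0.592],
]
segments = [
    xr.DataArray(
        np.ones((4, 2)),
        dims=["time", "cols"],
        coords=dict(time=("time", t), cols=["col1", "col2"]),
    )
    for t in times
]

# An outer join reindexes every segment onto the union of all "time" labels.
aligned = xr.align(*segments, join="outer")
print([dict(a.sizes) for a in aligned])

# concat with join="outer" pads with NaN in the same way before stacking.
combined = xr.concat(segments, dim="new", join="outer")
print(dict(combined.sizes))
n_missing = int(combined.isnull().sum())
print(n_missing)

# join="exact" refuses to align mismatched indexes instead of padding them.
raised = False
try:
    xr.concat(segments, dim="new", join="exact")
except ValueError:
    raised = True
print(raised)
```

Passing join="exact" turns the silent NaN padding into an error, which may be preferable when the segments are expected to share an index.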

This is briefly mentioned in the concat docstring under coords='all' ("“all”: All coordinate variables will be concatenated, except those corresponding to other dimensions."), but it's not mentioned at all under coords='different'.
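A quick check of both options, again only as a sketch with constant data and the time values from the output above: because time is a dimension coordinate, it is aligned rather than concatenated, so the choice of coords should not change the shape of the result.

```python
import numpy as np
import xarray as xr

times = [
    [0.1048, 0.168, 0.869, 0.9432],
    [0.03627, 0.09754, 0.2434, 0.592],
]
segments = [
    xr.DataArray(
        np.ones((4, 2)),
        dims=["time", "cols"],
        coords=dict(time=("time", t), cols=["col1", "col2"]),
    )
    for t in times
]

# Compare the two coords options discussed above; join="outer" is passed
# explicitly to match the default behaviour described in this issue.
shapes = {}
for option in ("different", "all"):
    out = xr.concat(segments, dim="new", coords=option, join="outer")
    shapes[option] = dict(out.sizes)
    # "time" stays a 1-D dimension coordinate under both options.
    assert out["time"].dims == ("time",)

print(shapes)
```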

I don't really know what I would prefer to happen with the coordinates. I guess I would want a time coordinate of size {new: 2, time: 4, cols: 2} to be created, but then I don't know what that implies for the underlying index. @benbovy do you have any thoughts?

At the very least we should make this a lot clearer in the docs.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7539/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727
type: issue
