issues: 1912094632

id: 1912094632
node_id: I_kwDOAMm_X85x-D-o
number: 8231
title: xr.concat concatenates along dimensions that it wasn't asked to
user: 35968931
state: open
locked: 0
comments: 4
created_at: 2023-09-25T18:50:29Z
updated_at: 2024-02-14T20:30:26Z
author_association: MEMBER

What happened?

Here are two toy datasets designed to represent sections of a dataset that has variables living on a staggered grid. This type of dataset is common in fluid modelling (it's why xGCM exists).

```python
import xarray as xr

ds1 = xr.Dataset(
    coords={
        'x_center': ('x_center', [1, 2, 3]),
        'x_outer': ('x_outer', [0.5, 1.5, 2.5, 3.5]),
    },
)

ds2 = xr.Dataset(
    coords={
        'x_center': ('x_center', [4, 5, 6]),
        'x_outer': ('x_outer', [4.5, 5.5, 6.5]),
    },
)
```
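For readers unfamiliar with staggered grids, here is a minimal pure-Python sketch of how the two coordinates in the toy datasets relate. It assumes, as one reading of the values above, that each `x_center` point is the midpoint of two neighbouring `x_outer` edges and that `ds2` shares its left edge (3.5) with `ds1`. The variable names are illustrative only and are not part of xarray.

```python
# Edge positions copied from ds1 and ds2 above.
x_outer_ds1 = [0.5, 1.5, 2.5, 3.5]
x_outer_ds2 = [4.5, 5.5, 6.5]  # assumed to share the edge at 3.5 with ds1

# On the full (untiled) grid, every cell centre sits midway between two edges.
x_outer_full = x_outer_ds1 + x_outer_ds2
x_center_full = [
    (left + right) / 2 for left, right in zip(x_outer_full, x_outer_full[1:])
]

print(x_center_full)      # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(len(x_outer_full))  # 7, one more edge than there are centres
```

Under that assumption the full grid has 7 edges and 6 centres, which is why the two tiles have differently sized `x_outer` coordinates.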

Calling xr.concat on these with dim='x_center' happily concatenates them

```python
xr.concat([ds1, ds2], dim='x_center')
<xarray.Dataset>
Dimensions:   (x_outer: 7, x_center: 6)
Coordinates:
  * x_outer   (x_outer) float64 0.5 1.5 2.5 3.5 4.5 5.5 6.5
  * x_center  (x_center) int64 1 2 3 4 5 6
Data variables:
    *empty*
```

but notice that the returned result has been concatenated along both x_center and x_outer.

What did you expect to happen?

I did not expect this to work. I definitely didn't expect the datasets to be concatenated along a dimension I didn't ask them to be concatenated along (i.e. x_outer).

What I expected was that (since coords='different' is the default) xarray would attempt to concatenate both variables along the x_center dimension, which would succeed for the x_center variable but fail for the x_outer variable. Indeed, if I rename the variables so that they are no longer coordinate variables, that is exactly what happens:

```python
import xarray as xr

ds1 = xr.Dataset(
    data_vars={
        'a': ('x_center', [1, 2, 3]),
        'b': ('x_outer', [0.5, 1.5, 2.5, 3.5]),
    },
)

ds2 = xr.Dataset(
    data_vars={
        'a': ('x_center', [4, 5, 6]),
        'b': ('x_outer', [4.5, 5.5, 6.5]),
    },
)
```

```python
xr.concat([ds1, ds2], dim='x_center', data_vars='different')
ValueError: cannot reindex or align along dimension 'x_outer' because of conflicting dimension sizes: {3, 4}
```
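One way to catch the silent behaviour in the coordinate case is to compare dimension sizes before and after the call. The sketch below is a hypothetical helper, not part of xarray; it works on plain mappings of dimension name to length, which in practice could be obtained with `dict(ds.sizes)`. The sizes used here are copied from the coordinate example in this report.

```python
def unexpected_size_changes(input_sizes, result_sizes, dim):
    """Return dims other than `dim` whose result size differs from the inputs.

    Hypothetical helper, not an xarray API. Each argument holds plain
    mappings of dimension name -> length, e.g. dict(ds.sizes).
    """
    changed = {}
    for name, new_size in result_sizes.items():
        if name == dim:
            continue  # the requested dimension is allowed to grow
        # Distinct sizes this dimension had across the inputs that contain it.
        old_sizes = sorted({sizes[name] for sizes in input_sizes if name in sizes})
        if old_sizes != [new_size]:
            changed[name] = (old_sizes, new_size)
    return changed


# Sizes of ds1, ds2 and the concatenated result from the coordinate example.
inputs = [{"x_center": 3, "x_outer": 4}, {"x_center": 3, "x_outer": 3}]
result = {"x_outer": 7, "x_center": 6}

print(unexpected_size_changes(inputs, result, dim="x_center"))
# {'x_outer': ([3, 4], 7)}
```

An empty dictionary would mean that only the requested dimension changed size; here `x_outer` is flagged because it went from sizes 3 and 4 in the inputs to 7 in the result.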

Minimal Complete Verifiable Example

No response

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

I was trying to create an example for which you would need the automatic combined concat/merge that happens within xr.combine_by_coords.

Environment

xarray 2023.8.0

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8231/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727
type: issue
