issues: 2116618415


  • id: 2116618415
  • node_id: PR_kwDOAMm_X85l7Cdb
  • number: 8698
  • title: New alignment option: `join='strict'`
  • user: 45271239
  • state: closed
  • locked: 0
  • assignee: (empty)
  • milestone: (empty)
  • comments: 5
  • created_at: 2024-02-03T17:58:43Z
  • updated_at: 2024-02-25T09:09:37Z
  • closed_at: 2024-02-25T09:09:37Z
  • author_association: CONTRIBUTOR
  • active_lock_reason: (empty)
  • draft: 0
  • pull_request: pydata/xarray/pulls/8698
  • body, reactions, performed_via_github_app, state_reason, repo, type: shown below, in that order

Title: New alignment option: join='strict'

  • [ ] Closes #8231
  • [x] Closes #6806
  • [x] Tests added
  • [ ] User visible changes (including notable bug fixes) are documented in whats-new.rst
    • [x] What's new entry
    • [x] Refer to PR ID (cannot be done before the PR has been created)
  • [x] New functions/methods are listed in api.rst
    • No new functions/methods.

Motive

This PR is motivated by the following issues:

  • xr.concat concatenates along dimensions that it wasn't asked to #8231
  • New alignment option: "exact" without broadcasting OR Turn off automatic broadcasting #6806

This PR does not resolve the unexpected behavior described in #8231 without a change in user code. As the added tests show, to get the expected behavior the user has to pass the new join='strict' mode suggested in #6806 to the concatenation operation. Only then is the uniqueness of the indexed dimensions' names checked, reusing the logic that Aligner.find_matching_indexes already applied for join='override'.

This may not be enough to fix #8231. If it is not, I can split the PR in two: a first one adding join='strict' for #6806, and a later one for #8231.

Technical Details

Here I try to detail my thought process. Please correct me if anything is wrong. This is my first time digging into this core logic!

Here is my understanding of the terms:

  • An indexed dimension is attached to a coordinate variable
  • An unindexed dimension is not attached to a coordinate variable ("Dimensions without coordinates")
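The distinction can be checked directly on a dataset. A minimal sketch, assuming xarray is installed; the two datasets are made up for illustration and are not taken from the tests:

```python
import xarray as xr

# "x_center" has a coordinate variable, so it is an indexed dimension.
indexed = xr.Dataset(coords={"x_center": ("x_center", [1, 2, 3])})

# "x_center" is only used by a data variable: a dimension without coordinates.
unindexed = xr.Dataset(data_vars={"a": ("x_center", [1, 2, 3])})

print(list(indexed.indexes))    # names of the indexed dimensions
print(list(unindexed.indexes))  # empty: no index was created
print(dict(unindexed.sizes))    # the dimension still exists and has a size
```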

Input data for Scenario 1, tested in test_concat_join_coordinate_variables_non_asked_dims

```python
from xarray import Dataset

# Both dimensions carry a coordinate variable, so both are indexed.
ds1 = Dataset(
    coords={
        "x_center": ("x_center", [1, 2, 3]),
        "x_outer": ("x_outer", [0.5, 1.5, 2.5, 3.5]),
    },
)

ds2 = Dataset(
    coords={
        "x_center": ("x_center", [4, 5, 6]),
        "x_outer": ("x_outer", [4.5, 5.5, 6.5]),
    },
)
```

Input data for Scenario 2, tested in test_concat_join_non_coordinate_variables

```python
from xarray import Dataset

# No coordinate variables here, so both dimensions are unindexed.
ds1 = Dataset(
    data_vars={
        "a": ("x_center", [1, 2, 3]),
        "b": ("x_outer", [0.5, 1.5, 2.5, 3.5]),
    },
)

ds2 = Dataset(
    data_vars={
        "a": ("x_center", [4, 5, 6]),
        "b": ("x_outer", [4.5, 5.5, 6.5]),
    },
)
```

The logic for unindexed dimensions was working as expected: it relies on Aligner.assert_unindexed_dim_sizes_equal which, as its name suggests, checks that unindexed dimension sizes are equal. (Scenario 2)
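As a rough illustration of what such a size check does, here is a plain-Python sketch. It is not xarray's implementation: `check_unindexed_sizes` and the dict-based inputs are hypothetical stand-ins for the unindexed dimension sizes of each object.

```python
def check_unindexed_sizes(objects_sizes):
    """Raise ValueError if any unindexed dimension has conflicting sizes.

    `objects_sizes` is a list of {dimension name: size} dicts, one per object.
    (Hypothetical helper, for illustration only.)
    """
    seen = {}
    for sizes in objects_sizes:
        for dim, size in sizes.items():
            seen.setdefault(dim, set()).add(size)
    conflicts = {dim: sorted(s) for dim, s in seen.items() if len(s) > 1}
    if conflicts:
        raise ValueError(f"conflicting sizes for unindexed dimensions: {conflicts}")


# Sizes taken from the non-coordinate input data: ds1 and ds2 disagree on x_outer.
ds1_sizes = {"x_center": 3, "x_outer": 4}
ds2_sizes = {"x_center": 3, "x_outer": 3}

try:
    check_unindexed_sizes([ds1_sizes, ds2_sizes])
    outcome = "ok"
except ValueError as err:
    outcome = str(err)

print(outcome)
```

The sketch ignores the special treatment a concatenation dimension would need; it only shows the equality test on sizes.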

However, the logic for indexed dimensions was surprising: the equivalent check on dimension sizes was not performed. Such a check exists in Aligner.find_matching_indexes, but it was only applied for join='override'. This pull request proposes applying it for join='strict' as well.
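The proposed change can be pictured with a small plain-Python model. This is a sketch under assumptions, not the code in Aligner.find_matching_indexes: `check_indexed_sizes`, `CHECKED_JOINS` and the dict-based inputs are hypothetical, and the model only captures the idea that the size check runs for join='override' and, with this PR, for join='strict'.

```python
# Join modes for which the size check on indexed dimensions is applied
# in this model ("strict" being the mode proposed in the PR).
CHECKED_JOINS = {"override", "strict"}


def check_indexed_sizes(index_sizes, join):
    """Model of a size check on indexed dimensions, gated by the join mode.

    `index_sizes` is a list of {dimension name: index length} dicts.
    (Hypothetical helper, for illustration only.)
    """
    if join not in CHECKED_JOINS:
        return  # other join modes are not checked in this model
    reference = index_sizes[0]
    for sizes in index_sizes[1:]:
        for dim, size in sizes.items():
            if dim in reference and reference[dim] != size:
                raise ValueError(
                    f"join={join!r}: indexes along dimension {dim!r} "
                    f"have different sizes: {reference[dim]} vs {size}"
                )


# Index lengths from the coordinate-variable input data.
sizes = [{"x_center": 3, "x_outer": 4}, {"x_center": 3, "x_outer": 3}]

check_indexed_sizes(sizes, join="outer")  # passes silently in this model

try:
    check_indexed_sizes(sizes, join="strict")
    strict_outcome = "ok"
except ValueError as err:
    strict_outcome = str(err)

print(strict_outcome)
```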

reactions:

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8698/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

  • repo: 13221727
  • type: pull
