issues: 777670351

id: 777670351
node_id: MDU6SXNzdWU3Nzc2NzAzNTE=
number: 4756
title: feat: reindex multiple DataArrays
user: 4711805
state: open
locked: 0
comments: 1
created_at: 2021-01-03T16:23:01Z
updated_at: 2021-01-03T19:05:03Z
author_association: CONTRIBUTOR
(empty: assignee, milestone, closed_at, active_lock_reason, draft, pull_request, performed_via_github_app, state_reason)

When e.g. creating a Dataset from multiple DataArrays that are supposed to share the same grid, but are not exactly aligned (as is often the case with floating point coordinates), we usually end up with undesirable NaNs inserted in the data set.

For instance, consider the following data arrays that are not exactly aligned:

```python
import xarray as xr

da1 = xr.DataArray([[0, 1, 2], [3, 4, 5], [6, 7, 8]], coords=[[0, 1, 2], [0, 1, 2]], dims=['x', 'y']).rename('da1')
da2 = xr.DataArray([[0, 1, 2], [3, 4, 5], [6, 7, 8]], coords=[[1.1, 2.1, 3.1], [1.1, 2.1, 3.1]], dims=['x', 'y']).rename('da2')
da1.plot.imshow()
da2.plot.imshow()
```

![image](https://user-images.githubusercontent.com/4711805/103482830-542bbe80-4de3-11eb-814b-bb1f705967c4.png)
![image](https://user-images.githubusercontent.com/4711805/103482836-61e14400-4de3-11eb-804b-f549c2551562.png)

They show gaps when combined in a data set:

```python
ds = xr.Dataset({'da1': da1, 'da2': da2})
ds['da1'].plot.imshow()
ds['da2'].plot.imshow()
```

![image](https://user-images.githubusercontent.com/4711805/103482959-3f9bf600-4de4-11eb-9513-900319cb485a.png)
![image](https://user-images.githubusercontent.com/4711805/103482966-47f43100-4de4-11eb-853b-2b44f7bc8d7f.png)

I think this is a frequent enough situation that we would like a function to re-align all the data arrays together. There is a `reindex_like` method, which accepts a tolerance, but calling it successively on every data array, like so:

```python
da1r = da1.reindex_like(da2, method='nearest', tolerance=0.2)
da2r = da2.reindex_like(da1r, method='nearest', tolerance=0.2)
```

would result in the intersection of the coordinates, rather than their union. What I would like is a function like the following:

```python
import numpy as np
from functools import reduce

def reindex_all(arrays, dims, tolerance):
    coords = {}
    for dim in dims:
        # Sorted union of this dimension's coordinates across all arrays.
        coord = reduce(np.union1d, [array[dim] for array in arrays[1:]], arrays[0][dim])
        # Drop every value whose right-hand neighbour lies within the tolerance.
        diff = coord[:-1] - coord[1:]
        keep = np.abs(diff) > tolerance
        coords[dim] = np.append(coord[:-1][keep], coord[-1])
    # Snap every array onto the merged grid.
    reindexed = [array.reindex(coords, method='nearest', tolerance=tolerance) for array in arrays]
    return reindexed

da1r, da2r = reindex_all([da1, da2], ['x', 'y'], 0.2)
dsr = xr.Dataset({'da1': da1r, 'da2': da2r})
dsr['da1'].plot.imshow()
dsr['da2'].plot.imshow()
```

I have not found anything equivalent. If you think this is worth it, I could try to send a PR implementing such a feature.
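
To make the per-dimension step of `reindex_all` easier to inspect without xarray or a plotting backend, here is a minimal NumPy-only sketch of the coordinate merge. The name `merge_coords` is hypothetical and used only for illustration; it is not an xarray function. It assumes 1-D, non-empty coordinate arrays.

```python
import numpy as np


def merge_coords(coord_arrays, tolerance):
    """Union of 1-D coordinate arrays, collapsing neighbours within `tolerance`.

    Hypothetical helper: it isolates the per-dimension step of `reindex_all`.
    """
    # Sorted, de-duplicated union of all coordinate values.
    coord = np.unique(np.concatenate([np.asarray(c, dtype=float) for c in coord_arrays]))
    # Distance from each value to its right-hand neighbour.
    gaps = np.abs(np.diff(coord))
    # Keep a value only if the next one is further away than the tolerance.
    # The last value is always kept, so a run of close values collapses
    # to its largest member.
    keep = gaps > tolerance
    return np.append(coord[:-1][keep], coord[-1])


x1 = np.array([0, 1, 2])
x2 = np.array([1.1, 2.1, 3.1])
merged = merge_coords([x1, x2], tolerance=0.2)
print(merged)  # values: 0, 1.1, 2.1, 3.1
```

With the example grids this gives four values per dimension instead of the six produced by a plain union. One caveat of this approach: a long chain of values that are each within the tolerance of their neighbour collapses to a single point, which may end up further than the tolerance from the first value in the chain.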

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4756/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727
type: issue
