issue_comments: 1249910951

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/7045#issuecomment-1249910951 https://api.github.com/repos/pydata/xarray/issues/7045 1249910951 IC_kwDOAMm_X85KgCCn 1217238 2022-09-16T22:26:36Z 2022-09-16T22:26:36Z MEMBER

As a concrete example, suppose we have two datasets:

1. Hourly predictions for 10 days
2. Daily observations for a month

```python
import numpy as np
import pandas as pd
import xarray

# Hourly timestamps from 2022-01-01 up to, but not including, 2022-01-11
predictions = xarray.DataArray(
    np.random.RandomState(0).randn(24*10),
    {'time': pd.date_range('2022-01-01', '2022-01-11', freq='1h', inclusive='left')},
)
# Daily timestamps covering all 31 days of January 2022
observations = xarray.DataArray(
    np.random.RandomState(1).randn(31),
    {'time': pd.date_range('2022-01-01', '2022-01-31', freq='24h')},
)
```

Today, if you compare these datasets, they automatically align:
```
>>> predictions - observations
<xarray.DataArray (time: 10)>
array([ 0.13970698,  2.88151104, -1.0857261 ,  2.21236931, -0.85490761,
        2.67796423,  0.63833301,  1.94923669, -0.35832191,  0.23234996])
Coordinates:
  * time     (time) datetime64[ns] 2022-01-01 2022-01-02 ... 2022-01-10
```

With this proposed change, you would get an error, e.g., something like:
```
>>> predictions - observations
ValueError: xarray objects are not aligned along dimension 'time':
array(['2022-01-01T00:00:00.000000000', '2022-01-02T00:00:00.000000000',
       '2022-01-03T00:00:00.000000000', '2022-01-04T00:00:00.000000000',
       '2022-01-05T00:00:00.000000000', '2022-01-06T00:00:00.000000000',
       '2022-01-07T00:00:00.000000000', '2022-01-08T00:00:00.000000000',
       '2022-01-09T00:00:00.000000000', '2022-01-10T00:00:00.000000000',
       '2022-01-11T00:00:00.000000000', '2022-01-12T00:00:00.000000000',
       '2022-01-13T00:00:00.000000000', '2022-01-14T00:00:00.000000000',
       '2022-01-15T00:00:00.000000000', '2022-01-16T00:00:00.000000000',
       '2022-01-17T00:00:00.000000000', '2022-01-18T00:00:00.000000000',
       '2022-01-19T00:00:00.000000000', '2022-01-20T00:00:00.000000000',
       '2022-01-21T00:00:00.000000000', '2022-01-22T00:00:00.000000000',
       '2022-01-23T00:00:00.000000000', '2022-01-24T00:00:00.000000000',
       '2022-01-25T00:00:00.000000000', '2022-01-26T00:00:00.000000000',
       '2022-01-27T00:00:00.000000000', '2022-01-28T00:00:00.000000000',
       '2022-01-29T00:00:00.000000000', '2022-01-30T00:00:00.000000000',
       '2022-01-31T00:00:00.000000000'], dtype='datetime64[ns]')
vs
array(['2022-01-01T00:00:00.000000000', '2022-01-01T01:00:00.000000000',
       '2022-01-01T02:00:00.000000000', ...,
       '2022-01-10T21:00:00.000000000', '2022-01-10T22:00:00.000000000',
       '2022-01-10T23:00:00.000000000'], dtype='datetime64[ns]')
```

Instead, you would need to manually align these objects, e.g., with `xarray.align`, `reindex_like()` or `interp_like()`:
```
>>> predictions, observations = xarray.align(predictions, observations)
or
>>> observations = observations.reindex_like(predictions)
or
>>> predictions = predictions.interp_like(observations)
```
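To make the difference between these options more tangible, here is a rough sketch of what the first two produce for the example data. The variable names (`p_inner`, `o_inner`, `o_reindexed`) are just for illustration, the arrays are rebuilt with `periods=` so the snippet stands alone, and `interp_like()` is left out to keep the dependencies minimal.

```python
import numpy as np
import pandas as pd
import xarray

# Rebuild the example data: 240 hourly predictions and 31 daily observations
predictions = xarray.DataArray(
    np.random.RandomState(0).randn(24 * 10),
    coords={'time': pd.date_range('2022-01-01', periods=24 * 10, freq='h')},
    dims='time',
)
observations = xarray.DataArray(
    np.random.RandomState(1).randn(31),
    coords={'time': pd.date_range('2022-01-01', periods=31, freq='D')},
    dims='time',
)

# xarray.align defaults to an inner join: only timestamps present in both
# arrays survive, i.e. midnight on each of the first 10 days.
p_inner, o_inner = xarray.align(predictions, observations)

# reindex_like puts the observations on the hourly grid of the predictions,
# filling timestamps without a daily observation with NaN.
o_reindexed = observations.reindex_like(predictions)

print(p_inner.sizes['time'], o_inner.sizes['time'])
print(o_reindexed.sizes['time'], int(o_reindexed.notnull().sum()))
```

The point is that each choice answers a different question: `align` drops everything that does not match, while `reindex_like` keeps the hourly grid and makes the missing observations explicit.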

To (partially) simulate the effect of this change on a codebase today, you could write `xarray.set_options(arithmetic_join='exact')`, but presumably it would also make sense to change Xarray's other alignment code (e.g., in `concat` and `merge`).
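As a minimal sketch of that simulation, assuming the same example data as above (rebuilt here so the snippet runs on its own), `set_options` can be used as a context manager so the stricter join only applies inside the block. The names `raised` and `difference` are only illustrative, and the error message you see today differs from the proposed one.

```python
import numpy as np
import pandas as pd
import xarray

predictions = xarray.DataArray(
    np.random.RandomState(0).randn(24 * 10),
    coords={'time': pd.date_range('2022-01-01', periods=24 * 10, freq='h')},
    dims='time',
)
observations = xarray.DataArray(
    np.random.RandomState(1).randn(31),
    coords={'time': pd.date_range('2022-01-01', periods=31, freq='D')},
    dims='time',
)

with xarray.set_options(arithmetic_join='exact'):
    # With join='exact', arithmetic on mismatched indexes raises instead of
    # silently falling back to an inner join.
    try:
        predictions - observations
        raised = False
    except ValueError:
        raised = True

    # Aligning explicitly first makes the indexes identical, so the same
    # arithmetic then succeeds even under the strict setting.
    aligned_predictions, aligned_observations = xarray.align(predictions, observations)
    difference = aligned_predictions - aligned_observations

print(raised)
print(difference.sizes['time'])
```

This only covers arithmetic; `concat`, `merge` and other code paths keep their own join defaults, which is why it is a partial simulation.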

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1376109308