issue_comments: 633213547

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/4089#issuecomment-633213547 https://api.github.com/repos/pydata/xarray/issues/4089 633213547 MDEyOklzc3VlQ29tbWVudDYzMzIxMzU0Nw== 56925856 2020-05-24T10:59:43Z 2020-05-24T11:00:53Z CONTRIBUTOR

The current problem is that we can't use pandas to fully test `xr.cov()` or `xr.corr()`: once you convert the DataArrays to a Series or a DataFrame for testing, you can't easily reduce them along a named dimension with a `dim` parameter. See @r-beer's comment here: https://github.com/pydata/xarray/pull/3550#issuecomment-557895005.

As such, I think it probably makes sense to test just a few low-dimensional cases? E.g.

```python3
import numpy as np
import pandas as pd
import xarray as xr

da_a = xr.DataArray(
    np.random.random((3, 21, 4)),
    coords={"time": pd.date_range("2000-01-01", freq="1D", periods=21)},
    dims=("a", "time", "x"),
)

da_b = xr.DataArray(
    np.random.random((3, 21, 4)),
    coords={"time": pd.date_range("2000-01-01", freq="1D", periods=21)},
    dims=("a", "time", "x"),
)

# Covariance along "time" for every (a, x) pair.
# Outputs below are from one unseeded run, so the values will differ each time.
xr.cov(da_a, da_b, 'time')
# <xarray.DataArray (a: 3, x: 4)>
# array([[-0.01824046,  0.00373796, -0.00601642, -0.00108818],
#        [ 0.00686132, -0.02680119, -0.00639433, -0.00868691],
#        [-0.00889806,  0.02622817, -0.01022208, -0.00101257]])
# Dimensions without coordinates: a, x

# A single point from xr.cov() ...
xr.cov(da_a, da_b, 'time').sel(a=0, x=0)
# <xarray.DataArray ()>
# array(-0.01824046)

# ... agrees with pandas on the same 1-D slice
da_a.sel(a=0, x=0).to_series().cov(da_b.sel(a=0, x=0).to_series())
# -0.018240458880158048
```

So, while it's easy to check that a few individual points from `xr.cov()` agree with the pandas implementation, checking all of the points in this example would require a loop over every `(a, x)` pair. Do people have thoughts about this?
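
For reference, here is a rough sketch of what that loop could look like. It assumes `xr.cov()` keeps the `(da_a, da_b, dim)` signature used above and that its default `ddof=1` matches `pandas.Series.cov`. The seeded generator and the `expected` array are just my choices for illustration, not anything that exists in the test suite.

```python
import itertools

import numpy as np
import pandas as pd
import xarray as xr

# Seeded so the comparison is reproducible (an illustrative choice)
rng = np.random.default_rng(0)
time = pd.date_range("2000-01-01", freq="1D", periods=21)
da_a = xr.DataArray(rng.random((3, 21, 4)), coords={"time": time}, dims=("a", "time", "x"))
da_b = xr.DataArray(rng.random((3, 21, 4)), coords={"time": time}, dims=("a", "time", "x"))

actual = xr.cov(da_a, da_b, dim="time")

# Build the pandas reference one (a, x) point at a time
expected = xr.zeros_like(actual)
for a, x in itertools.product(range(da_a.sizes["a"]), range(da_a.sizes["x"])):
    s_a = da_a.isel(a=a, x=x).to_series()
    s_b = da_b.isel(a=a, x=x).to_series()
    expected[{"a": a, "x": x}] = s_a.cov(s_b)

xr.testing.assert_allclose(actual, expected)
```

It's only 12 pandas calls for this shape, so the cost is small, but the loop would need rewriting for every combination of dimensions we want to cover.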

I think it would also make sense to have some test cases where we don't use pandas at all, and instead specify the expected output by hand?

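```python3
import numpy as np
import xarray as xr
# Assuming assert_allclose here is xarray's, which expects xarray objects on both sides
from xarray.testing import assert_allclose

da_a = xr.DataArray([[1, 2], [1, np.nan]], dims=["x", "time"])
# Row 0 correlates perfectly with itself; row 1 has a single valid point, so NaN
expected = xr.DataArray([1, np.nan], dims="x")
actual = xr.corr(da_a, da_a, dim='time')
assert_allclose(actual, expected)
```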

Does this seem like a good way forward?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  623751213