issue_comments: 633286352

html_url: https://github.com/pydata/xarray/pull/4089#issuecomment-633286352
issue_url: https://api.github.com/repos/pydata/xarray/issues/4089
id: 633286352
node_id: MDEyOklzc3VlQ29tbWVudDYzMzI4NjM1Mg==
user: 56925856
created_at: 2020-05-24T19:49:30Z
updated_at: 2020-05-24T19:56:24Z
author_association: CONTRIBUTOR
body:

One problem I came across here is that pandas automatically ignores np.nan values in any corr or cov calculation. This is hard-coded into the package and sadly there is no skipna=False option, so in the tests I use the numpy implementation that pandas is built on (see, for example, here).
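To make the NaN behaviour concrete, here is a minimal sketch with made-up arrays (`a` and `b` are illustrative, not taken from the PR). It shows pandas silently dropping incomplete pairs, plain `np.cov` propagating NaN, and one way to reproduce the pandas result in NumPy by keeping only the pairs where both values are valid. The implementation linked above may do this differently; the boolean mask is just an assumption about how to get the same pairwise-complete behaviour.

```python
import numpy as np
import pandas as pd

# Illustrative data: each array has one NaN, at different positions.
a = np.array([1.0, 2.0, np.nan, 4.0, 5.0])
b = np.array([2.0, 1.0, 3.0, np.nan, 6.0])

# pandas drops every pair in which either value is NaN before computing.
pandas_cov = pd.Series(a).cov(pd.Series(b))

# Plain NumPy does not skip NaN: the result is NaN.
plain_cov = np.cov(a, b)[0, 1]

# Keep only the positions where both arrays are valid, then use np.cov.
valid = ~(np.isnan(a) | np.isnan(b))
valid_cov = np.cov(a[valid], b[valid])[0, 1]

print(pandas_cov, plain_cov, valid_cov)
```

Both `Series.cov` and `np.cov` default to the N-1 normalisation, so `pandas_cov` and `valid_cov` should agree once the same pairs are used.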

Current tests implemented are (in pseudocode...):
- [x] assert_allclose(xr.cov(a, b) / (a.std() * b.std()), xr.corr(a, b))
- [x] assert_allclose(xr.cov(a,a)*(N-1), ((a - a.mean())**2).sum())
- [x] For the example in my previous comment, I now have a loop over all values of (a,x) to reconstruct the covariance / correlation matrix, and check it with an assert_allclose(...).
- [x] Add more test arrays, with/without np.nans -- done
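The first two identities can be checked in isolation with a NumPy-only sketch, shown here on random NaN-free data rather than the actual test arrays. One detail the pseudocode glosses over: the first identity only holds when the standard deviations use the same `ddof` as the covariance, so the sketch assumes `ddof=1` throughout.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=50)
b = 0.5 * a + rng.normal(size=50)
n = a.size
ddof = 1  # assumption: sample (N-1) normalisation everywhere

cov_ab = np.cov(a, b, ddof=ddof)[0, 1]
corr_ab = np.corrcoef(a, b)[0, 1]

# cov / (std_a * std_b) equals corr only if std uses the same ddof as cov.
np.testing.assert_allclose(
    cov_ab / (a.std(ddof=ddof) * b.std(ddof=ddof)), corr_ab
)

# cov(a, a) * (N - 1) is the sum of squared deviations from the mean.
cov_aa = np.cov(a, a, ddof=ddof)[0, 1]
np.testing.assert_allclose(cov_aa * (n - 1), ((a - a.mean()) ** 2).sum())
```

With NaNs present, both sides would also need to be restricted to the same set of valid pairs before comparing.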

@keewis I tried reading the Hypothesis docs and got a bit overwhelmed, so I've stuck with example-based tests for now.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app:
issue: 623751213