
issue_comments: 850843957


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/5390#issuecomment-850843957 https://api.github.com/repos/pydata/xarray/issues/5390 850843957 MDEyOklzc3VlQ29tbWVudDg1MDg0Mzk1Nw== 56925856 2021-05-29T14:37:48Z 2021-05-31T10:27:06Z CONTRIBUTOR

@willirath this is cool, but I don't think it explains why the tests fail. Currently the da_a.mean() and da_b.mean() calls do know about each other's missing data! That's what we're doing in these lines.

@dcherian, I think I've got it to work, but you need to account for the length(s) of the dimension you're calculating the correlation over. That is, (da - da.mean('time')).sum('time') is not the same as da.sum('time') - da.mean('time'); you should actually do da.sum('time') - da.mean('time') * length_of_time_dim.
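As a quick sanity check of that identity, here is a minimal sketch in plain Python rather than xarray. The list `values` is a hypothetical stand-in for `da` along the `'time'` dimension, and all variable names are illustrative assumptions, not part of the PR.

```python
import math

# Hypothetical sample values standing in for `da` along the 'time' dimension.
values = [1.0, 2.0, 4.0, 9.0]
n = len(values)              # plays the role of length_of_time_dim
mean = sum(values) / n

# Sum of the mean-centered values: (da - da.mean('time')).sum('time')
centered_sum = sum(v - mean for v in values)

# Subtracting the mean only once: da.sum('time') - da.mean('time')
naive = sum(values) - mean

# Subtracting the mean once per element:
# da.sum('time') - da.mean('time') * length_of_time_dim
corrected = sum(values) - mean * n

print(math.isclose(centered_sum, corrected, abs_tol=1e-12))  # the two forms agree
print(math.isclose(centered_sum, naive, abs_tol=1e-12))      # the naive form does not
```

The mean is subtracted once per element inside the sum, so it has to be scaled by the number of elements when it is pulled outside.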

This latest commit does this, but I'm not sure whether the added complication is worth it yet? Thoughts welcome.

```python3
def _mean(da):
    return (da.sum(dim=dim, skipna=True, min_count=1) / (valid_count))

dim_length = da_a.notnull().sum(dim=dim, skipna=True)
def _mean_detrended_term(da):
    return (dim_length * da / (valid_count))

cov = _mean(da_a * da_b) - _mean_detrended_term(da_a.mean(dim=dim) * da_b.mean(dim=dim))
```
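To illustrate why the length term matters, the following is a rough, dependency-free sketch and not the PR's implementation. It assumes pairwise NaN masking and that `valid_count` equals the number of valid pairs minus `ddof`. The helper names `pairwise_valid`, `cov_two_pass` and `cov_one_pass` are hypothetical, as is the sample data.

```python
import math

def pairwise_valid(a, b):
    """Keep only positions where both series have a non-NaN value."""
    return [(x, y) for x, y in zip(a, b)
            if not (math.isnan(x) or math.isnan(y))]

def cov_two_pass(a, b, ddof=1):
    """Reference form: center both series first, then sum the products."""
    pairs = pairwise_valid(a, b)
    n = len(pairs)
    mean_a = sum(x for x, _ in pairs) / n
    mean_b = sum(y for _, y in pairs) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in pairs) / (n - ddof)

def cov_one_pass(a, b, ddof=1):
    """Expanded form: sum(a*b)/valid_count - n * mean_a * mean_b / valid_count."""
    pairs = pairwise_valid(a, b)
    n = len(pairs)                 # assumed analogue of dim_length
    valid_count = n - ddof         # assumed analogue of valid_count
    mean_a = sum(x for x, _ in pairs) / n
    mean_b = sum(y for _, y in pairs) / n
    mean_of_product = sum(x * y for x, y in pairs) / valid_count
    return mean_of_product - n * mean_a * mean_b / valid_count

# Hypothetical data with missing values in different positions.
a = [1.0, 2.0, math.nan, 4.0, 5.0]
b = [2.0, math.nan, 3.0, 8.0, 11.0]

print(math.isclose(cov_two_pass(a, b), cov_one_pass(a, b)))
```

With `ddof=0` the factor `n / valid_count` is 1 and the expanded form reduces to the familiar mean(a*b) - mean(a)*mean(b); with `ddof=1` the extra length factor is what keeps the two forms equal.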

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  904153867