issue_comments
4 rows where author_association = "CONTRIBUTOR", issue = 904153867 and user = 56925856 sorted by updated_at descending

id: 850843957
html_url: https://github.com/pydata/xarray/pull/5390#issuecomment-850843957
issue_url: https://api.github.com/repos/pydata/xarray/issues/5390
node_id: MDEyOklzc3VlQ29tbWVudDg1MDg0Mzk1Nw==
user: AndrewILWilliams (56925856)
created_at: 2021-05-29T14:37:48Z
updated_at: 2021-05-31T10:27:06Z
author_association: CONTRIBUTOR
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
issue: Improvements to lazy behaviour of `xr.cov()` and `xr.corr()` (904153867)
body:

@willirath this is cool, but I think it doesn't explain why the tests fail. Currently

@dcherian, I think I've got it to work, but you need to account for the length(s) of the dimension you're calculating the correlation over. (i.e.

This latest commit does this, but I'm not sure whether the added complication is worth it yet? Thoughts welcome.

```python3
def _mean(da):
    return (da.sum(dim=dim, skipna=True, min_count=1) / (valid_count))

dim_length = da_a.notnull().sum(dim=dim, skipna=True)

def _mean_detrended_term(da):
    return (dim_length * da / (valid_count))

cov = _mean(da_a * da_b) - _mean_detrended_term(da_a.mean(dim=dim) * da_b.mean(dim=dim))
```
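To see how the quoted helpers fit together, here is a minimal, self-contained sketch that supplies the names the snippet leaves free (`dim`, `valid_count`, `da_a`, `da_b`). The `lazy_cov` name and the `ddof` handling are assumptions for illustration, not the PR's actual `_cov_corr()` code; it simply checks that the rescaled second term reproduces `xr.cov()` on small chunked data.

```python3
import numpy as np
import xarray as xr


def lazy_cov(da_a, da_b, dim, ddof=1):
    """Covariance along `dim`, written only in terms of lazy reductions."""
    # Restrict both arrays to their common valid points, then count those points.
    valid_values = da_a.notnull() & da_b.notnull()
    da_a = da_a.where(valid_values)
    da_b = da_b.where(valid_values)
    valid_count = valid_values.sum(dim) - ddof

    def _mean(da):
        return da.sum(dim=dim, skipna=True, min_count=1) / valid_count

    # The second term has to be rescaled by the number of valid samples along
    # `dim`, which is the point the comment above is making.
    dim_length = da_a.notnull().sum(dim=dim, skipna=True)

    def _mean_detrended_term(da):
        return dim_length * da / valid_count

    return _mean(da_a * da_b) - _mean_detrended_term(
        da_a.mean(dim=dim) * da_b.mean(dim=dim)
    )


# Quick check against xr.cov() on small chunked (dask-backed) inputs.
rng = np.random.default_rng(0)
x = xr.DataArray(rng.random((3, 5)), dims=("space", "time")).chunk()
y = xr.DataArray(rng.random((3, 5)), dims=("space", "time")).chunk()
xr.testing.assert_allclose(lazy_cov(x, y, dim="time"), xr.cov(x, y, dim="time"))
```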
---

id: 850690985
html_url: https://github.com/pydata/xarray/pull/5390#issuecomment-850690985
issue_url: https://api.github.com/repos/pydata/xarray/issues/5390
node_id: MDEyOklzc3VlQ29tbWVudDg1MDY5MDk4NQ==
user: AndrewILWilliams (56925856)
created_at: 2021-05-28T21:43:52Z
updated_at: 2021-05-28T21:44:12Z
author_association: CONTRIBUTOR
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
issue: Improvements to lazy behaviour of `xr.cov()` and `xr.corr()` (904153867)
body:

I think you'd still have to normalize the second term by
---

id: 850556738
html_url: https://github.com/pydata/xarray/pull/5390#issuecomment-850556738
issue_url: https://api.github.com/repos/pydata/xarray/issues/5390
node_id: MDEyOklzc3VlQ29tbWVudDg1MDU1NjczOA==
user: AndrewILWilliams (56925856)
created_at: 2021-05-28T17:12:52Z
updated_at: 2021-05-28T17:14:08Z
author_association: CONTRIBUTOR
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
issue: Improvements to lazy behaviour of `xr.cov()` and `xr.corr()` (904153867)
body:

@willirath this is great stuff, thanks again! So generally it looks like the graph is more efficient when doing operations of the form:

than doing

or like what I've implemented (see screenshot)?

```python3
intermediate = (X * Y) - (X.mean('time') * Y.mean('time'))
intermediate.mean('time')
```

If so, it seems like the most efficient(?) way to do the computation in `_cov_corr()` is to combine it all into one line? I can't think of how to do this though...
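To make that comparison concrete, here is a small sketch (the names `X`, `Y`, `form1`, `form2` are illustrative, not from the PR) that builds both formulations on chunked inputs and compares the size of their dask graphs; whether fewer tasks actually schedules better still depends on the data and the scheduler.

```python3
import numpy as np
import xarray as xr

X = xr.DataArray(np.random.rand(100, 60), dims=("space", "time")).chunk({"time": 10})
Y = xr.DataArray(np.random.rand(100, 60), dims=("space", "time")).chunk({"time": 10})

# Form 1: reduce each term separately, then combine the small reduced results.
form1 = (X * Y).mean("time") - X.mean("time") * Y.mean("time")

# Form 2: build the full-size intermediate first, then reduce it.
intermediate = (X * Y) - (X.mean("time") * Y.mean("time"))
form2 = intermediate.mean("time")

# Same values either way; compare how many tasks each graph carries.
print(len(form1.data.__dask_graph__()), len(form2.data.__dask_graph__()))
xr.testing.assert_allclose(form1, form2)
```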
---

id: 850276619
html_url: https://github.com/pydata/xarray/pull/5390#issuecomment-850276619
issue_url: https://api.github.com/repos/pydata/xarray/issues/5390
node_id: MDEyOklzc3VlQ29tbWVudDg1MDI3NjYxOQ==
user: AndrewILWilliams (56925856)
created_at: 2021-05-28T09:15:30Z
updated_at: 2021-05-28T09:17:48Z
author_association: CONTRIBUTOR
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
issue: Improvements to lazy behaviour of `xr.cov()` and `xr.corr()` (904153867)
body:

@willirath, thanks for your example notebook! I'm still trying to get my head around this a bit though. Say you have

```python3
da_a = xr.DataArray(
    np.array([[1, 2, 3, 4], [1, 0.1, 0.2, 0.3], [2, 3.2, 0.6, 1.8]]),
    dims=("space", "time"),
    coords=[
        ("space", ["IA", "IL", "IN"]),
        ("time", pd.date_range("2000-01-01", freq="1D", periods=4)),
    ],
).chunk()

da_b = xr.DataArray(
    np.array([[0.2, 0.4, 0.6, 2], [15, 10, 5, 1], [1, 3.2, np.nan, 1.8]]),
    dims=("space", "time"),
    coords=[
        ("space", ["IA", "IL", "IN"]),
        ("time", pd.date_range("2000-01-01", freq="1D", periods=4)),
    ],
).chunk()
```

The original computation in

Whereas my alteration now has a graph more like this:

Am I correct in thinking that this is a 'better' computational graph, because the original chunks are not passed on to later points in the computation?
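As one way to carry out the comparison described above, the sketch below rebuilds `da_a` and `da_b` exactly as in the comment and inspects the dask graph produced by `xr.cov()`. Fewer layers or tasks is only a rough proxy for a "better" graph; how it executes still depends on the scheduler.

```python3
import numpy as np
import pandas as pd
import xarray as xr

coords = [
    ("space", ["IA", "IL", "IN"]),
    ("time", pd.date_range("2000-01-01", freq="1D", periods=4)),
]
da_a = xr.DataArray(
    np.array([[1, 2, 3, 4], [1, 0.1, 0.2, 0.3], [2, 3.2, 0.6, 1.8]]),
    dims=("space", "time"),
    coords=coords,
).chunk()
da_b = xr.DataArray(
    np.array([[0.2, 0.4, 0.6, 2], [15, 10, 5, 1], [1, 3.2, np.nan, 1.8]]),
    dims=("space", "time"),
    coords=coords,
).chunk()

cov = xr.cov(da_a, da_b, dim="time")
graph = cov.data.__dask_graph__()
print(len(graph.layers), "layers,", len(graph), "tasks")
# cov.data.visualize("cov_graph.svg")  # optional: render the graph (needs graphviz)
```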
```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```
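For reference, the row selection described at the top of this page can be reproduced against this schema with a query along these lines (a sketch: the `github.db` filename is an assumption).

```python3
import sqlite3

# Assumes the issue_comments table lives in a local SQLite file named "github.db".
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT id, created_at, updated_at, body
    FROM issue_comments
    WHERE author_association = 'CONTRIBUTOR'
      AND issue = 904153867
      AND [user] = 56925856
    ORDER BY updated_at DESC
    """
).fetchall()
print(len(rows))  # expected: 4
```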