github: issue_comments: 3 rows where issue = 904153867 and user = 5700886 sorted by updated

3 rows where issue = 904153867 and user = 5700886 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at	author_association	body	reactions	issue
850820173	https://github.com/pydata/xarray/pull/5390#issuecomment-850820173	https://api.github.com/repos/pydata/xarray/issues/5390	MDEyOklzc3VlQ29tbWVudDg1MDgyMDE3Mw==	willirath 5700886	2021-05-29T11:51:50Z	2021-05-29T11:51:59Z	CONTRIBUTOR	I think the problem with `cov = _mean(da_a * da_b) - da_a.mean(dim=dim) * da_b.mean(dim=dim)` is that the `da_a.mean()` and the `da_b.mean()` calls don't know about each other's missing data.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Improvements to lazy behaviour of `xr.cov()` and `xr.corr()` 904153867
850819741	https://github.com/pydata/xarray/pull/5390#issuecomment-850819741	https://api.github.com/repos/pydata/xarray/issues/5390	MDEyOklzc3VlQ29tbWVudDg1MDgxOTc0MQ==	willirath 5700886	2021-05-29T11:48:02Z	2021-05-29T11:48:02Z	CONTRIBUTOR	Shouldn't the following do? `cov = ( (da_a * da_b).mean(dim) - ( da_a.where(da_b.notnull()).mean(dim) * da_b.where(da_a.notnull()).mean(dim) ) )` (See here: https://nbviewer.jupyter.org/gist/willirath/cfaa8fb1b53fcb8dcb05ddde839c794c )	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	Improvements to lazy behaviour of `xr.cov()` and `xr.corr()` 904153867
850542572	https://github.com/pydata/xarray/pull/5390#issuecomment-850542572	https://api.github.com/repos/pydata/xarray/issues/5390	MDEyOklzc3VlQ29tbWVudDg1MDU0MjU3Mg==	willirath 5700886	2021-05-28T16:45:55Z	2021-05-28T16:45:55Z	CONTRIBUTOR	@AndrewWilliams3142 @dcherian Looks like I broke the first Gist. :( Your example above does not quite get there, because the `xr.DataArray(np...).chunk()` just leads to one chunk per data array. Here's a Gist that explains the idea for the correlations: https://nbviewer.jupyter.org/gist/willirath/c5c5274f31c98e8452548e8571158803 With ```python X = xr.DataArray( darr.random.normal(size=array_size, chunks=chunk_size), dims=("t", "y", "x"), name="X", ) Y = xr.DataArray( darr.random.normal(size=array_size, chunks=chunk_size), dims=("t", "y", "x"), name="Y", ) ``` the "bad" / explicit way of calculating the correlation ```python corr_exp = ((X - X.mean("t")) * (Y - Y.mean("t"))).mean("t") ``` leads to a graph like this (graph image not included): Dask won't release any of the tasks defining `X` and `Y` until the marked `sub`traction tasks are done. The "good" / aggregating way of calculating the correlation `corr_agg = (X * Y).mean("t") - X.mean("t") * Y.mean("t")` has the following graph (graph image not included), where the marked `mul`tiplication and `mean_chunk` tasks act only on pairs of chunks and on individual chunks and then release the original chunks of `X` and `Y`. This graph can be evaluated with a much smaller memory footprint than the other one. (It's not certain that this always leads to lower memory use, however. But this is a different issue ...)	{ "total_count": 1, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 1 }	Improvements to lazy behaviour of `xr.cov()` and `xr.corr()` 904153867
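
The missing-data point raised in the comments above can be checked on a tiny example. The following is only a sketch in plain Python, not the xarray implementation: the helpers `nanmean` and `mask_where_other_missing` are hypothetical stand-ins for a NaN-skipping `.mean()` and for `a.where(b.notnull())`, and the input values are made up for illustration. It compares the aggregating formula with and without mutual masking against the explicit deviation-from-mean covariance computed on pairwise-complete points.

```python
import math

nan = math.nan


def nanmean(values):
    """Mean over the non-NaN entries (stand-in for a NaN-skipping mean)."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid)


def mask_where_other_missing(values, other):
    """Stand-in for `a.where(b.notnull())`: set entries to NaN where `other` is NaN."""
    return [nan if math.isnan(o) else v for v, o in zip(values, other)]


# Made-up series; each has a gap the other one does not have.
a = [1.0, 2.0, 3.0, nan, 5.0]
b = [2.0, nan, 6.0, 8.0, 11.0]

# The product is NaN wherever either input is NaN.
product = [x * y for x, y in zip(a, b)]

# Unmasked: each mean is taken over its own set of valid points.
cov_naive = nanmean(product) - nanmean(a) * nanmean(b)

# Masked: both means only see points where both series are valid.
a_masked = mask_where_other_missing(a, b)
b_masked = mask_where_other_missing(b, a)
cov_masked = nanmean(product) - nanmean(a_masked) * nanmean(b_masked)

# Reference: explicit (population) covariance on pairwise-complete points.
pairs = [(x, y) for x, y in zip(a, b) if not (math.isnan(x) or math.isnan(y))]
mean_x = sum(x for x, _ in pairs) / len(pairs)
mean_y = sum(y for _, y in pairs) / len(pairs)
cov_ref = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / len(pairs)

print(cov_naive, cov_masked, cov_ref)
```

On this data the masked variant agrees with the reference while the unmasked one does not, because the unmasked means include points that never enter the product term. The sketch uses a population (divide by N) covariance throughout; a `ddof` correction would be a separate step.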

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
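
As a rough illustration of how the row filter at the top of this page maps onto the schema, the sketch below loads the schema into an in-memory SQLite database with Python's standard `sqlite3` module and runs an equivalent query. The three comment ids and timestamps are taken from the table above; the fourth row (id `1`, user `42`) is a made-up decoy added only to show the filter working, and the exact SQL that the page generates is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema as listed above. SQLite does not enforce foreign keys by default,
# so the referenced [users] and [issues] tables are not needed for this sketch.
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [html_url] TEXT,
       [issue_url] TEXT,
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [user] INTEGER REFERENCES [users]([id]),
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [issue] INTEGER REFERENCES [issues]([id])
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    CREATE INDEX [idx_issue_comments_user]
        ON [issue_comments] ([user]);
    """
)

rows = [
    # (id, user, created_at, updated_at, issue)
    (850542572, 5700886, "2021-05-28T16:45:55Z", "2021-05-28T16:45:55Z", 904153867),
    (850819741, 5700886, "2021-05-29T11:48:02Z", "2021-05-29T11:48:02Z", 904153867),
    (850820173, 5700886, "2021-05-29T11:51:50Z", "2021-05-29T11:51:59Z", 904153867),
    # Hypothetical decoy row from a different user; not part of the real data.
    (1, 42, "2021-05-30T00:00:00Z", "2021-05-30T00:00:00Z", 904153867),
]
conn.executemany(
    "INSERT INTO issue_comments ([id], [user], [created_at], [updated_at], [issue]) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
query = """
    SELECT [id], [updated_at]
    FROM issue_comments
    WHERE [issue] = ? AND [user] = ?
    ORDER BY [updated_at] DESC
"""
result = conn.execute(query, (904153867, 5700886)).fetchall()

for comment_id, updated_at in result:
    print(comment_id, updated_at)
```

The two indexes on `[issue]` and `[user]` are what keep this kind of filter cheap on the full table; on four rows they make no measurable difference.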