html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4804#issuecomment-849813462,https://api.github.com/repos/pydata/xarray/issues/4804,849813462,MDEyOklzc3VlQ29tbWVudDg0OTgxMzQ2Mg==,2448579,2021-05-27T17:33:45Z,2021-05-27T17:33:45Z,MEMBER,"Reopening for the suggestions in https://github.com/pydata/xarray/issues/4804#issuecomment-760114285 cc @AndrewWilliams3142 if you're looking for a small followup PR with potentially large impact :)","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,785329941
https://github.com/pydata/xarray/issues/4804#issuecomment-767138669,https://api.github.com/repos/pydata/xarray/issues/4804,767138669,MDEyOklzc3VlQ29tbWVudDc2NzEzODY2OQ==,2448579,2021-01-25T21:57:03Z,2021-01-25T21:57:03Z,MEMBER,@kathoef we'd be happy to merge a PR with some of the suggestions proposed here.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,785329941
https://github.com/pydata/xarray/issues/4804#issuecomment-760114285,https://api.github.com/repos/pydata/xarray/issues/4804,760114285,MDEyOklzc3VlQ29tbWVudDc2MDExNDI4NQ==,5700886,2021-01-14T10:44:19Z,2021-01-14T10:44:19Z,CONTRIBUTOR,"I'd also add that https://github.com/pydata/xarray/blob/master/xarray/core/computation.py#L1320_L1330 which is essentially
```python
((x - x.mean()) * (y - y.mean())).mean()
```
is inferior to
```python
(x * y).mean() - x.mean() * y.mean()
```
because it leads to Dask holding all chunks of `x` in memory (see, e.g., https://github.com/dask/dask/issues/6674 for details).","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,785329941
https://github.com/pydata/xarray/issues/4804#issuecomment-760025539,https://api.github.com/repos/pydata/xarray/issues/4804,760025539,MDEyOklzc3VlQ29tbWVudDc2MDAyNTUzOQ==,12237157,2021-01-14T08:44:22Z,2021-01-14T08:44:22Z,CONTRIBUTOR,"Thanks for the suggestion with xr.align. My speculation is that xs.pearson_r is a bit faster because we first write the whole function in numpy and then pass it through xr.apply_ufunc. I think it therefore only works for xr but not dask.da","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,785329941
https://github.com/pydata/xarray/issues/4804#issuecomment-759780514,https://api.github.com/repos/pydata/xarray/issues/4804,759780514,MDEyOklzc3VlQ29tbWVudDc1OTc4MDUxNA==,10194086,2021-01-13T22:32:47Z,2021-01-14T01:15:02Z,MEMBER,"@aaronspring I had a quick look at your version - do you have an idea why it is faster? Does yours also work for dask arrays?
* In `a, b = xr.broadcast(a, b, exclude=dim)` why can you exclude `dim`?
* I think you could also use `a, b = xr.align(a, b, exclude=dim)` (`broadcast` has `join=""outer""` which fills it with `NA` which then get ignored; `align` uses `join=""inner""`)
* Does your version work if the weights contain `NA`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,785329941
https://github.com/pydata/xarray/issues/4804#issuecomment-759795213,https://api.github.com/repos/pydata/xarray/issues/4804,759795213,MDEyOklzc3VlQ29tbWVudDc1OTc5NTIxMw==,10194086,2021-01-13T22:52:19Z,2021-01-13T22:52:19Z,MEMBER,"Another possibility is to replace https://github.com/pydata/xarray/blob/cc53a77ff0c8aaf8686f0b0bd7f75985b74e2054/xarray/core/computation.py#L1327 with `xr.dot`. However, to do so, you need to replace `NA` with `0` (and I am not sure if that's worth it).
Also the `min_count` needs to be addressed (but that should not be too difficult).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,785329941
https://github.com/pydata/xarray/issues/4804#issuecomment-759767957,https://api.github.com/repos/pydata/xarray/issues/4804,759767957,MDEyOklzc3VlQ29tbWVudDc1OTc2Nzk1Nw==,12237157,2021-01-13T22:04:38Z,2021-01-13T22:04:38Z,CONTRIBUTOR,Your function from the notebook could also easily implement the builtin weighted function,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,785329941
https://github.com/pydata/xarray/issues/4804#issuecomment-759766466,https://api.github.com/repos/pydata/xarray/issues/4804,759766466,MDEyOklzc3VlQ29tbWVudDc1OTc2NjQ2Ng==,12237157,2021-01-13T22:01:49Z,2021-01-13T22:01:49Z,CONTRIBUTOR,"We implemented xr.corr as xs.pearson_r in https://xskillscore.readthedocs.io/en/stable/api/xskillscore.pearson_r.html#xskillscore.pearson_r and it’s ~30% faster than xr.corr; see #4768","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,785329941
https://github.com/pydata/xarray/issues/4804#issuecomment-759745055,https://api.github.com/repos/pydata/xarray/issues/4804,759745055,MDEyOklzc3VlQ29tbWVudDc1OTc0NTA1NQ==,10194086,2021-01-13T21:17:34Z,2021-01-13T21:17:34Z,MEMBER,"Yes `if not valid_values.all()` is not lazy. That's the same problem as #4541 and therefore #4559 can be an inspiration for how to tackle this. It would be good to test if the check also makes this slower for numpy arrays. Then it could also be removed entirely. That would be counter-intuitive for me, but it seems to be faster for dask arrays...

Other improvements
* I am not sure if `/=` avoids a copy but if so, that's also a possibility to make it faster.
* We could add a short-cut for `skipna=False` (would require adding this option) or dtypes that cannot have `NA` values as follows:

```python
if skipna:
    # 2. Ignore the nans
    valid_values = da_a.notnull() & da_b.notnull()

    if not valid_values.all():
        da_a = da_a.where(valid_values)
        da_b = da_b.where(valid_values)

    valid_count = valid_values.sum(dim) - ddof
else:
    # shortcut for skipna=False
    # da_a and da_b are aligned, so they have the same dims and shape
    axis = da_a.get_axis_num(dim)
    valid_count = np.take(da_a.shape, axis).prod() - ddof
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,785329941