issue_comments: 417816234

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/1115#issuecomment-417816234	https://api.github.com/repos/pydata/xarray/issues/1115	417816234	MDEyOklzc3VlQ29tbWVudDQxNzgxNjIzNA==	1217238	2018-08-31T23:55:06Z	2018-08-31T23:55:06Z	MEMBER	I tend to view the second case as a generalization of the first case. I would also hesitate to implement the `n x m` array -> `m x m` correlation matrix version because xarray doesn't handle repeated dimensions well. I think the basic implementation of this looks quite similar to what I wrote here for calculating the Pearson correlation as a NumPy gufunc: http://xarray.pydata.org/en/stable/dask.html#automatic-parallelization The main difference is that we might naturally want to support summing over multiple dimensions at once via the `dim` argument, e.g., something like: ```python # untested! def covariance(x, y, dim=None): return xarray.dot(x - x.mean(dim), y - y.mean(dim), dim=dim) def correlation(x, y, dim=None): # dim should default to the intersection of x.dims and y.dims return covariance(x, y, dim) / (x.std(dim) * y.std(dim)) ``` If you want to achieve the equivalent of `np.corrcoef` on an array with dimensions `('n', 'm')` with this, you just write something like `correlation(x, x.rename({'m': 'm2'}), dim='n')`.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		188996339
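
The snippet in the comment is explicitly marked as untested. As written, `xarray.dot` returns a sum of products rather than a mean, so dividing by the two standard deviations would overshoot the Pearson coefficient by the number of samples. Below is a minimal runnable sketch of the same idea, not the comment author's final implementation. It swaps `xarray.dot` for an elementwise product followed by `.mean(dim)`, so the covariance is normalized the same way as `.std(dim)` (which defaults to `ddof=0`). It assumes `dim` is given explicitly and is present in both inputs; the sample data and the `m2` name are illustrative only.

```python
import numpy as np
import xarray as xr


def covariance(x, y, dim):
    """Population covariance of x and y over `dim` (one name or a list of names)."""
    # Demean each input over `dim`, multiply (broadcasting over any
    # dimensions the two arrays do not share), then average over `dim`.
    return ((x - x.mean(dim)) * (y - y.mean(dim))).mean(dim)


def correlation(x, y, dim):
    """Pearson correlation of x and y over `dim`."""
    # .std() uses ddof=0 by default, matching the .mean() normalization above.
    return covariance(x, y, dim) / (x.std(dim) * y.std(dim))


# Illustrative data: 50 samples along 'n', 3 variables along 'm'.
rng = np.random.default_rng(0)
x = xr.DataArray(rng.normal(size=(50, 3)), dims=("n", "m"))

# Renaming 'm' to 'm2' on the second argument avoids a repeated dimension,
# so the result is a matrix with dimensions ('m', 'm2').
corr_matrix = correlation(x, x.rename({"m": "m2"}), dim="n")
```

Under these assumptions `corr_matrix` should agree with `np.corrcoef(x.values, rowvar=False)`, and passing a list such as `dim=["a", "b"]` reduces over several dimensions at once.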
