html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1115#issuecomment-589154086,https://api.github.com/repos/pydata/xarray/issues/1115,589154086,MDEyOklzc3VlQ29tbWVudDU4OTE1NDA4Ng==,5635139,2020-02-20T16:00:56Z,2020-02-20T16:00:56Z,MEMBER,"@r-beer I checked back on this and realized I didn't reply to your question: yes re completing #2652, if you're up for giving this a push","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-555564450,https://api.github.com/repos/pydata/xarray/issues/1115,555564450,MDEyOklzc3VlQ29tbWVudDU1NTU2NDQ1MA==,5635139,2019-11-19T15:39:17Z,2019-11-19T15:39:17Z,MEMBER,@r-beer would be great to finish this off! I think this would be a popular feature. You could take @hrishikeshac 's code (which is close!) and make the final changes.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-546176175,https://api.github.com/repos/pydata/xarray/issues/1115,546176175,MDEyOklzc3VlQ29tbWVudDU0NjE3NjE3NQ==,5635139,2019-10-25T02:38:40Z,2019-10-25T02:38:40Z,MEMBER,"Would be great to get this in, if anyone wants to have a go. A small, focused, PR would be a good start.
In the meantime you can use one of the solutions above...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-451589152,https://api.github.com/repos/pydata/xarray/issues/1115,451589152,MDEyOklzc3VlQ29tbWVudDQ1MTU4OTE1Mg==,5635139,2019-01-04T22:35:15Z,2019-01-04T22:35:15Z,MEMBER,"@hrishikeshac that looks great! Well done for getting an MVP running.
Do you want to do a PR from this? Should be v close from here.
Others can comment from there. I'd suggest we get something close to this in and iterate from there. How abstract do we want the dimensions to be? (Currently we can only pass one dimension in, which is fine, but potentially we could enable multiple.)
One nit - no need to use `np.sum` - that may cause issues with dask arrays - `.sum` will work fine","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-445388428,https://api.github.com/repos/pydata/xarray/issues/1115,445388428,MDEyOklzc3VlQ29tbWVudDQ0NTM4ODQyOA==,5635139,2018-12-07T22:42:57Z,2018-12-07T22:42:57Z,MEMBER,"Yes for useful, but not sure whether they should be on the same method. They're also fairly easy for a user to construct (call correlation on a `.shift` copy of the array).
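A minimal sketch of the `.shift` approach, for illustration only. The `correlation` helper, the random test data and the `lag` value are assumptions made up for this example, not an existing xarray API:

```python
import numpy as np
import xarray as xr


def correlation(x, y, dim):
    # Hypothetical helper: Pearson correlation over positions where both inputs are valid.
    valid = x.notnull() & y.notnull()
    x = x.where(valid)
    y = y.where(valid)
    cov = ((x - x.mean(dim)) * (y - y.mean(dim))).mean(dim)
    return cov / (x.std(dim) * y.std(dim))


rng = np.random.default_rng(0)
signal = xr.DataArray(rng.normal(size=200), dims='time')
lag = 3  # assumed lag, in steps along 'time'

# delayed[t] == signal[t - lag]; the first `lag` values become NaN.
delayed = signal.shift(time=lag)

# Shifting the delayed copy back by `lag` re-aligns it with the original,
# so the correlation at this lag is (numerically) one.
lagged_corr = correlation(signal, delayed.shift(time=-lag), dim='time')
```

Scanning over several candidate lags is then a plain loop over `lag` values.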
And increments are easy to build on! I'm the worst offender, but don't let completeness get in the way of incremental improvement
(OK, I'll go and finish the `fill_value` branch...)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-445386281,https://api.github.com/repos/pydata/xarray/issues/1115,445386281,MDEyOklzc3VlQ29tbWVudDQ0NTM4NjI4MQ==,2448579,2018-12-07T22:32:50Z,2018-12-07T22:32:50Z,MEMBER,I think lagged correlations would be a useful feature.,"{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-442858136,https://api.github.com/repos/pydata/xarray/issues/1115,442858136,MDEyOklzc3VlQ29tbWVudDQ0Mjg1ODEzNg==,1197350,2018-11-29T14:43:13Z,2018-11-29T14:43:13Z,MEMBER,Hey @hrishikeshac -- any progress on this? Need any help / advice from xarray devs?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-438370603,https://api.github.com/repos/pydata/xarray/issues/1115,438370603,MDEyOklzc3VlQ29tbWVudDQzODM3MDYwMw==,5635139,2018-11-13T17:51:56Z,2018-11-13T17:51:56Z,MEMBER,"And one that handles `NaN`s:
```python
# untested!
import xarray as xr

def covariance(x, y, dim=None):
    # count only positions where both x and y are non-null
    valid_values = x.notnull() & y.notnull()
    valid_count = valid_values.sum(dim)
    # fill NaNs with 0 after demeaning so they drop out of the dot product
    demeaned_x = (x - x.mean(dim)).fillna(0)
    demeaned_y = (y - y.mean(dim)).fillna(0)
    return xr.dot(demeaned_x, demeaned_y, dims=dim) / valid_count

def correlation(x, y, dim=None):
    # dim should default to the intersection of x.dims and y.dims
    return covariance(x, y, dim) / (x.std(dim) * y.std(dim))
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-436784481,https://api.github.com/repos/pydata/xarray/issues/1115,436784481,MDEyOklzc3VlQ29tbWVudDQzNjc4NDQ4MQ==,5635139,2018-11-07T21:31:12Z,2018-11-07T21:31:18Z,MEMBER,"For posterity, I made a small adjustment to @shoyer 's draft:
```python
# untested!
import xarray as xr

def covariance(x, y, dim=None):
    # need to ensure the dim lengths are the same - i.e. no auto-aligning
    # could use count-1 for sample
    return xr.dot(x - x.mean(dim), y - y.mean(dim), dims=dim) / x.count(dim)

def correlation(x, y, dim=None):
    # dim should default to the intersection of x.dims and y.dims
    return covariance(x, y, dim) / (x.std(dim) * y.std(dim))
```","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-419519217,https://api.github.com/repos/pydata/xarray/issues/1115,419519217,MDEyOklzc3VlQ29tbWVudDQxOTUxOTIxNw==,5635139,2018-09-07T17:59:55Z,2018-09-07T17:59:55Z,MEMBER,Great! Ping me / the issues with any questions at all!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-418530212,https://api.github.com/repos/pydata/xarray/issues/1115,418530212,MDEyOklzc3VlQ29tbWVudDQxODUzMDIxMg==,5635139,2018-09-04T21:52:22Z,2018-09-04T21:52:22Z,MEMBER,"@hrishikeshac if you'd like to contribute, we can help you along - xarray is a v welcoming project!
And from `mvstats` it looks like you're already up to speed
Let us know","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-417816234,https://api.github.com/repos/pydata/xarray/issues/1115,417816234,MDEyOklzc3VlQ29tbWVudDQxNzgxNjIzNA==,1217238,2018-08-31T23:55:06Z,2018-08-31T23:55:06Z,MEMBER,"I tend to view the second case as a generalization of the first case. I would also hesitate to implement the `n x m` array -> `m x m` correlation matrix version because xarray doesn't handle repeated dimensions well.
I think the basic implementation of this looks quite similar to what I wrote here for calculating the Pearson correlation as a NumPy gufunc:
http://xarray.pydata.org/en/stable/dask.html#automatic-parallelization
The main difference is that we might naturally want to support summing over multiple dimensions at once via the `dim` argument, e.g., something like:
```python
# untested!
import xarray

def covariance(x, y, dim=None):
    return xarray.dot(x - x.mean(dim), y - y.mean(dim), dim=dim)

def correlation(x, y, dim=None):
    # dim should default to the intersection of x.dims and y.dims
    return covariance(x, y, dim) / (x.std(dim) * y.std(dim))
```
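As a runnable illustration of reducing over several dimensions at once, here is a hedged sketch. It uses an elementwise product and `.mean` rather than `xarray.dot`, so it does not depend on how the dimension keyword is spelled, and it assumes there are no NaNs. The array names and shapes are made up for the example:

```python
import numpy as np
import xarray as xr


def correlation(x, y, dim):
    # Population covariance over `dim`, divided by the product of the
    # population standard deviations (xarray's default ddof is 0).
    cov = ((x - x.mean(dim)) * (y - y.mean(dim))).mean(dim)
    return cov / (x.std(dim) * y.std(dim))


rng = np.random.default_rng(0)
a = xr.DataArray(rng.normal(size=(4, 5, 6)), dims=('member', 'lat', 'lon'))
b = xr.DataArray(rng.normal(size=(5, 6)), dims=('lat', 'lon'))

# Reduce over both spatial dimensions in one call; 'member' survives.
r = correlation(a, b, dim=['lat', 'lon'])

# Cross-check the first member against NumPy on flattened data.
expected = np.corrcoef(a.values[0].ravel(), b.values.ravel())[0, 1]
```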
If you want to achieve the equivalent of `np.corrcoef` on an array with dimensions `('n', 'm')` with this, you just write something like `correlation(x, x.rename({'m': 'm2'}), dim='n')`.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-417802624,https://api.github.com/repos/pydata/xarray/issues/1115,417802624,MDEyOklzc3VlQ29tbWVudDQxNzgwMjYyNA==,5635139,2018-08-31T22:14:19Z,2018-08-31T22:14:19Z,MEMBER,"I'm up for adding `.corr` to xarray
What do we want this to look like? It's a bit different from most xarray functions, which either return the same shape or reduce one dimension.
- The basic case here would take a `n x m` array and return an `m x m` correlation matrix. We could easily wrap https://docs.scipy.org/doc/numpy/reference/generated/numpy.corrcoef.html
- Another case would be to take two similarly sized arrays (with the option of broadcasting) and return an array with one dimension reduced. For example `200 x 10` and `200`, return a `10` array.
- I need to think about how those extrapolate to multiple dimensions
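To make the first two shapes concrete, a small NumPy-only sketch (the sizes come from the example in the list; the variable names and random data are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 10))  # n x m: 200 observations of 10 variables
series = rng.normal(size=200)      # a single series of length n

# Case 1: full m x m correlation matrix.
# rowvar=False tells corrcoef that columns, not rows, are the variables.
corr_matrix = np.corrcoef(data, rowvar=False)

# Case 2: correlate every column with the series, reducing the length-200 axis.
d = data - data.mean(axis=0)
s = series - series.mean()
corr_vector = (d * s[:, None]).mean(axis=0) / (data.std(axis=0) * series.std())
```

Here `corr_matrix` has shape `(10, 10)` and `corr_vector` has shape `(10,)`.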
Should I start with the first case and then we can expand as needed?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-266525792,https://api.github.com/repos/pydata/xarray/issues/1115,266525792,MDEyOklzc3VlQ29tbWVudDI2NjUyNTc5Mg==,10050469,2016-12-12T19:23:48Z,2016-12-12T19:25:29Z,MEMBER,"I'll chime in here to ask a usage question: what is the recommended way to compute correlation maps with xarray? I.e. I have a dataarray of dims ``(time, lat, lon)`` and I'd like to correlate every single grid point with a timeseries of dim ``(time)`` to get a correlation map of dim ``(lat, lon)``. My current strategy is a wonderfully unpythonic double loop over lons and lats, and I wonder if there's a better way?","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-260387059,https://api.github.com/repos/pydata/xarray/issues/1115,260387059,MDEyOklzc3VlQ29tbWVudDI2MDM4NzA1OQ==,1197350,2016-11-14T16:37:05Z,2016-11-14T16:37:05Z,MEMBER,"To be clear, I am not saying that this does not belong in xarray.
I'm saying that we lack clear general guidelines for how to determine whether a particular function belongs in xarray. The criterion of a ""pretty fundamental operation for working with data"" is a good starting point. I would add:
- used across a wide range of scientific disciplines
- clear, unambiguous / uncontroversial definition
- numpy implementation already exists
`corr` meets all of these criteria. Many others (e.g. interpolation, convolution, curve fitting) do as well. Expanding xarray beyond the numpy ufuncs opens the door to supporting these things. I'm just saying it should be a conscious, deliberate decision, given the limits on developer time.
Many of these things will be pretty trivial once `.apply()` is here. So perhaps it's not a big deal.
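For illustration, a sketch of wrapping a plain NumPy routine this way, assuming the `.apply()` mentioned above corresponds to what xarray exposes as `xr.apply_ufunc`. The `pearson_np` and `corr` helpers and the array shapes are hypothetical:

```python
import numpy as np
import xarray as xr


def pearson_np(x, y):
    # Plain NumPy: correlation along the last axis, broadcasting over the rest.
    xd = x - x.mean(axis=-1, keepdims=True)
    yd = y - y.mean(axis=-1, keepdims=True)
    return (xd * yd).mean(axis=-1) / (x.std(axis=-1) * y.std(axis=-1))


def corr(x, y, dim):
    # apply_ufunc moves `dim` to the last axis of each input before calling
    # pearson_np, then labels the result with the remaining dimensions.
    return xr.apply_ufunc(pearson_np, x, y, input_core_dims=[[dim], [dim]])


rng = np.random.default_rng(0)
field = xr.DataArray(rng.normal(size=(50, 3, 4)), dims=('time', 'lat', 'lon'))
index = xr.DataArray(rng.normal(size=50), dims='time')

# One correlation value per grid point, with no explicit loop over lat and lon.
corr_map = corr(field, index, dim='time')
```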
","{""total_count"": 7, ""+1"": 7, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-260382091,https://api.github.com/repos/pydata/xarray/issues/1115,260382091,MDEyOklzc3VlQ29tbWVudDI2MDM4MjA5MQ==,1217238,2016-11-14T16:20:14Z,2016-11-14T16:20:14Z,MEMBER,"That said, correlation coefficients are a pretty fundamental operation for working with data. I could see implementing a basic `corr` in xarray and referring to a separate signal processing package for more options in the docstring.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-260333267,https://api.github.com/repos/pydata/xarray/issues/1115,260333267,MDEyOklzc3VlQ29tbWVudDI2MDMzMzI2Nw==,1197350,2016-11-14T13:24:10Z,2016-11-14T13:24:10Z,MEMBER,"I agree this would be very useful. But it is also feature creep. There is an extremely wide range of such functions that could hypothetically be put into the xarray package. (all of [scipy.signal](https://docs.scipy.org/doc/scipy-0.18.1/reference/signal.html) for example) At some point the community should decide what is the intended scope of xarray itself vs. packages built on top of xarray.
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339
https://github.com/pydata/xarray/issues/1115#issuecomment-260219462,https://api.github.com/repos/pydata/xarray/issues/1115,260219462,MDEyOklzc3VlQ29tbWVudDI2MDIxOTQ2Mg==,1217238,2016-11-13T22:57:02Z,2016-11-13T22:57:02Z,MEMBER,"The first step here is to find a library that implements the desired functionality on pure NumPy arrays, ideally in a vectorized fashion. Then it should be pretty straightforward to wrap in xarray.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339