html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/2652#issuecomment-633652366,https://api.github.com/repos/pydata/xarray/issues/2652,633652366,MDEyOklzc3VlQ29tbWVudDYzMzY1MjM2Ng==,5635139,2020-05-25T16:57:28Z,2020-05-25T16:57:28Z,MEMBER,@hrishikeshac in case you come back to see this: thank you for taking it so far; your code was helpful in eventually getting this feature in. And we'd of course appreciate any additional contributions. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-556046073,https://api.github.com/repos/pydata/xarray/issues/2652,556046073,MDEyOklzc3VlQ29tbWVudDU1NjA0NjA3Mw==,5635139,2019-11-20T15:11:27Z,2019-11-20T15:11:27Z,MEMBER,@r-beer great—you were right to start your own PR,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-555915343,https://api.github.com/repos/pydata/xarray/issues/2652,555915343,MDEyOklzc3VlQ29tbWVudDU1NTkxNTM0Mw==,45787861,2019-11-20T09:20:39Z,2019-11-20T09:30:31Z,NONE,"Alright, I have done so and changed `basestrings` into `str`; now the tests (mostly) pass locally.
```bash
1 failed, 6539 passed, 1952 skipped, 37 xfailed, 31 warnings in 86.34s
```
**A general question concerning collaboration on existing PRs:**
Should I push the changes to my fork and then create a new PR that replaces @hrishikeshac's PR, or shall I push to @hrishikeshac's fork and see whether it works to update the existing PR?
Or is there another option?
PS: Permission to push to hrishikeshac:corr is denied for me. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-555756460,https://api.github.com/repos/pydata/xarray/issues/2652,555756460,MDEyOklzc3VlQ29tbWVudDU1NTc1NjQ2MA==,5635139,2019-11-19T23:02:37Z,2019-11-19T23:02:37Z,MEMBER,"Yeah, those are pretty normal. Your suggestions look right: you can keep both in both sets and eliminate any duplicate imports in the first.
(FYI I edited your comment so it displayed properly; it seems you need a line break after ``)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-555745623,https://api.github.com/repos/pydata/xarray/issues/2652,555745623,MDEyOklzc3VlQ29tbWVudDU1NTc0NTYyMw==,45787861,2019-11-19T22:27:10Z,2019-11-19T23:00:31Z,NONE,"Alright, I only got two merge conflicts in dataarray.py:
First, a minor merge conflict concerning the imports:
1. `accessors` -> `accessor_dt`
2. `broadcast` has been dropped in master?
```python
<<<<<<< HEAD
from . import (
computation,
dtypes,
groupby,
indexing,
ops,
pdcompat,
resample,
rolling,
utils,
)
from .accessor_dt import DatetimeAccessor
from .accessor_str import StringAccessor
from .alignment import (
_broadcast_helper,
_get_broadcast_dims_map_common_coords,
align,
reindex_like_indexers,
)
=======
from .accessors import DatetimeAccessor
from .alignment import align, reindex_like_indexers, broadcast
>>>>>>> added da.corr() and da.cov() to dataarray.py. Test added in test_dataarray.py, and tested using pytest.
```
Second, some bigger merge conflicts concerning some of DataArray's methods, though they don't actually seem to conflict with each other:
1. `integrate()`, `unify_chunks()` and `map_blocks` added in master
2. `cov()` and `corr()` added in corr branch
3. It seems like `str = property(StringAccessor)` should be at the end of dataarray's definition, for mypy reasons...
```python
<<<<<<< HEAD
def integrate(
self, dim: Union[Hashable, Sequence[Hashable]], datetime_unit: str = None
) -> ""DataArray"":
"""""" integrate the array with the trapezoidal rule.
.. note::
This feature is limited to simple cartesian geometry, i.e. dim
must be one dimensional.
Parameters
----------
dim: hashable, or a sequence of hashable
Coordinate(s) used for the integration.
datetime_unit: str, optional
Can be used to specify the unit if datetime coordinate is used.
One of {'Y', 'M', 'W', 'D', 'h', 'm', 's', 'ms', 'us', 'ns', 'ps',
'fs', 'as'}
Returns
-------
integrated: DataArray
See also
--------
numpy.trapz: corresponding numpy function
Examples
--------
>>> da = xr.DataArray(np.arange(12).reshape(4, 3), dims=['x', 'y'],
... coords={'x': [0, 0.1, 1.1, 1.2]})
>>> da
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
Coordinates:
* x (x) float64 0.0 0.1 1.1 1.2
Dimensions without coordinates: y
>>>
>>> da.integrate('x')
array([5.4, 6.6, 7.8])
Dimensions without coordinates: y
""""""
ds = self._to_temp_dataset().integrate(dim, datetime_unit)
return self._from_temp_dataset(ds)
def unify_chunks(self) -> ""DataArray"":
"""""" Unify chunk size along all chunked dimensions of this DataArray.
Returns
-------
DataArray with consistent chunk sizes for all dask-array variables
See Also
--------
dask.array.core.unify_chunks
""""""
ds = self._to_temp_dataset().unify_chunks()
return self._from_temp_dataset(ds)
def map_blocks(
self,
func: ""Callable[..., T_DSorDA]"",
args: Sequence[Any] = (),
kwargs: Mapping[str, Any] = None,
) -> ""T_DSorDA"":
""""""
Apply a function to each chunk of this DataArray. This method is experimental
and its signature may change.
Parameters
----------
func: callable
User-provided function that accepts a DataArray as its first parameter. The
function will receive a subset of this DataArray, corresponding to one chunk
along each chunked dimension. ``func`` will be executed as
``func(obj_subset, *args, **kwargs)``.
The function will be first run on mocked-up data, that looks like this array
but has sizes 0, to determine properties of the returned object such as
dtype, variable names, new dimensions and new indexes (if any).
This function must return either a single DataArray or a single Dataset.
This function cannot change size of existing dimensions, or add new chunked
dimensions.
args: Sequence
Passed verbatim to func after unpacking, after the sliced DataArray. xarray
objects, if any, will not be split by chunks. Passing dask collections is
not allowed.
kwargs: Mapping
Passed verbatim to func after unpacking. xarray objects, if any, will not be
split by chunks. Passing dask collections is not allowed.
Returns
-------
A single DataArray or Dataset with dask backend, reassembled from the outputs of
the function.
Notes
-----
This method is designed for when one needs to manipulate a whole xarray object
within each chunk. In the more common case where one can work on numpy arrays,
it is recommended to use apply_ufunc.
If none of the variables in this DataArray is backed by dask, calling this
method is equivalent to calling ``func(self, *args, **kwargs)``.
See Also
--------
dask.array.map_blocks, xarray.apply_ufunc, xarray.map_blocks,
xarray.Dataset.map_blocks
""""""
from .parallel import map_blocks
return map_blocks(func, self, args, kwargs)
# this needs to be at the end, or mypy will confuse with `str`
# https://mypy.readthedocs.io/en/latest/common_issues.html#dealing-with-conflicting-names
str = property(StringAccessor)
=======
def cov(self, other, dim = None):
""""""Compute covariance between two DataArray objects along a shared dimension.
Parameters
----------
other: DataArray
The other array with which the covariance will be computed
dim: The dimension along which the covariance will be computed
Returns
-------
covariance: DataArray
""""""
# 1. Broadcast the two arrays
self, other = broadcast(self, other)
# 2. Ignore the nans
valid_values = self.notnull() & other.notnull()
self = self.where(valid_values, drop=True)
other = other.where(valid_values, drop=True)
valid_count = valid_values.sum(dim)
#3. Compute mean and standard deviation along the given dim
demeaned_self = self - self.mean(dim = dim)
demeaned_other = other - other.mean(dim = dim)
#4. Compute covariance along the given dim
cov = (demeaned_self*demeaned_other).sum(dim=dim)/(valid_count)
return cov
def corr(self, other, dim = None):
""""""Compute correlation between two DataArray objects along a shared dimension.
Parameters
----------
other: DataArray
The other array with which the correlation will be computed
dim: The dimension along which the correlation will be computed
Returns
-------
correlation: DataArray
""""""
# 1. Broadcast the two arrays
self, other = broadcast(self, other)
# 2. Ignore the nans
valid_values = self.notnull() & other.notnull()
self = self.where(valid_values, drop=True)
other = other.where(valid_values, drop=True)
# 3. Compute correlation based on standard deviations and cov()
self_std = self.std(dim=dim)
other_std = other.std(dim=dim)
return self.cov(other, dim = dim)/(self_std*other_std)
>>>>>>> added da.corr() and da.cov() to dataarray.py. Test added in test_dataarray.py, and tested using pytest.
```
Could you please comment on my suggested changes (accepting either the changes from master, or both sides where they don't conflict)?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-555738102,https://api.github.com/repos/pydata/xarray/issues/2652,555738102,MDEyOklzc3VlQ29tbWVudDU1NTczODEwMg==,5635139,2019-11-19T22:06:49Z,2019-11-19T22:06:49Z,MEMBER,Yes 100%! Let me know if that doesn't work!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-555737000,https://api.github.com/repos/pydata/xarray/issues/2652,555737000,MDEyOklzc3VlQ29tbWVudDU1NTczNzAwMA==,45787861,2019-11-19T22:03:46Z,2019-11-19T22:03:46Z,NONE,">
>
> @max-sixty, thanks for the fast response!
>
> Yeah, I get the traceback and already started diving into it. However, I assumed that @hrishikeshac's branch ""corr"" wasn't up-to-date.
>
> Shall I merge changes from master or develop into corr, before looking further into the tests?
I read http://xarray.pydata.org/en/stable/contributing.html; is this identical to contributing.rst?
Following those guidelines, I would use the following commands to ""retrieve the changes from the master branch"":
```bash
git fetch upstream
git rebase upstream/master
```
Where upstream = https://github.com/pydata/xarray.git?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-555734331,https://api.github.com/repos/pydata/xarray/issues/2652,555734331,MDEyOklzc3VlQ29tbWVudDU1NTczNDMzMQ==,45787861,2019-11-19T21:57:02Z,2019-11-19T21:57:02Z,NONE,"@max-sixty, thanks for the fast response!
Yeah, I get the traceback and already started diving into it. However, I assumed that @hrishikeshac's branch ""corr"" wasn't up-to-date.
Shall I merge changes from master or develop into corr, before looking further into the tests?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-555733366,https://api.github.com/repos/pydata/xarray/issues/2652,555733366,MDEyOklzc3VlQ29tbWVudDU1NTczMzM2Ng==,5635139,2019-11-19T21:54:26Z,2019-11-19T21:54:26Z,MEMBER,"Great @r-beer, we can help get you up & running.
Given this branch has diverged from master, I would make your own fork and merge in master; it looks like you'll have some minor conflicts (more details in our contributing.rst docs, or post here if confused). You can then open up your own draft PR.
Re the tests: pytest should print a list of the tests that failed and their stack traces; do you not see anything?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-555730897,https://api.github.com/repos/pydata/xarray/issues/2652,555730897,MDEyOklzc3VlQ29tbWVudDU1NTczMDg5Nw==,45787861,2019-11-19T21:48:07Z,2019-11-19T21:48:07Z,NONE,"Dear @Hoeze, I will (try to) finalize this merge request, as I am also very interested in this functionality.
I am new to xarray and to contributing. I downloaded @hrishikeshac's code and ran the pytest tests locally.
I get `17 failed, 2088 passed, 2896 skipped, 7 xfailed, 754 warnings in 49.95s`.
Is there an elegant way to share ""which tests failed where"", so that I avoid trying to fix tests that might already have been fixed in other branches?
In the meantime, I will start working out why the tests fail and try to fix them.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-511165269,https://api.github.com/repos/pydata/xarray/issues/2652,511165269,MDEyOklzc3VlQ29tbWVudDUxMTE2NTI2OQ==,1200058,2019-07-14T01:17:54Z,2019-07-14T01:17:54Z,NONE,Is this pull request still up to date?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-452782113,https://api.github.com/repos/pydata/xarray/issues/2652,452782113,MDEyOklzc3VlQ29tbWVudDQ1Mjc4MjExMw==,6334793,2019-01-09T17:32:12Z,2019-01-09T17:32:12Z,NONE,"> I also think making this a function is probably a good idea, even though it's different from pandas.
>
> One question: how should these functions align their arguments? Recall that xarray does an `inner` join for arithmetic (though there's an option to control this), and an `outer` join in most other cases. It's not entirely obvious to me what the right choice is here (or if it really even matters).
I always assumed an `inner` join is the way to go. I had initially just implemented `align`, but later changed to `broadcast`, since `align` doesn't add dimensions/labels (if missing in one of the inputs) to the output, whereas `broadcast` does. Without this, the `where(valid_values)` doesn't work if one input is 1-D and the other is N-D.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
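To make the `align` vs `broadcast` distinction above concrete, here is a minimal sketch (not part of the PR; the array names, dims, and shapes are purely illustrative assumptions) showing that `align` keeps a 1-D input 1-D while `broadcast` adds the missing dimension:
```python
import numpy as np
import xarray as xr

# Illustrative data (assumed for this sketch): `a` is 2-D, `b` is 1-D
# along the shared 'time' dimension.
a = xr.DataArray(np.random.randn(4, 3), dims=["time", "space"],
                 coords={"time": np.arange(4)})
b = xr.DataArray(np.random.randn(4), dims=["time"],
                 coords={"time": np.arange(4)})

# align() only reconciles coordinates along shared dimensions; `b` stays 1-D.
a_al, b_al = xr.align(a, b, join="inner")
print(b_al.dims)   # ('time',)

# broadcast() additionally adds the missing 'space' dimension to `b`, so an
# element-wise mask like (a.notnull() & b.notnull()) lines up with both arrays.
a_bc, b_bc = xr.broadcast(a, b)
print(b_bc.dims)   # ('time', 'space')
```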
https://github.com/pydata/xarray/pull/2652#issuecomment-452014148,https://api.github.com/repos/pydata/xarray/issues/2652,452014148,MDEyOklzc3VlQ29tbWVudDQ1MjAxNDE0OA==,1217238,2019-01-07T17:29:45Z,2019-01-07T17:29:45Z,MEMBER,"I agree that the case for DataArray.dot is questionable. It sort of makes
sense because numpy and pandas both have it as a method, but the @ operator
is really a cleaner way to express this now that we're Python 3 only.
(Speaking of which, why don't we support @ in xarray yet? :).)
On Mon, Jan 7, 2019 at 1:43 AM Keisuke Fujii wrote:
> @max-sixty
> I am not sure whether DataArray.dot is a right choice. But I am wondering
> for cov case, it sounds like to compute a covariance of the DataArray
> itself rather than the cross covariance with another DataArray.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-451877705,https://api.github.com/repos/pydata/xarray/issues/2652,451877705,MDEyOklzc3VlQ29tbWVudDQ1MTg3NzcwNQ==,6815844,2019-01-07T09:43:17Z,2019-01-07T09:43:17Z,MEMBER,"@max-sixty
I am not sure whether `DataArray.dot` is the right choice. But for the `cov` case, I am wondering whether it sounds like computing the covariance of the DataArray itself rather than the cross-covariance with another DataArray.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-451840629,https://api.github.com/repos/pydata/xarray/issues/2652,451840629,MDEyOklzc3VlQ29tbWVudDQ1MTg0MDYyOQ==,5635139,2019-01-07T07:04:46Z,2019-01-07T07:04:46Z,MEMBER,"> Probably a function like xr.cov(x, y) is better than a method?
We should be more concerned with correctness than consistency - but is having `DataArray.dot` consistent with `DataArray.cov`? If not, what's the salient difference?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-451703221,https://api.github.com/repos/pydata/xarray/issues/2652,451703221,MDEyOklzc3VlQ29tbWVudDQ1MTcwMzIyMQ==,1217238,2019-01-06T00:05:49Z,2019-01-06T00:05:49Z,MEMBER,"I also think making this a function is probably a good idea, even though it's different from pandas.
One question: how should these functions align their arguments? Recall that xarray does an `inner` join for arithmetic (though there's an option to control this), and an `outer` join in most other cases. It's not entirely obvious to me what the right choice is here (or if it really even matters).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-451661466,https://api.github.com/repos/pydata/xarray/issues/2652,451661466,MDEyOklzc3VlQ29tbWVudDQ1MTY2MTQ2Ng==,2448579,2019-01-05T14:44:28Z,2019-01-05T14:44:28Z,MEMBER,I agree with @fujiisoup,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-451658613,https://api.github.com/repos/pydata/xarray/issues/2652,451658613,MDEyOklzc3VlQ29tbWVudDQ1MTY1ODYxMw==,6815844,2019-01-05T14:07:30Z,2019-01-05T14:07:30Z,MEMBER,"I am a little worried that users could misunderstand `cov` as computing the (auto-)covariance rather than the cross-covariance, which is what we are implementing here.
Probably a function like `xr.cov(x, y)` is better than a method?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-451602256,https://api.github.com/repos/pydata/xarray/issues/2652,451602256,MDEyOklzc3VlQ29tbWVudDQ1MTYwMjI1Ng==,6334793,2019-01-04T23:44:10Z,2019-01-04T23:44:10Z,NONE,Made the code PEP8 compatible. Apologies for not doing so earlier. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183
https://github.com/pydata/xarray/pull/2652#issuecomment-451600030,https://api.github.com/repos/pydata/xarray/issues/2652,451600030,MDEyOklzc3VlQ29tbWVudDQ1MTYwMDAzMA==,24736507,2019-01-04T23:30:48Z,2019-01-04T23:39:41Z,NONE,"Hello @hrishikeshac! Thanks for updating the PR.
- In the file [`xarray/core/dataarray.py`](https://github.com/pydata/xarray/blob/17c1181d2a7a7b6739eeb44fad74f4b5610362e9/xarray/core/dataarray.py), following are the PEP8 issues :
> [Line 2417:80](https://github.com/pydata/xarray/blob/17c1181d2a7a7b6739eeb44fad74f4b5610362e9/xarray/core/dataarray.py#L2417): [E501](https://duckduckgo.com/?q=pep8%20E501) line too long (85 > 79 characters)
> [Line 2448:80](https://github.com/pydata/xarray/blob/17c1181d2a7a7b6739eeb44fad74f4b5610362e9/xarray/core/dataarray.py#L2448): [E501](https://duckduckgo.com/?q=pep8%20E501) line too long (86 > 79 characters)
- In the file [`xarray/tests/test_dataarray.py`](https://github.com/pydata/xarray/blob/17c1181d2a7a7b6739eeb44fad74f4b5610362e9/xarray/tests/test_dataarray.py), following are the PEP8 issues :
> [Line 3312:80](https://github.com/pydata/xarray/blob/17c1181d2a7a7b6739eeb44fad74f4b5610362e9/xarray/tests/test_dataarray.py#L3312): [E501](https://duckduckgo.com/?q=pep8%20E501) line too long (92 > 79 characters)
> [Line 3313:80](https://github.com/pydata/xarray/blob/17c1181d2a7a7b6739eeb44fad74f4b5610362e9/xarray/tests/test_dataarray.py#L3313): [E501](https://duckduckgo.com/?q=pep8%20E501) line too long (89 > 79 characters)
> [Line 3342:80](https://github.com/pydata/xarray/blob/17c1181d2a7a7b6739eeb44fad74f4b5610362e9/xarray/tests/test_dataarray.py#L3342): [E501](https://duckduckgo.com/?q=pep8%20E501) line too long (100 > 79 characters)
##### Comment last updated on January 04, 2019 at 23:39 Hours UTC","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183