html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/3550#issuecomment-557895005,https://api.github.com/repos/pydata/xarray/issues/3550,557895005,MDEyOklzc3VlQ29tbWVudDU1Nzg5NTAwNQ==,45787861,2019-11-24T14:41:44Z,2019-11-24T14:41:44Z,NONE,"I have added several test cases, almost all pass. The ones that don't pass are related to `dim=""time""` and incompletely implemented `pandas_cov` and `pandas_corr`, that currently don't have a `dim` parameter. I had a look at `np.cov`, `np.corrcoef`, `pd.Series.cov`, `pd.frame.cov`, `pd.Series.corr`, and `pd.frame.corr`: Computation of `cov` and `corr` is implemented in many different ways. `xr.cov` and `xr.corr` now imitate the behavior of the respective `pd.Series` functions. I actually prefer the numpy behavior, resulting in covariance and correlation matrices. However I feel that efficient implementation of this behavior is above my current understanding of xarray. So, I would highly appreciate your support on this implementation! Otherwise, I added the `ddof` parameter to `xr.cov` to imitate the behavior of the respective `pd.Series` functions. I now get the results that `pd.Series` produces. However, I had to set `ddof=1` for `xr.cov` and `ddof=0` for `xr.corr`. Does that make sense?","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557874527,https://api.github.com/repos/pydata/xarray/issues/3550,557874527,MDEyOklzc3VlQ29tbWVudDU1Nzg3NDUyNw==,45787861,2019-11-24T10:11:31Z,2019-11-24T10:15:19Z,NONE,"> For `xarray.corr(array, array, dim='y')`, I think we'd expect an array of the same dimensionality as below, with one point as the correlation of 1.0 (because `{1,2}` is 100% correlated to `{1,2}`) and another `NaN`. Does that make sense? On the one side, I am with you in terms of ""What You Put In Is What You Get Out"", on the other hand `pandas.Series.cov` and `np.cov` have other behaviors that both seem plausible:
![grafik](https://user-images.githubusercontent.com/45787861/69493085-a1342600-0eaa-11ea-991a-f23f4ae348e3.png) ![grafik](https://user-images.githubusercontent.com/45787861/69493094-b3ae5f80-0eaa-11ea-8984-6710e2331882.png)
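For reference, a minimal sketch of the difference the two screenshots above illustrate (toy data assumed, not the exact arrays from the screenshots): `pd.Series.cov` returns a single scalar for the pair of aligned series, while `np.cov` returns the full covariance matrix of the stacked inputs.
```python
import numpy as np
import pandas as pd

a = pd.Series([1.0, 2.0])
b = pd.Series([1.0, 2.0])

# pandas: scalar covariance of the two aligned series
a.cov(b)       # 0.5

# numpy: 2x2 covariance matrix of the stacked inputs
np.cov(a, b)   # array([[0.5, 0.5], [0.5, 0.5]])
```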
So for the moment, we might stick with `pandas.Series.cov`? Or (future PR?) we rather want the `np.cov` behavior, this would require more effort and the `repr` should probably be given as `xarray` instead of `np.array`, right? ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557872510,https://api.github.com/repos/pydata/xarray/issues/3550,557872510,MDEyOklzc3VlQ29tbWVudDU1Nzg3MjUxMA==,45787861,2019-11-24T09:39:30Z,2019-11-24T09:39:30Z,NONE,"> I would start with having the code in the function; i.e. don't abstract too early unless we're really confident on the eventual state. When we find ourselves something multiple times, then we can start pulling it out into more abstractions (e.g. fixtures). Alright! I found myself having very similar tests and therefore abstracted it to test_func. I completely agree with not abstracting too early and but when having the same structure several times. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557829302,https://api.github.com/repos/pydata/xarray/issues/3550,557829302,MDEyOklzc3VlQ29tbWVudDU1NzgyOTMwMg==,45787861,2019-11-23T20:13:45Z,2019-11-23T20:13:45Z,NONE,"> Either way, it would be good to have a clear list of `da_a` and `da_b` and respective `expected` results. I am a bit confused about what we actually would expect for `xarray.DataArray([[1, 2], [np.nan, np.nan]], dims=['x', 'y'])`. Maybe, as a first step some type checking would be enough, to identify such special cases as the one above and give a `ValueError` or `NotImplementedError`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557828920,https://api.github.com/repos/pydata/xarray/issues/3550,557828920,MDEyOklzc3VlQ29tbWVudDU1NzgyODkyMA==,45787861,2019-11-23T20:07:56Z,2019-11-23T20:08:11Z,NONE,"> > > @shoyer good spot, thanks > > @r-beer we could add that as a test case Yes, I will add it. Thinking about test arrays in general: Wouldn't it be good to define some data array fixtures in the `conftest.py` which can then be used in all different `test_function.py` tests? In view of the additional test I started to restructure the `test_cov` and `test_corr` to a single `test_func`. I am now asking myself how much of the data preprocessing shall go to the parametrization and how much remains in the test function... Any guidelines or suggestions? Either way, it would be good to have a clear list of `da_a` and `da_b` and respective `expected` results. I am a bit confused about what we actually would expect for `xarray.DataArray([[1, 2], [np.nan, np.nan]], dims=['x', 'y'])`. 
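For comparison, the pandas analogue of that all-NaN case simply yields NaN rather than raising (a small sketch, assuming pairwise-complete handling as in `pd.Series.cov`):
```python
import numpy as np
import pandas as pd

# no valid pairs remain after dropping NaNs, so the result is simply NaN
pd.Series([1.0, 2.0]).cov(pd.Series([np.nan, np.nan]))   # nan
```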
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557812486,https://api.github.com/repos/pydata/xarray/issues/3550,557812486,MDEyOklzc3VlQ29tbWVudDU1NzgxMjQ4Ng==,45787861,2019-11-23T16:33:58Z,2019-11-23T16:34:39Z,NONE,"Turns out the `assert np.allclose` statements actually failed for `cov`, but not for `corr`: ```bash actual = xr.cov(ts1, ts2) > assert np.allclose(expected, actual) E assert False E + where False = (0.004632253406505826, \narray(0.004323)) E + where = np.allclose ``` 0.00463 vs. 0.00432","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557790321,https://api.github.com/repos/pydata/xarray/issues/3550,557790321,MDEyOklzc3VlQ29tbWVudDU1Nzc5MDMyMQ==,45787861,2019-11-23T11:34:12Z,2019-11-23T11:34:12Z,NONE,"> > > PS: I personally would like to use cov and corr also with dask. Does it also make sense to put this in a future PR, together with the above-mentioned other improvements? > > Further PRs: > > * Add `dot`, `cov` and `corr` to `computation.rst` > > PS: It looks like `xr.dot` is only present in the ""API reference"" and not in https://xarray.pydata.org/en/stable/computation.html. We should add `dot`, `cov` and `corr` to `computation.rst` in a future PR. > > * Test (and if necessary implement) `dask` compatibility to `cov` and `corr` > > * Improve documentation on `How to contribute to the xarray documentation?` > > I think it already gives a good overview, but then it might be unclear how to proceed. > > Maybe it might make sense to add a subsection `How to contribute to the xarray documentation?` in which the different options are shortly described examplary: > > ``` > > 1. change docstring, e.g. .... > > 2. update api.rst, e.g. .... > > 3. update whats-new.rst, e.g. ... > > ``` > > > > > > Then, underline that building the documentation locally is optional: > > ``` > > Optionally, you can build the docs locally as described in the next section. > > ``` Further PRs: - Add dot, cov and corr to computation.rst PS: It looks like xr.dot is only present in the ""API reference"" and not in https://xarray.pydata.org/en/stable/computation.html. We should add dot, cov and corr to computation.rst in a future PR. - Test (and if necessary implement) dask compatibility to cov and corr - Improve documentation on How to contribute to the xarray documentation? I think it already gives a good overview, but then it might be unclear how to proceed. Maybe it might make sense to add a subsection How to contribute to the xarray documentation? in which the different options are shortly described examplary: 1. change docstring, e.g. .... 2. update api.rst, e.g. .... 3. update whats-new.rst, e.g. ... Then, underline that building the documentation locally is optional: Optionally, you can build the docs locally as described in the next section. - Extend xarray functionality, such that pairwise correlations and covariances can be easily calculated. Either write a wrapper function or extend xr.corr and xr.cov, such that the following computation is supported with ease: ```python pairwise_dim = 'space' for region in da_a[pairwise_dim]: print(xr.corr(da_a.sel(space=region.item()), da_a, dim='time').values) ``` Target output would be an xarray with pairwise correlations, e.g.: ```bash [ 1. 
-0.43942827 -0.22425502] [-0.43942827 1. 0.30940081] [-0.22425502 0.30940081 1. ] ```","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557787104,https://api.github.com/repos/pydata/xarray/issues/3550,557787104,MDEyOklzc3VlQ29tbWVudDU1Nzc4NzEwNA==,45787861,2019-11-23T10:45:17Z,2019-11-23T10:45:17Z,NONE,"@dcherian, @max-sixty, @keewis, @Hoeze, second try: any more improvements required?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557781732,https://api.github.com/repos/pydata/xarray/issues/3550,557781732,MDEyOklzc3VlQ29tbWVudDU1Nzc4MTczMg==,45787861,2019-11-23T09:23:46Z,2019-11-23T09:23:46Z,NONE,"> > > > I'm not sure what's going on with the formatting. If you run ` black --version` and `black .` from the root of the repo, what do you get? > > ` black --version`: > > ```shell > black, version 19.10b0 > ``` > > `black .`: > > ```shell > reformatted ...\xarray\xarray\core\dataarray.py > All done! ✨ �🍰✨✨ > 1 file reformatted, 143 files left unchanged. > ``` Now, I additionally get errors from pre-commit claiming that `GIT: unstaged files detected`. Guess, the pre-commit installation was not completely correct?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557777317,https://api.github.com/repos/pydata/xarray/issues/3550,557777317,MDEyOklzc3VlQ29tbWVudDU1Nzc3NzMxNw==,45787861,2019-11-23T08:08:44Z,2019-11-23T08:08:44Z,NONE,"> I'm not sure what's going on with the formatting. If you run ` black --version` and `black .` from the root of the repo, what do you get? ` black --version`: ```bash black, version 19.10b0 ``` `black .`: ```bash reformatted ...\xarray\xarray\core\dataarray.py All done! ✨ �🍰✨✨ 1 file reformatted, 143 files left unchanged. ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557706382,https://api.github.com/repos/pydata/xarray/issues/3550,557706382,MDEyOklzc3VlQ29tbWVudDU1NzcwNjM4Mg==,45787861,2019-11-22T21:46:25Z,2019-11-22T21:46:25Z,NONE,"PS: I personally would like to use cov and corr also with dask. Does it also make sense to put this in a future PR, together with the above-mentioned other improvements? Further PRs: - Add `dot`, `cov` and `corr` to `computation.rst` > PS: It looks like `xr.dot` is only present in the ""API reference"" and not in https://xarray.pydata.org/en/stable/computation.html. We should add `dot`, `cov` and `corr` to `computation.rst` in a future PR. - Test (and if necessary implement) `dask` compatibility to `cov` and `corr` - Improve documentation on `How to contribute to the xarray documentation?` > I think it already gives a good overview, but then it might be unclear how to proceed. > > Maybe it might make sense to add a subsection `How to contribute to the xarray documentation?` in which the different options are shortly described examplary: > > ``` > 1. change docstring, e.g. .... > 2. update api.rst, e.g. .... > 3. update whats-new.rst, e.g. ... 
> ``` > > Then, underline that building the documentation locally is optional: > > ``` > Optionally, you can build the docs locally as described in the next section. > ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557701008,https://api.github.com/repos/pydata/xarray/issues/3550,557701008,MDEyOklzc3VlQ29tbWVudDU1NzcwMTAwOA==,45787861,2019-11-22T21:27:59Z,2019-11-22T21:27:59Z,NONE,"@dcherian, @max-sixty, @keewis, @Hoeze, this is it, is it? Let me know if there are some adjustments to be made. 🙂 And I still get `trailing whitespace errors`, despite having set up pre-commit to run black and flake8 automatically. Do I have to set some parameters to delete the trailing whitespaces?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-557085576,https://api.github.com/repos/pydata/xarray/issues/3550,557085576,MDEyOklzc3VlQ29tbWVudDU1NzA4NTU3Ng==,45787861,2019-11-21T13:31:19Z,2019-11-21T13:31:19Z,NONE,"> > > Computation? Where is `xr.dot` documented? `xr.dot` is implemented in computation, what do you mean by ""documented""?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556991085,https://api.github.com/repos/pydata/xarray/issues/3550,556991085,MDEyOklzc3VlQ29tbWVudDU1Njk5MTA4NQ==,45787861,2019-11-21T09:14:20Z,2019-11-21T09:14:20Z,NONE,PS: accidently closed and commented instead of commenting,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556990852,https://api.github.com/repos/pydata/xarray/issues/3550,556990852,MDEyOklzc3VlQ29tbWVudDU1Njk5MDg1Mg==,45787861,2019-11-21T09:13:44Z,2019-11-21T09:14:04Z,NONE,"> @r-beer let's go with @shoyer 's suggestion to change this from a method to a function `xr.cov` & `xr.corr`. Is that OK? @shoyer, @max-sixty, @keewis, where should these functions be placed? my first guesses: - computation - dataarray","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556517301,https://api.github.com/repos/pydata/xarray/issues/3550,556517301,MDEyOklzc3VlQ29tbWVudDU1NjUxNzMwMQ==,45787861,2019-11-20T22:58:16Z,2019-11-20T22:58:16Z,NONE,"> (lmk if the docs were unclear and we can update) I think it already gives a good overview, but then it might be unclear how to proceed. Maybe it might make sense to add a subsection `How to contribute to the xarray documentation?` in which the different options are shortly described examplary: ``` 1. change docstring, e.g. .... 2. update api.rst, e.g. .... 3. update whats-new.rst, e.g. ... ``` Then, underline that building the documentation locally is optional: ``` Optionally, you can build the docs locally as described in the next section. 
```","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556507107,https://api.github.com/repos/pydata/xarray/issues/3550,556507107,MDEyOklzc3VlQ29tbWVudDU1NjUwNzEwNw==,45787861,2019-11-20T22:48:25Z,2019-11-20T22:48:25Z,NONE,"For the sake of completeness (and to already learn the procedure) I just added some information to `whats-new.rst` and `api.rst` for the methods `da.cov` and `da.corr`. I would be happy for feedback! Tomorrow, I will rewrite the methods to functions and go through all the steps again. Good night from Germany!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556492992,https://api.github.com/repos/pydata/xarray/issues/3550,556492992,MDEyOklzc3VlQ29tbWVudDU1NjQ5Mjk5Mg==,45787861,2019-11-20T22:34:58Z,2019-11-20T22:34:58Z,NONE,"> > > @r-beer let's go with @shoyer 's suggestion to change this from a method to a function `xr.cov` & `xr.corr`. Is that OK? Sure! I actually asked myself a similar software design question in another project. Shouldn't be too much effort I guess? I will tackle it tomorrow. 😴 ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556490349,https://api.github.com/repos/pydata/xarray/issues/3550,556490349,MDEyOklzc3VlQ29tbWVudDU1NjQ5MDM0OQ==,45787861,2019-11-20T22:32:28Z,2019-11-20T22:32:28Z,NONE,"> > > > I am now looking into the generation of the documentation. > > What do you mean by this? You shouldn't have to generate anything locally? Oh, ok. I thought I have to create `xarray-docs` to update `whats-new.rst` and `api.rst`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/issues/3556#issuecomment-556487185,https://api.github.com/repos/pydata/xarray/issues/3556,556487185,MDEyOklzc3VlQ29tbWVudDU1NjQ4NzE4NQ==,45787861,2019-11-20T22:29:30Z,2019-11-20T22:29:30Z,NONE,"PS: Environment creation worked flawlessly. It's a pleasure to get into xarray contribution, thanks to you guys! 😃 ","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,526243198 https://github.com/pydata/xarray/issues/3556#issuecomment-556484204,https://api.github.com/repos/pydata/xarray/issues/3556,556484204,MDEyOklzc3VlQ29tbWVudDU1NjQ4NDIwNA==,45787861,2019-11-20T22:26:42Z,2019-11-20T22:26:42Z,NONE,"> > > @r-beer That's strange, could it be a browser cache issue? I took the screenshot from the stable version of the docs (http://xarray.pydata.org/en/stable/contributing.html#requirements). Indeed, I deleted the cache and the documentation refreshed. Thanks!","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,526243198 https://github.com/pydata/xarray/issues/3556#issuecomment-556476571,https://api.github.com/repos/pydata/xarray/issues/3556,556476571,MDEyOklzc3VlQ29tbWVudDU1NjQ3NjU3MQ==,45787861,2019-11-20T22:19:36Z,2019-11-20T22:19:36Z,NONE,"> > > I think this was already updated in #3410? 
Here's what I see in the contributing guidelines: > > ![](https://camo.githubusercontent.com/b7872888b93c16cc02a4a985b3d93ffcfba3fed4/68747470733a2f2f73637265656e73686f747363646e2e66697265666f7875736572636f6e74656e742e636f6d2f696d616765732f36356565333365662d623933662d346436662d623038322d6264323761376461383263342e706e67) Ok, thanks for the information. I followed the official guidelines, but will double-check with the dev version from now on.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,526243198 https://github.com/pydata/xarray/pull/3550#issuecomment-556473132,https://api.github.com/repos/pydata/xarray/issues/3550,556473132,MDEyOklzc3VlQ29tbWVudDU1NjQ3MzEzMg==,45787861,2019-11-20T22:16:24Z,2019-11-20T22:16:24Z,NONE,"> > > @r-beer your current test case looks fine, no stress on the fixtures! Alright! I am now looking into the generation of the documentation. However, it seems like the contributing guidelines are not up-to-date, as it's not possible to create the `xarray-docs` environment (see #3556).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556404945,https://api.github.com/repos/pydata/xarray/issues/3550,556404945,MDEyOklzc3VlQ29tbWVudDU1NjQwNDk0NQ==,45787861,2019-11-20T21:14:22Z,2019-11-20T21:14:22Z,NONE,"> @max-sixty, I have found the `da(request)` fixture. Do I correctly understand that the tests you mentioned above can be implemented using this fixture and parametrize decorators as in `@pytest.mark.parametrize(""da"", (1, 2), indirect=True)`? I have tried to wrap my head around the elegant use of fixtures and parametrize but didn't get there yet. If you have some time to show a more elegant solution, I would be happy to learn! If not, I guess this should work, too. Are the examples OK like this?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556265324,https://api.github.com/repos/pydata/xarray/issues/3550,556265324,MDEyOklzc3VlQ29tbWVudDU1NjI2NTMyNA==,45787861,2019-11-20T19:13:20Z,2019-11-20T19:13:20Z,NONE,"> > > I'm not sure about the examples in the `examples` folder, I think those are real-world examples? > > What I meant was examples like those in [`DataArray.integrate`](http://xarray.pydata.org/en/stable/generated/xarray.Dataset.integrate.html#xarray.Dataset.integrate) ([source](https://github.com/pydata/xarray/blob/6b70107ab3063187b663290538c0d5a4107dab6e/xarray/core/dataarray.py#L3063-L3110)) that simply demonstrate how to use that function. OK! That simplifies the tasks quite a bit 🙂 I will implement this and the special cases for the tests, I might add Jupyter Notebook examples later to the documentation, when the `dataset.corr` functionality is also implemented.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556248781,https://api.github.com/repos/pydata/xarray/issues/3550,556248781,MDEyOklzc3VlQ29tbWVudDU1NjI0ODc4MQ==,45787861,2019-11-20T18:59:08Z,2019-11-20T18:59:08Z,NONE,"> We could also add some variants which test along different dimensions / all NaNs / other special cases. @max-sixty, I have found the `da(request)` fixture. 
Do I correctly understand that the tests you mentioned above can be implemented using this fixture and parametrize decorators as in `@pytest.mark.parametrize(""da"", (1, 2), indirect=True)`?","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556244906,https://api.github.com/repos/pydata/xarray/issues/3550,556244906,MDEyOklzc3VlQ29tbWVudDU1NjI0NDkwNg==,45787861,2019-11-20T18:55:57Z,2019-11-20T18:55:57Z,NONE,"Concerning the examples, I have started to create a `Jupyter Notebook` in the examples folder. Is this the right way to create the examples? I have seen that the existing examples contain html code. Why is that? Why not only markdown?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556094542,https://api.github.com/repos/pydata/xarray/issues/3550,556094542,MDEyOklzc3VlQ29tbWVudDU1NjA5NDU0Mg==,45787861,2019-11-20T16:48:16Z,2019-11-20T16:48:55Z,NONE,"> Test looks good. Let's make those adjustments you put in the `TODO` and add one for the other function? @max-sixty, do you mean to add a TODO to the function mentioned in this commit: https://github.com/pydata/xarray/pull/3550/commits/d57eb7cfaf1407179080f0ca95e7951e5622b493? Here, I think the `@requires_scipy_or_netCDF4` is actually necessary: ```python @requires_scipy_or_netCDF4 @requires_dask @pytest.mark.filterwarnings(""ignore:use make_scale(name) instead"") def test_open_mfdataset_manyfiles( ```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556050094,https://api.github.com/repos/pydata/xarray/issues/3550,556050094,MDEyOklzc3VlQ29tbWVudDU1NjA1MDA5NA==,45787861,2019-11-20T15:19:16Z,2019-11-20T15:19:16Z,NONE,"> > > Great start! > > Thanks for the review @keewis > > I think it's fine to leave `Dataset` for another PR; it'd be great to get this in, and given it's your first PR (welcome!) let's not bloat it > > Test looks good. Let's make those adjustments you put in the `TODO` and add one for the other function? We could also add some variants which test along different dimensions / all NaNs / other special cases. Thank you two, @max-sixty and @keewis, for the constructive feedback that made the start easier! I would also be happy to go once through the whole process and then implement the PR for the ds separately. This will probably cut the learning curve and accelerate the process. Additionally, I wonder whether I have to configure black for the spaces between colons? I had autoformatting running with autopep8 previously but since one of the last VS code updates something broke so I will set this up properly again on Friday or tomorrow. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-556010240,https://api.github.com/repos/pydata/xarray/issues/3550,556010240,MDEyOklzc3VlQ29tbWVudDU1NjAxMDI0MA==,45787861,2019-11-20T13:47:24Z,2019-11-20T13:47:24Z,NONE,"> > > since the APIs of `DataArray` and `Dataset` are usually kept in sync as much as possible, it might be good to add the new methods to `Dataset`, too. 
This doesn't have to be in this PR, though, and I'd wait for confirmation first. Yes, the functionality that I would be very interested in would be to calculate `corr()` also between different variables of a dataset as in [pandas.DataFrame.corr](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.corr.html) as well as correlations between different locations within the same `DataArray` (e.g. correlations of time series with different latitudes).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-555999418,https://api.github.com/repos/pydata/xarray/issues/3550,555999418,MDEyOklzc3VlQ29tbWVudDU1NTk5OTQxOA==,45787861,2019-11-20T13:17:37Z,2019-11-20T13:17:37Z,NONE,"> > > for the new commit, yes. The older commits could be rewritten if you don't like the other email address being public (I guide you through the process if you want) but since PRs get squashed after the merge you don't have to. OK, no need for the extra effort at this point, the squashing at the end of PR is fine for me. Thank you for the offer, anyway! 🙂 Are any other changes necessary for the PR? ideas: - examples for corr and cov usage - whats-new.rst","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-555995427,https://api.github.com/repos/pydata/xarray/issues/3550,555995427,MDEyOklzc3VlQ29tbWVudDU1NTk5NTQyNw==,45787861,2019-11-20T13:06:32Z,2019-11-20T13:06:32Z,NONE,"> > > There seems to be a problem with your email address (github can't match it to your account). Does it match the one you used for signing up on github? Dear @keewis, I just set my real email address also as public email address and set my local git user name and email address accordingly. Does this solve the problem?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/pull/3550#issuecomment-555966566,https://api.github.com/repos/pydata/xarray/issues/3550,555966566,MDEyOklzc3VlQ29tbWVudDU1NTk2NjU2Ng==,45787861,2019-11-20T11:39:11Z,2019-11-20T11:39:11Z,NONE,"@max-sixty, all three checks fail because `test_open_mfdataset_manyfiles` requires either netcdf4 or scipy. Shall I, therefore, add the `@requires_scipy_or_netCDF4` decorator?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525685973 https://github.com/pydata/xarray/issues/3549#issuecomment-555935575,https://api.github.com/repos/pydata/xarray/issues/3549,555935575,MDEyOklzc3VlQ29tbWVudDU1NTkzNTU3NQ==,45787861,2019-11-20T10:12:57Z,2019-11-20T10:12:57Z,NONE,Thanks for the fast response @crusaderky! ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,525642358 https://github.com/pydata/xarray/pull/2652#issuecomment-555915343,https://api.github.com/repos/pydata/xarray/issues/2652,555915343,MDEyOklzc3VlQ29tbWVudDU1NTkxNTM0Mw==,45787861,2019-11-20T09:20:39Z,2019-11-20T09:30:31Z,NONE,"Alright, I have done so and changed `basestrings` into `str`, now the tests run (mostly) through locally. 
```bash 1 failed, 6539 passed, 1952 skipped, 37 xfailed, 31 warnings in 86.34s ``` **A general question concerning collaboration on existing PRs:** Should I push the changes to my fork and then create a new PR that replaces @hrishikeshac's PR or shall I push to @hrishikeshac's fork and see whether it works to update the existing PR? Or is there another option? PS: Permission to push to hrishikeshac:corr is denied for me. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183 https://github.com/pydata/xarray/pull/2652#issuecomment-555745623,https://api.github.com/repos/pydata/xarray/issues/2652,555745623,MDEyOklzc3VlQ29tbWVudDU1NTc0NTYyMw==,45787861,2019-11-19T22:27:10Z,2019-11-19T23:00:31Z,NONE,"Alright, I only got two merge conflicts in dataarray.py: minor merge conflict concerning imports: 1. accessors -> accessors_td 2. broadcast has been dropped in master?
```python
<<<<<<< HEAD
from . import (
    computation,
    dtypes,
    groupby,
    indexing,
    ops,
    pdcompat,
    resample,
    rolling,
    utils,
)
from .accessor_dt import DatetimeAccessor
from .accessor_str import StringAccessor
from .alignment import (
    _broadcast_helper,
    _get_broadcast_dims_map_common_coords,
    align,
    reindex_like_indexers,
)
=======
from .accessors import DatetimeAccessor
from .alignment import align, reindex_like_indexers, broadcast
>>>>>>> added da.corr() and da.cov() to dataarray.py. Test added in test_dataarray.py, and tested using pytest.
```
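One possible way to resolve this first conflict (a sketch only, assuming `broadcast` is still importable from `xarray.core.alignment` on current master) is to keep the import block from master and add the single name the corr branch needs:
```python
# keep the master-side imports and add broadcast for cov()/corr()
from .accessor_dt import DatetimeAccessor
from .accessor_str import StringAccessor
from .alignment import (
    _broadcast_helper,
    _get_broadcast_dims_map_common_coords,
    align,
    broadcast,
    reindex_like_indexers,
)
```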
Secondly, there are some bigger merge conflicts concerning some of dataarray's methods, but the two sides don't actually seem to conflict with each other: 1. `integrate()`, `unify_chunks()` and `map_blocks` added in master 2. `cov()` and `corr()` added in the corr branch 3. It seems like `str = property(StringAccessor)` should be at the end of dataarray's definition, for mypy reasons...
``` <<<<<<< HEAD def integrate( self, dim: Union[Hashable, Sequence[Hashable]], datetime_unit: str = None ) -> ""DataArray"": """""" integrate the array with the trapezoidal rule. .. note:: This feature is limited to simple cartesian geometry, i.e. dim must be one dimensional. Parameters ---------- dim: hashable, or a sequence of hashable Coordinate(s) used for the integration. datetime_unit: str, optional Can be used to specify the unit if datetime coordinate is used. One of {'Y', 'M', 'W', 'D', 'h', 'm', 's', 'ms', 'us', 'ns', 'ps', 'fs', 'as'} Returns ------- integrated: DataArray See also -------- numpy.trapz: corresponding numpy function Examples -------- >>> da = xr.DataArray(np.arange(12).reshape(4, 3), dims=['x', 'y'], ... coords={'x': [0, 0.1, 1.1, 1.2]}) >>> da array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 9, 10, 11]]) Coordinates: * x (x) float64 0.0 0.1 1.1 1.2 Dimensions without coordinates: y >>> >>> da.integrate('x') array([5.4, 6.6, 7.8]) Dimensions without coordinates: y """""" ds = self._to_temp_dataset().integrate(dim, datetime_unit) return self._from_temp_dataset(ds) def unify_chunks(self) -> ""DataArray"": """""" Unify chunk size along all chunked dimensions of this DataArray. Returns ------- DataArray with consistent chunk sizes for all dask-array variables See Also -------- dask.array.core.unify_chunks """""" ds = self._to_temp_dataset().unify_chunks() return self._from_temp_dataset(ds) def map_blocks( self, func: ""Callable[..., T_DSorDA]"", args: Sequence[Any] = (), kwargs: Mapping[str, Any] = None, ) -> ""T_DSorDA"": """""" Apply a function to each chunk of this DataArray. This method is experimental and its signature may change. Parameters ---------- func: callable User-provided function that accepts a DataArray as its first parameter. The function will receive a subset of this DataArray, corresponding to one chunk along each chunked dimension. ``func`` will be executed as ``func(obj_subset, *args, **kwargs)``. The function will be first run on mocked-up data, that looks like this array but has sizes 0, to determine properties of the returned object such as dtype, variable names, new dimensions and new indexes (if any). This function must return either a single DataArray or a single Dataset. This function cannot change size of existing dimensions, or add new chunked dimensions. args: Sequence Passed verbatim to func after unpacking, after the sliced DataArray. xarray objects, if any, will not be split by chunks. Passing dask collections is not allowed. kwargs: Mapping Passed verbatim to func after unpacking. xarray objects, if any, will not be split by chunks. Passing dask collections is not allowed. Returns ------- A single DataArray or Dataset with dask backend, reassembled from the outputs of the function. Notes ----- This method is designed for when one needs to manipulate a whole xarray object within each chunk. In the more common case where one can work on numpy arrays, it is recommended to use apply_ufunc. If none of the variables in this DataArray is backed by dask, calling this method is equivalent to calling ``func(self, *args, **kwargs)``. 
See Also -------- dask.array.map_blocks, xarray.apply_ufunc, xarray.map_blocks, xarray.Dataset.map_blocks """""" from .parallel import map_blocks return map_blocks(func, self, args, kwargs) # this needs to be at the end, or mypy will confuse with `str` # https://mypy.readthedocs.io/en/latest/common_issues.html#dealing-with-conflicting-names str = property(StringAccessor) ======= def cov(self, other, dim = None): """"""Compute covariance between two DataArray objects along a shared dimension. Parameters ---------- other: DataArray The other array with which the covariance will be computed dim: The dimension along which the covariance will be computed Returns ------- covariance: DataArray """""" # 1. Broadcast the two arrays self, other = broadcast(self, other) # 2. Ignore the nans valid_values = self.notnull() & other.notnull() self = self.where(valid_values, drop=True) other = other.where(valid_values, drop=True) valid_count = valid_values.sum(dim) #3. Compute mean and standard deviation along the given dim demeaned_self = self - self.mean(dim = dim) demeaned_other = other - other.mean(dim = dim) #4. Compute covariance along the given dim cov = (demeaned_self*demeaned_other).sum(dim=dim)/(valid_count) return cov def corr(self, other, dim = None): """"""Compute correlation between two DataArray objects along a shared dimension. Parameters ---------- other: DataArray The other array with which the correlation will be computed dim: The dimension along which the correlation will be computed Returns ------- correlation: DataArray """""" # 1. Broadcast the two arrays self, other = broadcast(self, other) # 2. Ignore the nans valid_values = self.notnull() & other.notnull() self = self.where(valid_values, drop=True) other = other.where(valid_values, drop=True) # 3. Compute correlation based on standard deviations and cov() self_std = self.std(dim=dim) other_std = other.std(dim=dim) return self.cov(other, dim = dim)/(self_std*other_std) >>>>>>> added da.corr() and da.cov() to dataarray.py. Test added in test_dataarray.py, and tested using pytest. ```
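Since the plan discussed above is to expose these as free functions rather than methods, the same logic could be lifted out of `DataArray` into module-level helpers. Below is a rough, self-contained sketch of what that could look like using only public xarray API (the signature `cov(da_a, da_b, dim=None, ddof=1)` and the ddof choices are assumptions based on this discussion, not the merged implementation):
```python
import xarray as xr


def cov(da_a, da_b, dim=None, ddof=1):
    # pairwise-complete observations, mirroring pd.Series.cov
    da_a, da_b = xr.broadcast(da_a, da_b)
    valid = da_a.notnull() & da_b.notnull()
    da_a = da_a.where(valid)
    da_b = da_b.where(valid)
    valid_count = valid.sum(dim) - ddof
    demeaned_a = da_a - da_a.mean(dim=dim)
    demeaned_b = da_b - da_b.mean(dim=dim)
    return (demeaned_a * demeaned_b).sum(dim=dim, skipna=True) / valid_count


def corr(da_a, da_b, dim=None):
    # normalise the ddof=0 covariance by the (ddof=0) standard deviations,
    # both computed on the same pairwise-complete mask
    da_a, da_b = xr.broadcast(da_a, da_b)
    valid = da_a.notnull() & da_b.notnull()
    da_a = da_a.where(valid)
    da_b = da_b.where(valid)
    return cov(da_a, da_b, dim=dim, ddof=0) / (da_a.std(dim=dim) * da_b.std(dim=dim))
```
For example, `corr(ts1, ts2, dim='time')` would then give per-location correlations between two time-series arrays.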
Can you please comment my suggested changes (accepting either changes from master or both, if no conflicts).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183 https://github.com/pydata/xarray/pull/2652#issuecomment-555737000,https://api.github.com/repos/pydata/xarray/issues/2652,555737000,MDEyOklzc3VlQ29tbWVudDU1NTczNzAwMA==,45787861,2019-11-19T22:03:46Z,2019-11-19T22:03:46Z,NONE,"> > > @max-sixty, thanks for the fast response! > > Yeah, I get the traceback and already started diving into it. However, I assumed that @hrishikeshac's branch ""corr"" wasn't up-to-date. > > Shall I merge changes from master or develop into corr, before looking further into the tests? I read http://xarray.pydata.org/en/stable/contributing.html, is this identical to contributing.rst? Following those guidelines, I would use the following commands to ""retrieve the changes from the master branch"": ``` git fetch upstream git rebase upstream/master ``` Where upstream = https://github.com/pydata/xarray.git?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183 https://github.com/pydata/xarray/pull/2652#issuecomment-555734331,https://api.github.com/repos/pydata/xarray/issues/2652,555734331,MDEyOklzc3VlQ29tbWVudDU1NTczNDMzMQ==,45787861,2019-11-19T21:57:02Z,2019-11-19T21:57:02Z,NONE,"@max-sixty, thanks for the fast response! Yeah, I get the traceback and already started diving into it. However, I assumed that @hrishikeshac's branch ""corr"" wasn't up-to-date. Shall I merge changes from master or develop into corr, before looking further into the tests?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183 https://github.com/pydata/xarray/pull/2652#issuecomment-555730897,https://api.github.com/repos/pydata/xarray/issues/2652,555730897,MDEyOklzc3VlQ29tbWVudDU1NTczMDg5Nw==,45787861,2019-11-19T21:48:07Z,2019-11-19T21:48:07Z,NONE,"Dear @Hoeze, I will (try to) finalize this merge request, as I am also very interested in this functionality. I am new to xarray and contribution. I downloaded @hrishikeshac's code and ran the pytest tests locally. I get `17 failed, 2088 passed, 2896 skipped, 7 xfailed, 754 warnings in 49.95s`. Is there an elegant way to share ""which tests failed where"" in order to avoid that I try to fix tests, that might already have been fixed in other branches? I will already start to get a better understanding of why the tests fail and try to fix them in the meantime. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,396102183 https://github.com/pydata/xarray/issues/1115#issuecomment-555726775,https://api.github.com/repos/pydata/xarray/issues/1115,555726775,MDEyOklzc3VlQ29tbWVudDU1NTcyNjc3NQ==,45787861,2019-11-19T21:36:42Z,2019-11-19T21:36:42Z,NONE,"> > > @r-beer would be great to finish this off! I think this would be a popular feature. You could take @hrishikeshac 's code (which is close!) and make the final changes. OK, that means to make #2652 pass, right? I downloaded the respective branch from @hrishikeshac, and ran the tests locally. See respective discussion in #2652. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339 https://github.com/pydata/xarray/issues/1115#issuecomment-555376229,https://api.github.com/repos/pydata/xarray/issues/1115,555376229,MDEyOklzc3VlQ29tbWVudDU1NTM3NjIyOQ==,45787861,2019-11-19T07:44:23Z,2019-11-19T07:45:26Z,NONE,"I am also highly interested in this function and in contributing to xarray in general! If I understand correctly, https://github.com/pydata/xarray/pull/2350 and https://github.com/pydata/xarray/pull/2652 do not solve this PR, do they? How can I help you finishing these PRs?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,188996339