issue_comments

12 rows where author_association = "MEMBER", issue = 525685973 and user = 5635139 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
633652253 https://github.com/pydata/xarray/pull/3550#issuecomment-633652253 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDYzMzY1MjI1Mw== max-sixty 5635139 2020-05-25T16:57:04Z 2020-05-25T16:57:04Z MEMBER

@r-beer in case you come back to see this: thank you for taking it so far; your code was helpful in eventually getting this feature in. And we'd of course appreciate any additional contributions.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
557833141 https://github.com/pydata/xarray/pull/3550#issuecomment-557833141 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NzgzMzE0MQ== max-sixty 5635139 2019-11-23T21:10:37Z 2019-11-23T21:10:37Z MEMBER

> Either way, it would be good to have a clear list of da_a and da_b and their respective expected results. I am a bit confused about what we would actually expect for xarray.DataArray([[1, 2], [np.nan, np.nan]], dims=['x', 'y']).

For xarray.corr(array, array, dim='y'), I think we'd expect an array of the same dimensionality as below, with one point being the correlation of 1.0 (because {1, 2} is 100% correlated with {1, 2}) and the other NaN. Does that make sense? (A sketch of this case appears below.)

In [17]: xr.dot(da, da, dims='y')
Out[17]:
<xarray.DataArray (x: 2)>
array([ 5., nan])
Dimensions without coordinates: x

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
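
Below is a minimal sketch of the NaN case discussed in the comment above. It assumes the xr.corr signature that was eventually merged (xr.corr(da_a, da_b, dim=None)); it is illustrative only and not code from the PR.

import numpy as np
import xarray as xr

# The example array from the discussion: one valid row and one all-NaN row.
da = xr.DataArray([[1, 2], [np.nan, np.nan]], dims=["x", "y"])

# Correlating the array with itself along "y": the row [1, 2] is perfectly
# correlated with itself (1.0), while the all-NaN row yields NaN, so the
# expected result is roughly array([1., nan]) along dimension "x".
result = xr.corr(da, da, dim="y")
print(result)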
557832407 https://github.com/pydata/xarray/pull/3550#issuecomment-557832407 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NzgzMjQwNw== max-sixty 5635139 2019-11-23T21:00:12Z 2019-11-23T21:00:12Z MEMBER

> > @shoyer good spot, thanks
> > @r-beer we could add that as a test case

> Yes, I will add it.

> Thinking about test arrays in general: wouldn't it be good to define some data array fixtures in conftest.py, which can then be used across all the different test_function.py tests?

I think we could do more of this, yes. It's a tradeoff between (a) better sharing & performance and (b) customization for each function & localization of effects. We have def create_test_data in a few files from before we used pytest, so we could convert some of those usages to fixtures. @shoyer and I discussed this briefly in the context of @keewis's work on tests for units. I think @shoyer is keener on having fixture-like code close to the test functions.

> In view of the additional test, I started restructuring test_cov and test_corr into a single test_func. I am now asking myself how much of the data preprocessing should go into the parametrization and how much should remain in the test function...

> Any guidelines or suggestions?

I would start by having the code in the function; i.e. don't abstract too early unless we're really confident in the eventual state. When we find ourselves writing the same thing multiple times, we can start pulling it out into more abstractions (e.g. fixtures). (A sketch of this approach appears below.)

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
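
As a concrete illustration of the "keep the code in the test function" suggestion above, a single parametrized test might look like the following. The names (test_func, the array shapes, the assertions) are hypothetical and not taken from the PR.

import numpy as np
import pytest
import xarray as xr

# Hypothetical sketch: one parametrized test covering both functions, with
# the test data built inline rather than in a shared conftest.py fixture.
@pytest.mark.parametrize("func", [xr.cov, xr.corr])
@pytest.mark.parametrize("dim", [None, "x", "y"])
def test_func(func, dim):
    da_a = xr.DataArray(np.random.randn(3, 4), dims=["x", "y"])
    da_b = xr.DataArray(np.random.randn(3, 4), dims=["x", "y"])

    result = func(da_a, da_b, dim=dim)

    # A real test would compare against pandas/numpy reference values; here
    # we only check that the reduced dimension is gone and the output is
    # finite for NaN-free input.
    assert dim not in result.dims
    assert np.isfinite(result).all()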
557824109 https://github.com/pydata/xarray/pull/3550#issuecomment-557824109 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NzgyNDEwOQ== max-sixty 5635139 2019-11-23T19:02:53Z 2019-11-23T19:02:53Z MEMBER

@shoyer good spot, thanks

@r-beer we could add that as a test case

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
557808929 https://github.com/pydata/xarray/pull/3550#issuecomment-557808929 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NzgwODkyOQ== max-sixty 5635139 2019-11-23T15:51:48Z 2019-11-23T15:51:48Z MEMBER

@r-beer great list of future PRs!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
557807868 https://github.com/pydata/xarray/pull/3550#issuecomment-557807868 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NzgwNzg2OA== max-sixty 5635139 2019-11-23T15:38:17Z 2019-11-23T15:38:17Z MEMBER

@shoyer take a glance if you get a second. I'm deliberately pushing this through even though https://github.com/pydata/xarray/pull/3550/files#diff-293280a1ed0154c9fb35b842f69e7e33R1112 isn't resolved, since I think these are important features which have dragged for years, and we can iterate on the performance (and @r-beer has done a great job on his first PR!)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
556492208 https://github.com/pydata/xarray/pull/3550#issuecomment-556492208 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NjQ5MjIwOA== max-sixty 5635139 2019-11-20T22:34:14Z 2019-11-20T22:34:14Z MEMBER

> Oh, ok. I thought I had to create xarray-docs to update whats-new.rst and api.rst?

You can just edit the files directly! No need to build all the docs. It should be much simpler than the work you've already done!

(lmk if the docs were unclear and we can update)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
556477633 https://github.com/pydata/xarray/pull/3550#issuecomment-556477633 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NjQ3NzYzMw== max-sixty 5635139 2019-11-20T22:20:35Z 2019-11-20T22:20:35Z MEMBER

@r-beer let's go with @shoyer's suggestion to change this from a method to the functions xr.cov & xr.corr. Is that OK? (Both call styles are sketched below.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
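
For reference, the two call styles under discussion look like this; the top-level function form is the one this thread settled on. The arrays here are placeholders.

import numpy as np
import xarray as xr

da1 = xr.DataArray(np.random.randn(5), dims="t")
da2 = xr.DataArray(np.random.randn(5), dims="t")

# Function form (what was adopted): top-level xr.cov / xr.corr.
xr.cov(da1, da2)
xr.corr(da1, da2)

# Method form (the alternative discussed): da1.cov(da2) / da1.corr(da2).
# Not pursued in this PR.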
556476893 https://github.com/pydata/xarray/pull/3550#issuecomment-556476893 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NjQ3Njg5Mw== max-sixty 5635139 2019-11-20T22:19:54Z 2019-11-20T22:19:54Z MEMBER

> I am now looking into generating the documentation.

What do you mean by this? You shouldn't have to generate anything locally?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
556415424 https://github.com/pydata/xarray/pull/3550#issuecomment-556415424 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NjQxNTQyNA== max-sixty 5635139 2019-11-20T21:23:39Z 2019-11-20T21:23:39Z MEMBER

@r-beer your current test case looks fine, no stress on the fixtures!

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
556413367 https://github.com/pydata/xarray/pull/3550#issuecomment-556413367 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NjQxMzM2Nw== max-sixty 5635139 2019-11-20T21:21:46Z 2019-11-20T21:21:46Z MEMBER

One question we started discussing on the old PR, which we should resolve ASAP given @r-beer's work: do we want xr.cov(da1, da2) or da1.cov(da2)?

CC @fujiisoup @shoyer

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973
556045623 https://github.com/pydata/xarray/pull/3550#issuecomment-556045623 https://api.github.com/repos/pydata/xarray/issues/3550 MDEyOklzc3VlQ29tbWVudDU1NjA0NTYyMw== max-sixty 5635139 2019-11-20T15:10:41Z 2019-11-20T15:10:41Z MEMBER

Great start!

Thanks for the review @keewis

I think it's fine to leave Dataset for another PR; it'd be great to get this in, and given it's your first PR (welcome!) let's not bloat it.

The test looks good. Let's make those adjustments you put in the TODO and add one for the other function? We could also add some variants that test along different dimensions, all-NaN inputs, and other special cases.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  cov() and corr() - finalization 525685973

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);