issue_comments

4 rows where issue = 1076265104 and user = 488992 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1043334868 https://github.com/pydata/xarray/pull/6059#issuecomment-1043334868 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X84-MAbU cjauvin 488992 2022-02-17T19:27:23Z 2022-02-17T19:27:23Z CONTRIBUTOR

I have added a test verifying that, with equal weights, each interpolation method this PR supports yields the same results as np.quantile with the corresponding method. The test is skipped for now, however, because the method argument is not currently exposed in the API (ideally it would be in future work).
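
The equal-weights check described above can be sketched with NumPy alone. This is a minimal sketch, not the PR's code: `weighted_quantile_1d` and `H_FUNCS` are hypothetical names, the `h` formulas are the ones tabulated for types 4 to 9 on the Wikipedia page linked in this thread, and the `method` keyword of `np.quantile` is assumed to be available (NumPy 1.22 or later).

```python
import numpy as np

# One-based interpolation parameter h for quantile types 4 to 9.
# The keys reuse the method names accepted by np.quantile.
H_FUNCS = {
    "interpolated_inverted_cdf": lambda n, q: n * q,          # type 4
    "hazen": lambda n, q: n * q + 0.5,                        # type 5
    "weibull": lambda n, q: (n + 1) * q,                      # type 6
    "linear": lambda n, q: (n - 1) * q + 1,                   # type 7
    "median_unbiased": lambda n, q: (n + 1 / 3) * q + 1 / 3,  # type 8
    "normal_unbiased": lambda n, q: (n + 1 / 4) * q + 3 / 8,  # type 9
}


def weighted_quantile_1d(data, weights, q, method="linear"):
    """Hypothetical helper: weighted quantile(s) of a 1-D sample."""
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(data)
    data, weights = data[order], weights[order]

    # Kish's effective sample size stands in for the plain sample count.
    n = weights.sum() ** 2 / (weights**2).sum()
    cum = np.append(0.0, np.cumsum(weights / weights.sum()))

    # One row per requested probability; h is kept inside [1, n].
    q = np.atleast_2d(np.asarray(q, dtype=float)).T
    h = np.clip(H_FUNCS[method](n, q), 1, n)

    # Share of each sorted sample inside the window [(h - 1) / n, h / n].
    u = np.maximum((h - 1) / n, np.minimum(h / n, cum))
    w = np.diff(u * n - h + 1)
    return (data * w).sum(axis=1)


data = [3.0, 1.0, 4.0, 1.5, 9.0, 2.6]
q = [0.1, 0.25, 0.5, 0.9]
for method in H_FUNCS:
    ours = weighted_quantile_1d(data, np.ones(len(data)), q, method)
    assert np.allclose(ours, np.quantile(data, q, method=method)), method
```

With equal weights the effective sample size equals the sample count, so each row of `w` reduces to the usual two-point linear interpolation and the comparison with `np.quantile` holds for every method in the table.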

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
1011136209 https://github.com/pydata/xarray/pull/6059#issuecomment-1011136209 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X848RLbR cjauvin 488992 2022-01-12T15:03:31Z 2022-01-12T15:03:31Z CONTRIBUTOR

@huard's latest commit modifies the algorithm to use Kish's effective sample size, as described in the blog post the algorithm comes from: https://aakinshin.net/posts/kish-ess-weighted-quantiles/. This seems to solve the problem mentioned by @mathause.

He also adds support for interpolation types 4 to 9 (those that share a common way of computing Qp, as described here: https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample).
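
For reference, Kish's effective sample size is the squared sum of the weights divided by the sum of the squared weights. A minimal sketch follows; the function name is hypothetical and not taken from the PR.

```python
def kish_effective_sample_size(weights):
    """Kish's effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total**2 / sum(w * w for w in weights)


print(kish_effective_sample_size([1, 1, 1, 1]))  # 4.0
print(kish_effective_sample_size([1, 0, 1, 0]))  # 2.0
print(kish_effective_sample_size([2, 0, 2, 0]))  # 2.0
print(kish_effective_sample_size([3, 1]))        # 1.6
```

The value is unchanged when all weights are rescaled, and zero-weight samples do not count toward it, which is the property that matters for the zero-weight example discussed in this thread.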

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
992713850 https://github.com/pydata/xarray/pull/6059#issuecomment-992713850 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X847K5x6 cjauvin 488992 2021-12-13T17:39:03Z 2021-12-13T17:39:03Z CONTRIBUTOR

@mathause About this:

> I did some tries and got an unexpected result:
>
> ```python
> data = xr.DataArray([0, 1, 2, 3])
> weights = xr.DataArray([1, 0, 1, 0])
> data.weighted(weights).quantile([0.75])
>
> np.quantile([0, 2], 0.75)
> ```
>
> Can you double-check? Or do I misunderstand something?

My latest commit should fix (and test) this.
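
To make the expected behaviour concrete, here is a NumPy-only sketch of a type 7 weighted quantile applied to the quoted example. The helper name and the `use_kish` flag are hypothetical; the `use_kish=False` branch is there only for contrast and is not claimed to match any revision of the PR.

```python
import numpy as np


def weighted_quantile_type7(data, weights, q, use_kish=True):
    """Hypothetical helper: type 7 weighted quantile of a 1-D sample."""
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(data)
    data, weights = data[order], weights[order]

    if use_kish:
        # Kish's effective sample size ignores zero-weight samples.
        n = weights.sum() ** 2 / (weights**2).sum()
    else:
        # Plain sample count, zero-weight samples included.
        n = float(len(data))
    cum = np.append(0.0, np.cumsum(weights / weights.sum()))

    h = (n - 1) * q + 1
    u = np.maximum((h - 1) / n, np.minimum(h / n, cum))
    w = np.diff(u * n - h + 1)
    return float((data * w).sum())


data, weights = [0, 1, 2, 3], [1, 0, 1, 0]
print(weighted_quantile_type7(data, weights, 0.75))                  # 1.5
print(weighted_quantile_type7(data, weights, 0.75, use_kish=False))  # 2.0
print(np.quantile([0, 2], 0.75))                                     # 1.5
```

With the effective sample size, the two zero-weight points drop out and the result agrees with `np.quantile` on the remaining points.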

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
991714854 https://github.com/pydata/xarray/pull/6059#issuecomment-991714854 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X847HF4m cjauvin 488992 2021-12-11T17:08:55Z 2021-12-11T17:08:55Z CONTRIBUTOR

@mathause Thanks for the many excellent suggestions! After having removed the for loop the way you suggested, I tried to address this:

> The algorithm is quite clever but it multiplies all elements (except 2) with 0 - this could maybe be sped up by only using the relevant elements.

At first I thought that something like this could work:

```python
w = np.diff(v)
nz = np.nonzero(w)
d2 = np.tile(data, (n, 1))
r = w[nz] * d2[nz]
r = r[::2] + r[1::2]
```

The problem, however, is that the rows of w sometimes have only one nonzero element instead of two, which happens when an h coincides exactly with a weight value instead of lying between two. Given that difficulty, my impression is that this is not really solvable, or at least not in a way that would result in a more efficient version.
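
The one-versus-two situation can be reproduced with a small sketch. It assumes type 7 interpolation and equal weights, and `interpolation_weights` is a hypothetical helper rather than a function from the PR.

```python
import numpy as np


def interpolation_weights(weights, q):
    """Hypothetical helper: one row of interpolation weights per probability.

    Assumes the data the weights belong to are already sorted.
    """
    weights = np.asarray(weights, dtype=float)
    n = weights.sum() ** 2 / (weights**2).sum()  # Kish's effective sample size
    cum = np.append(0.0, np.cumsum(weights / weights.sum()))
    q = np.atleast_2d(np.asarray(q, dtype=float)).T
    h = (n - 1) * q + 1  # type 7
    u = np.maximum((h - 1) / n, np.minimum(h / n, cum))
    return np.diff(u * n - h + 1)


data = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
w = interpolation_weights(np.ones(5), [0.25, 0.4])

# Count the entries per row that are not (numerically) zero.
nonzero_per_row = (~np.isclose(w, 0.0)).sum(axis=1)
print(nonzero_per_row)         # [1 2]
print((data * w).sum(axis=1))  # approximately [20. 26.]
```

For q = 0.25, h is exactly 2, so a single sample carries all the weight. For q = 0.4, h lies between 2 and 3, so two samples share it. Because `np.nonzero` returns the selected entries of all rows as one flat sequence, a row with a single entry shifts the pairing that `r[::2] + r[1::2]` relies on.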

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);