home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

12 rows where author_association = "CONTRIBUTOR" and issue = 1076265104 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 3

  • huard 6
  • cjauvin 4
  • bzah 2

issue 1

  • Weighted quantile · 12 ✖

author_association 1

  • CONTRIBUTOR · 12 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1047254700 https://github.com/pydata/xarray/pull/6059#issuecomment-1047254700 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X84-a9as huard 81219 2022-02-21T21:59:22Z 2022-02-21T21:59:22Z CONTRIBUTOR

The alignment of data and weights is not done automatically. So I agree this would be ideal, but I'll need some guidance to make it happen without the align call.

This is the error I get python ValueError: operand to apply_ufunc has required core dimensions ['dim_0', 'dim_1'], but some of these dimensions are absent on an input variable: ['dim_1']

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
1043394963 https://github.com/pydata/xarray/pull/6059#issuecomment-1043394963 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X84-MPGT huard 81219 2022-02-17T20:26:50Z 2022-02-17T20:26:50Z CONTRIBUTOR

@mathause I'm not 100% confident the methods other than linear work as expected, so I suggest we do not expose method until a robust test suite can confirm results are valid.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
1043334868 https://github.com/pydata/xarray/pull/6059#issuecomment-1043334868 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X84-MAbU cjauvin 488992 2022-02-17T19:27:23Z 2022-02-17T19:27:23Z CONTRIBUTOR

I have added a test to verify that using equal weights with the different interpolation methods that this PR supports would work (i.e. would yield the same results as np.quantile, with the corresponding methods). It is skipped however because the method argument is not currently exposed in the API (it would be in future work, ideally).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
1035083891 https://github.com/pydata/xarray/pull/6059#issuecomment-1035083891 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X849siBz huard 81219 2022-02-10T15:52:34Z 2022-02-10T15:52:34Z CONTRIBUTOR

We have local changes that expose method in the public API, and compare results against numpy for the case with uniform weights to confirm results are identical.

Do you want us to push those changes here ?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
1033607850 https://github.com/pydata/xarray/pull/6059#issuecomment-1033607850 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X849m5qq bzah 16700639 2022-02-09T10:30:40Z 2022-02-09T10:40:08Z CONTRIBUTOR

@mathause This PR goes beyond what is currently implemented in numpy. For now, all weighted quantiles PR on numpy are more or less based on "linear" method (method 7) and none have been merged. I plan to work on integrating weights with the other interpolation methods but don't have the time right now. I'll probably pick some ideas from here.

As for the numerics here, everything looks good to me. The only limitations I can see are: - This only handles sampling weights, which is fine I guess. - Some interpolation methods are missing, they can be added later. - ~A nan_weighted_quantile could also be interesting to add~

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
1032858964 https://github.com/pydata/xarray/pull/6059#issuecomment-1032858964 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X849kC1U huard 81219 2022-02-08T17:13:14Z 2022-02-08T17:13:14Z CONTRIBUTOR

Correct, it's not exposed yet because I don't have the bandwidth to create tests for all the different methods.

The equal weight case could be tested against numpy fairly easily though.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
1032829729 https://github.com/pydata/xarray/pull/6059#issuecomment-1032829729 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X849j7sh huard 81219 2022-02-08T16:44:21Z 2022-02-08T16:44:21Z CONTRIBUTOR

@mathause Should this PR be amended to account for #6108 ?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
1021975396 https://github.com/pydata/xarray/pull/6059#issuecomment-1021975396 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X8486htk bzah 16700639 2022-01-26T08:32:56Z 2022-02-07T16:57:56Z CONTRIBUTOR

FYI, weighted quantiles topic will be discussed in numpy's triage meeting of today (17:00 UTC). I'm not a maintainer but I'm sure you are welcomed to join in if you are interested. Meeting information: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
1011450968 https://github.com/pydata/xarray/pull/6059#issuecomment-1011450968 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X848SYRY huard 81219 2022-01-12T21:06:01Z 2022-01-12T21:06:01Z CONTRIBUTOR

I had the same interrogation. My guess is that the DataArray.quantile and DataArrayWeighted.quantile should have similar signatures.

The way I see it is that another PR would bring the DataArray.quantile signature up to date with numpy 1.22, which includes the 9 h types, and harmonize DataArrayWeighted.quantile with it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
1011136209 https://github.com/pydata/xarray/pull/6059#issuecomment-1011136209 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X848RLbR cjauvin 488992 2022-01-12T15:03:31Z 2022-01-12T15:03:31Z CONTRIBUTOR

@huard's latest commit modifies the algorithm so that it uses Kish's effective sample size, as described in the blog where the algorithm comes from: https://aakinshin.net/posts/kish-ess-weighted-quantiles/, which seems to solve the problem mentioned by @mathause.

Also he adds support for the interpolation types 4 to 9 (those that share a common way of computing Qp, as described here: https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
992713850 https://github.com/pydata/xarray/pull/6059#issuecomment-992713850 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X847K5x6 cjauvin 488992 2021-12-13T17:39:03Z 2021-12-13T17:39:03Z CONTRIBUTOR

@mathause About this:

I did some tries and got an unexpected result:

```python data = xr.DataArray([0, 1, 2, 3]) weights = xr.DataArray([1, 0, 1, 0]) data.weighted(weights).quantile([0.75])

np.quantile([0, 2], 0.75) ```

Can you double-check? Or do I misunderstand something?

My latest commit should fix (and test) this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104
991714854 https://github.com/pydata/xarray/pull/6059#issuecomment-991714854 https://api.github.com/repos/pydata/xarray/issues/6059 IC_kwDOAMm_X847HF4m cjauvin 488992 2021-12-11T17:08:55Z 2021-12-11T17:08:55Z CONTRIBUTOR

@mathause Thanks for the many excellent suggestions! After having removed the for loop the way you suggested, I tried to address this:

The algorithm is quite clever but it multiplies all elements (except 2) with 0 - this could maybe be sped up by only using the relevant elements.

At first I thought that something like this could work:

python w = np.diff(v) nz = np.nonzero(w) d2 = np.tile(data, (n, 1)) r = w[nz] * d2[nz] r = r[::2] + r[1::2]

The problem however is that it turns out that w's rows sometimes have one element only, instead of two (which is when an h coincides exactly with a weight value, instead of lying between two). Given that difficulty, my impression is that it's not really solvable, or at least not in a way that would result in a more efficient version.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Weighted quantile 1076265104

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 1523.981ms · About: xarray-datasette