issues

2 rows where state = "closed", type = "pull" and user = 488992 sorted by updated_at descending

id node_id number title user state locked assignee milestone comments created_at updated_at ▲ closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1076265104 PR_kwDOAMm_X84vpj53 6059 Weighted quantile cjauvin 488992 closed 0     19 2021-12-10T01:11:36Z 2022-03-27T20:36:22Z 2022-03-27T20:36:22Z CONTRIBUTOR   0 pydata/xarray/pulls/6059
  • [x] Tests added
  • [x] Passes pre-commit run --all-files
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst

This is a follow-up to https://github.com/pydata/xarray/pull/5870/, which adds a weighted quantile function.

The question of how to precisely define the weighted quantile function is surprisingly complex, and this implementation offers a compromise in terms of simplicity and compatibility:

  • The only interpolation method supported is the so-called "Type 7", as explained in https://aakinshin.net/posts/weighted-quantiles/, which proposes an R implementation that I have adapted.
  • Type 7 appears to be the most "popular" method, at least in the Python world: it corresponds to the default linear interpolation option of numpy.quantile (https://numpy.org/doc/stable/reference/generated/numpy.quantile.html), which is also the basis of xarray's existing non-weighted quantile function.
  • I have taken care to ensure that the results of this new function, with equal weights, are equivalent to those of the existing non-weighted function (when used with its default interpolation option).
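
To make the "Type 7" definition concrete, here is a minimal pure-Python sketch of a weighted Type 7 quantile in the spirit of the blog post linked above. It is not the code in this PR: the function name `weighted_quantile_type7` is hypothetical, and the use of Kish's effective sample size and a uniform CDF window follows my reading of that post.

```python
def weighted_quantile_type7(values, weights, q):
    """Sketch of a weighted "Type 7" quantile (hypothetical helper, not xarray's code)."""
    # Drop zero-weight observations and sort the rest by value.
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    if not pairs:
        raise ValueError("at least one positive weight is required")

    total = sum(w for _, w in pairs)
    # Kish's effective sample size; equals len(values) when all weights are equal.
    n_eff = total**2 / sum(w * w for _, w in pairs)

    # Type 7 position h = q * (n - 1) + 1, with n replaced by n_eff.
    h = q * (n_eff - 1) + 1

    def cdf(t):
        # CDF of the uniform distribution on [(h - 1) / n_eff, h / n_eff].
        return min(max(t * n_eff - (h - 1), 0.0), 1.0)

    result = 0.0
    running = 0.0
    previous = cdf(0.0)
    for v, w in pairs:
        running += w
        current = cdf(running / total)
        # Each sorted value contributes the CDF mass that falls in its weight interval.
        result += v * (current - previous)
        previous = current
    return result


# With equal weights this reduces to ordinary linear interpolation.
print(weighted_quantile_type7([1, 2, 3, 4], [1, 1, 1, 1], 0.5))  # 2.5
```

With equal weights the effective sample size is just the number of observations, so the sketch reproduces the default linear interpolation of numpy.quantile, which is the equivalence described in the last bullet.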

The interpolation question is so complex and confusing that entire articles have been written about it, as mentioned in the blog post above. One in particular establishes the "nine types" taxonomy that many software packages use, implicitly or explicitly: https://doi.org/10.2307/2684934.

The situation seems even more complex in the NumPy world, where many discussions and suggestions are aimed toward trying to improve the consistency of the API. The current non-weighted situation has the 9 options, as well as 4 extra legacy ones: https://github.com/numpy/numpy/blob/376ad691fe4df77e502108d279872f56b30376dc/numpy/lib/function_base.py#L4177-L4203

This PR cuts the Gordian knot by offering only one interpolation option. However, because its implementation is based on apply_ufunc (much like xarray's existing non-weighted quantile function, which uses apply_ufunc with np.quantile), it would be very easy to swap in np.quantile if it ever gains a weights keyword argument. xarray's weighted quantile would then lose a little bit of code and gain a plethora of interpolation options.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6059/reactions",
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 1
}
    xarray 13221727 pull
1027640127 PR_kwDOAMm_X84tQwrV 5870 Add var and std to weighted computations cjauvin 488992 closed 0     8 2021-10-15T17:13:31Z 2022-01-04T21:20:58Z 2021-10-28T11:02:54Z CONTRIBUTOR   0 pydata/xarray/pulls/5870
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst
  • [x] New functions/methods are listed in api.rst

This follows https://github.com/pydata/xarray/pull/2922 to add var, std and sum_of_squares to DataArray.weighted and Dataset.weighted. I would also like to add weighted quantile, eventually.
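
For reference, the three quantities can be sketched in plain Python as follows. This is only an illustration, not the code added by this PR: the helper name `weighted_stats` is hypothetical, and the sketch assumes the biased (population-style) definition in which the weighted sum of squared deviations is divided by the sum of the weights.

```python
import math


def weighted_stats(values, weights):
    """Sketch of weighted sum_of_squares, var and std (hypothetical helper)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")

    # Weighted mean: sum(w * x) / sum(w).
    mean = sum(w * x for x, w in zip(values, weights)) / total

    # Weighted sum of squared deviations from the weighted mean.
    sum_of_squares = sum(w * (x - mean) ** 2 for x, w in zip(values, weights))

    # Assumed definition: variance is the sum of squares divided by the sum of weights.
    var = sum_of_squares / total
    std = math.sqrt(var)
    return sum_of_squares, var, std


sum_of_squares, var, std = weighted_stats([1.0, 2.0, 3.0], [1.0, 1.0, 2.0])
print(sum_of_squares, var, std)
```

With equal weights, `var` under this assumption reduces to the ordinary population variance (zero delta degrees of freedom).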

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5870/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    xarray 13221727 pull

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);