issue_comments

4 rows where user = 56583917 sorted by updated_at descending

issue 2

  • Aggregating a dimension using the Quantiles method with `skipna=True` is very slow 3
  • Improve docstrings for better discoverability 1

user 1

  • maawoo · 4

author_association 1

  • CONTRIBUTOR 4
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1406463669 https://github.com/pydata/xarray/issues/7377#issuecomment-1406463669 https://api.github.com/repos/pydata/xarray/issues/7377 IC_kwDOAMm_X85T1O61 maawoo 56583917 2023-01-27T12:45:10Z 2024-01-03T08:41:41Z CONTRIBUTOR

Hi all, I just created a simple workaround, which might be useful for others:
https://gist.github.com/maawoo/0b34d371c3cc1960a1589ccaded868c2

It uses the _nan_quantile method of xclim and works fine for my applications. Here is a quick comparison using the same example data as in my initial post:

EDIT: I've updated the code to use numbagg instead of xclim.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  Aggregating a dimension using the Quantiles method with `skipna=True` is very slow 1497031605
1362998362 https://github.com/pydata/xarray/issues/7377#issuecomment-1362998362 https://api.github.com/repos/pydata/xarray/issues/7377 IC_kwDOAMm_X85RPbRa maawoo 56583917 2022-12-22T15:52:31Z 2022-12-22T15:52:31Z CONTRIBUTOR

Thanks @arongergely! I have mentioned the numpy issue in my post above (FYI, for anyone looking for it). I was really surprised to see that it's over 2 years old and that this is now the first Xarray issue referencing it. If it's really a "well known" issue, I think it should have been somehow mentioned in the Xarray quantiles method.

I have seen the blog post and tried to use the workaround with apply_ufunc and Dask but ran into some problems. I'll revisit that when I have some time and will also check xclim. Seems to be very promising, thanks!

Happy holidays! 🎄

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Aggregating a dimension using the Quantiles method with `skipna=True` is very slow 1497031605
1357341994 https://github.com/pydata/xarray/issues/7378#issuecomment-1357341994 https://api.github.com/repos/pydata/xarray/issues/7378 IC_kwDOAMm_X85Q52Uq maawoo 56583917 2022-12-19T09:21:05Z 2022-12-19T09:21:05Z CONTRIBUTOR

Hi @TomNicholas, thanks for your response! I'd be happy to contribute and will see what I can do once I find some spare time 🙂

{
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 2,
    "rocket": 0,
    "eyes": 0
}
  Improve docstrings for better discoverability 1497131525
1351814726 https://github.com/pydata/xarray/issues/7377#issuecomment-1351814726 https://api.github.com/repos/pydata/xarray/issues/7377 IC_kwDOAMm_X85Qkw5G maawoo 56583917 2022-12-14T17:23:10Z 2022-12-14T17:23:10Z CONTRIBUTOR

This issue has an extra layer of evilness: users will also run into it when they don't specify the skipna parameter at all and their data has a float dtype, as in my example dummy data: `da.quantile(0.95, dim='time')`

The documentation could be a little bit clearer. In my opinion, the fact that `skipna=True` is the default for float dtypes is easy to overlook:

> If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes;
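
To make the quoted default concrete, here is a minimal pure-Python sketch of what the `skipna` setting decides. It is not xarray's implementation: `quantile_1d` is a hypothetical helper, it handles only a flat sequence, and it approximates "float dtype" by checking whether any value is a Python float. The point is only that leaving `skipna` unset on float data behaves like `skipna=True`.

```python
import math


def quantile_1d(values, q, skipna=None):
    """Linear-interpolation quantile of a flat sequence (hypothetical helper).

    skipna=None mimics the documented default: NaNs are skipped only
    when the data is float-typed (approximated here by "contains a float").
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")

    values = list(values)

    def is_nan(v):
        return isinstance(v, float) and math.isnan(v)

    if skipna is None:
        # Assumption: stand-in for a dtype check on a real array.
        skipna = any(isinstance(v, float) for v in values)

    if skipna:
        # NaN-skipping path: drop missing values before ranking.
        values = [v for v in values if not is_nan(v)]
    elif any(is_nan(v) for v in values):
        # NaN-propagating path: a single NaN makes the result NaN.
        return math.nan

    if not values:
        return math.nan

    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = math.floor(pos)
    upper = math.ceil(pos)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


data = [1.0, 2.0, float("nan"), 4.0]

print(quantile_1d(data, 0.5))                # 2.0 (NaN skipped by default)
print(quantile_1d(data, 0.5, skipna=True))   # 2.0 (same as the default)
print(quantile_1d(data, 0.5, skipna=False))  # nan (NaN propagates)
```

With a real `DataArray`, passing `skipna` explicitly in the `quantile` call makes the chosen behaviour visible to the reader instead of relying on the dtype-dependent default.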

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Aggregating a dimension using the Quantiles method with `skipna=True` is very slow 1497031605

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);