home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

7 rows where author_association = "NONE", issue = 1495605827 and user = 1419010 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • ravwojdyla · 7 ✖

issue 1

  • groupby+map performance regression on MultiIndex dataset · 7 ✖

author_association 1

  • NONE · 7 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1353074006 https://github.com/pydata/xarray/issues/7376#issuecomment-1353074006 https://api.github.com/repos/pydata/xarray/issues/7376 IC_kwDOAMm_X85QpkVW ravwojdyla 1419010 2022-12-15T13:33:44Z 2022-12-15T13:33:44Z NONE

@benbovy thanks for the context and the PR #7382, exciting to see the improvement!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby+map performance regression on MultiIndex dataset 1495605827
1352434183 https://github.com/pydata/xarray/issues/7376#issuecomment-1352434183 https://api.github.com/repos/pydata/xarray/issues/7376 IC_kwDOAMm_X85QnIIH ravwojdyla 1419010 2022-12-15T01:18:32Z 2022-12-15T01:18:32Z NONE

Thanks @benbovy! Are you also aware of the issue with plain assign being slower on MultiIndex (comment above: https://github.com/pydata/xarray/issues/7376#issuecomment-1350446546)? Do you know what could be the issue there by any chance?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby+map performance regression on MultiIndex dataset 1495605827
1351540523 https://github.com/pydata/xarray/issues/7376#issuecomment-1351540523 https://api.github.com/repos/pydata/xarray/issues/7376 IC_kwDOAMm_X85Qjt8r ravwojdyla 1419010 2022-12-14T14:40:18Z 2022-12-14T19:47:12Z NONE

👋 @benbovy thanks for the update. Looking at https://github.com/pydata/xarray/pull/5692, it must have been a huge effort, thank you for your work on that! Coming back to this issue, in the example above the version 2022.6.0 is about 600x slower, in our internal code, the code would not finish in a reasonable time, so that forced us to downgrade to 2022.3.0. Are you aware of any workarounds for this issue with the current code (assuming I would like to preserve MultiIndex).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby+map performance regression on MultiIndex dataset 1495605827
1350446546 https://github.com/pydata/xarray/issues/7376#issuecomment-1350446546 https://api.github.com/repos/pydata/xarray/issues/7376 IC_kwDOAMm_X85Qfi3S ravwojdyla 1419010 2022-12-14T06:03:15Z 2022-12-14T06:06:02Z NONE

FYI this might warrant a separate issue(?), but an assign of a new DataArray e.g.: ds.assign(foo=~ds["d3"]) is also a couple of times (e.g. on 4M elements, same keys as above, ~7x slower) slower since 2022.6.0 (same commit).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby+map performance regression on MultiIndex dataset 1495605827
1350390046 https://github.com/pydata/xarray/issues/7376#issuecomment-1350390046 https://api.github.com/repos/pydata/xarray/issues/7376 IC_kwDOAMm_X85QfVEe ravwojdyla 1419010 2022-12-14T04:44:04Z 2022-12-14T04:52:26Z NONE

And just want to point out that the stacktraces/profile look very different between 2022.3.0 and main/latest. Looks like https://github.com/pydata/xarray/blob/021c73e12cccb06c017ce6420dd043a0cfbf9f08/xarray/core/indexes.py#L185 might be fairly expensive operation. Separately there seem to be quite a bit of time spend in copy -> copy_indexes path (deep copy?).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby+map performance regression on MultiIndex dataset 1495605827
1350378571 https://github.com/pydata/xarray/issues/7376#issuecomment-1350378571 https://api.github.com/repos/pydata/xarray/issues/7376 IC_kwDOAMm_X85QfSRL ravwojdyla 1419010 2022-12-14T04:22:28Z 2022-12-14T04:27:52Z NONE

3ead17ea9e99283e2511b65b9d864d1c7b10b3c4 (https://github.com/pydata/xarray/pull/5692) seems to be the commit that introduced this regression (cc: @benbovy)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby+map performance regression on MultiIndex dataset 1495605827
1350366220 https://github.com/pydata/xarray/issues/7376#issuecomment-1350366220 https://api.github.com/repos/pydata/xarray/issues/7376 IC_kwDOAMm_X85QfPQM ravwojdyla 1419010 2022-12-14T04:04:16Z 2022-12-14T04:04:16Z NONE

Also recorded py-spy flamegraphs and exported them in speedscope format at: https://gist.github.com/ravwojdyla/3b791debd3f97707d84748446dc07e39, you can view them in https://www.speedscope.app/

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby+map performance regression on MultiIndex dataset 1495605827

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 14.405ms · About: xarray-datasette