issue_comments


4 rows where issue = 223231729 sorted by updated_at descending

475573383 · stale[bot] (26384082) · NONE
html_url: https://github.com/pydata/xarray/issues/1379#issuecomment-475573383
issue_url: https://api.github.com/repos/pydata/xarray/issues/1379
node_id: MDEyOklzc3VlQ29tbWVudDQ3NTU3MzM4Mw==
created_at: 2019-03-22T10:45:41Z · updated_at: 2019-03-22T10:45:41Z
issue: xr.concat consuming too much resources (223231729)
reactions: none

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically.
295993132 · rafa-guedes (7799184) · CONTRIBUTOR
html_url: https://github.com/pydata/xarray/issues/1379#issuecomment-295993132
issue_url: https://api.github.com/repos/pydata/xarray/issues/1379
node_id: MDEyOklzc3VlQ29tbWVudDI5NTk5MzEzMg==
created_at: 2017-04-21T00:54:28Z · updated_at: 2017-04-21T10:05:27Z
issue: xr.concat consuming too much resources (223231729)
reactions: none

I realised that some of the Datasets I was trying to concatenate had different coordinate values (for coordinates I had assumed to be the same), so I guess xr.concat was trying to align these coordinates before concatenating, and the resulting Dataset ended up much larger than it should have been. When I ensure I only concatenate Datasets with consistent coordinates, the concatenation works.

Even so, resource consumption is still quite high compared to doing the same thing with numpy arrays: memory usage increased by 42% with xr.concat (against 6% with np.concatenate), and the whole process took about 4 times longer.
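
A minimal sketch of the coordinate-alignment expansion this comment describes, with made-up datasets whose x coordinates only partially overlap (the dataset names, sizes, and variable v are illustrative, not from the thread):

import numpy as np
import xarray as xr

# Two datasets whose "x" coordinates only partially overlap.
ds1 = xr.Dataset({"v": ("x", np.zeros(1000))}, coords={"x": np.arange(1000)})
ds2 = xr.Dataset({"v": ("x", np.zeros(1000))}, coords={"x": np.arange(500, 1500)})

# By default xr.concat aligns inputs with an outer join, so the result spans
# the union of both coordinates (1500 points) and is padded with NaN where a
# dataset has no value: larger than either input, as reported above.
combined = xr.concat([ds1, ds2], dim="time")
print(combined.sizes)  # {'time': 2, 'x': 1500}

# A quick consistency check before concatenating surfaces the mismatch early.
print(ds1.x.equals(ds2.x))  # False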
296111630 · shoyer (1217238) · MEMBER
html_url: https://github.com/pydata/xarray/issues/1379#issuecomment-296111630
issue_url: https://api.github.com/repos/pydata/xarray/issues/1379
node_id: MDEyOklzc3VlQ29tbWVudDI5NjExMTYzMA==
created_at: 2017-04-21T07:42:45Z · updated_at: 2017-04-21T07:42:45Z
issue: xr.concat consuming too much resources (223231729)
reactions: none

Alignment and broadcasting mean that xarray.concat is inherently going to be slower than np.concatenate. But little effort has gone into optimizing it, so it is quite likely that performance could be improved.

My guess is that some combination of automatic alignment and/or broadcasting in concat is causing the exploding memory usage here. See https://github.com/pydata/xarray/issues/1354 for related discussion; contributions would certainly be welcome here.
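
One way to surface such a mismatch up front instead of paying for the automatic outer join, sketched with the join keyword of xr.concat; this parameter postdates the comments here, so treat its availability as an assumption about your xarray version:

import numpy as np
import xarray as xr

ds1 = xr.Dataset({"v": ("x", np.zeros(3))}, coords={"x": [0, 1, 2]})
ds2 = xr.Dataset({"v": ("x", np.zeros(3))}, coords={"x": [1, 2, 3]})

# join="exact" raises on misaligned indexes instead of silently padding to
# their union, turning the memory blow-up into an immediate error.
try:
    xr.concat([ds1, ds2], dim="time", join="exact")
except ValueError as err:
    print("coordinate mismatch:", err)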
295970641 · rafa-guedes (7799184) · CONTRIBUTOR
html_url: https://github.com/pydata/xarray/issues/1379#issuecomment-295970641
issue_url: https://api.github.com/repos/pydata/xarray/issues/1379
node_id: MDEyOklzc3VlQ29tbWVudDI5NTk3MDY0MQ==
created_at: 2017-04-20T23:41:38Z · updated_at: 2017-04-20T23:41:38Z
issue: xr.concat consuming too much resources (223231729)
reactions: +1 (1)

Also, reading all the Datasets into a list and then concatenating the whole list at once blows memory up.
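
A lazy alternative to eagerly collecting every Dataset in a list before concatenating, assuming the inputs are netCDF files and the optional dask dependency is installed (the file pattern and dimension name are hypothetical):

import xarray as xr

# open_mfdataset opens the files as dask-backed arrays and concatenates them
# lazily, building a task graph instead of materializing everything at once.
ds = xr.open_mfdataset("data_*.nc", combine="nested", concat_dim="time")

result = ds.mean("time").compute()  # data is only read here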

Table schema

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
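
For reference, the row filter shown at the top of this page can be reproduced against this schema with a sketch like the following (the database filename is an assumption):

import sqlite3

conn = sqlite3.connect("github.db")  # filename is an assumption
rows = conn.execute(
    "select id, [user], created_at, body from issue_comments "
    "where issue = ? order by updated_at desc",
    (223231729,),
).fetchall()
for comment_id, user_id, created_at, body in rows:
    print(comment_id, user_id, created_at)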