issue_comments


3 rows where issue = 1223031600 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1116350454 https://github.com/pydata/xarray/issues/6561#issuecomment-1116350454 https://api.github.com/repos/pydata/xarray/issues/6561 IC_kwDOAMm_X85Ciif2 max-sixty 5635139 2022-05-03T17:19:23Z 2022-05-03T17:19:23Z MEMBER

> Thanks for the feedback and explanation. It seems the poorly constructed netCDF file is fundamentally to blame for triggering this behavior. A warning is a good idea, though.

I'm not sure it's necessarily poorly constructed. Structuring data like this can be quite useful: having aligned data of different dimensions in a single dataset is great. But the same property that makes a dataset a good format for this data makes a single table a bad one.

Probably what we'd want is to_dataframes(), which would create a dataframe for each combination of dimensions...
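
To make the idea concrete, here is a minimal sketch of what such a helper might look like. Note that `to_dataframes()` is not an existing xarray method; the function below is a hypothetical illustration that groups data variables by their tuple of dimensions and calls the existing `Dataset.to_dataframe()` once per group. The toy dataset only borrows the variable and dimension names from the issue; its sizes and values are made up.

```python
from collections import defaultdict

import numpy as np
import xarray as xr


def to_dataframes(ds):
    """Hypothetical helper: one DataFrame per distinct tuple of dimensions."""
    groups = defaultdict(list)
    for name, var in ds.data_vars.items():
        groups[var.dims].append(name)
    # Skip 0-d variables: they have no dimension to use as an index.
    return {
        dims: ds[names].to_dataframe()
        for dims, names in groups.items()
        if dims
    }


# Toy stand-in for the file in the issue: two variables on unrelated dimensions.
ds = xr.Dataset(
    {
        "lastChild": ("station", np.arange(3.0)),
        "snowfall_amount": ("recNum", np.zeros(4)),
    }
)

frames = to_dataframes(ds)
for dims, frame in frames.items():
    print(dims, frame.shape)
```

Each frame keeps the length of its own dimension (3 and 4 rows here), whereas `ds.to_dataframe()` on the same toy dataset broadcasts both variables to 3 × 4 = 12 rows.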

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory consumption by to_dataframe() 1223031600
1116344892 https://github.com/pydata/xarray/issues/6561#issuecomment-1116344892 https://api.github.com/repos/pydata/xarray/issues/6561 IC_kwDOAMm_X85CihI8 sgdecker 8419421 2022-05-03T17:13:02Z 2022-05-03T17:13:02Z NONE

Thanks for the feedback and explanation. It seems the poorly constructed netCDF file is fundamentally to blame for triggering this behavior. A warning is a good idea, though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory consumption by to_dataframe() 1223031600
1115419268 https://github.com/pydata/xarray/issues/6561#issuecomment-1115419268 https://api.github.com/repos/pydata/xarray/issues/6561 IC_kwDOAMm_X85Ce_KE max-sixty 5635139 2022-05-02T22:09:40Z 2022-05-02T22:09:40Z MEMBER

Great, thanks for the example, @sgdecker.

I think this is happening because there are variables of different dimensions that are getting broadcast together:

```python
In [5]: ncdata[['lastChild']].to_dataframe()
Out[5]:
         lastChild
station
0         127265.0
1              NaN
2         127492.0
3         124019.0
4              NaN
...            ...
5016      124375.0
5017      126780.0
5018      126781.0
5019      124902.0
5020       93468.0

[5021 rows x 1 columns]

In [6]: ncdata[['lastChild','snowfall_amount']].to_dataframe()
Out[6]:
                lastChild  snowfall_amount
station recNum
0       0        127265.0              NaN
        1        127265.0              NaN
        2        127265.0              NaN
        3        127265.0              NaN
        4        127265.0              NaN
...                   ...              ...
5020    127621    93468.0              NaN
        127622    93468.0              NaN
        127623    93468.0              NaN
        127624    93468.0              NaN
        127625    93468.0              NaN

[640810146 rows x 2 columns]
```

640810146 rows is the giveaway.
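
As a quick check of where that number comes from, the row count is just the product of the two dimension lengths visible in the output above. The memory figure is a rough estimate that assumes both columns are float64 and ignores the index.

```python
import math

# Dimension sizes read off the output above:
# station runs 0..5020, recNum runs 0..127625.
sizes = {"station": 5021, "recNum": 127626}

rows = math.prod(sizes.values())
print(rows)  # 640810146

# Assumption: two float64 columns at 8 bytes each, index not counted.
bytes_for_values = rows * 2 * 8
print(f"{bytes_for_values / 1e9:.1f} GB")
```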

I'm not sure what we could do here. I don't think there's a way of producing a single 2D dataframe from variables with different dimensions without this blow-up?

We could offer a warning on this behavior beyond a certain size — we'd take a PR for that...
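
A rough sketch of what such a check might look like follows. It is not xarray code: the function name `check_frame_size` and the `max_rows` threshold are assumptions made for illustration. It accepts any mapping of dimension name to length, which is the shape of what `Dataset.sizes` provides.

```python
import math
import warnings


def check_frame_size(sizes, max_rows=100_000_000):
    """Hypothetical check: warn when the broadcast frame would be very large.

    `sizes` maps dimension name -> length; `max_rows` is an arbitrary
    illustrative threshold, not a value taken from xarray.
    """
    rows = math.prod(sizes.values())
    if rows > max_rows:
        warnings.warn(
            f"to_dataframe() would produce {rows:,} rows by broadcasting "
            f"over dimensions {tuple(sizes)}; consider converting variables "
            "with different dimensions separately.",
            UserWarning,
            stacklevel=2,
        )
    return rows


# Sizes from the example in this comment.
check_frame_size({"station": 5021, "recNum": 127626})
```

Running the check on the dimension sizes before building the frame keeps it cheap, since no data has to be loaded or broadcast to compute the row count.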

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Excessive memory consumption by to_dataframe() 1223031600


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
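
For readers who want to reproduce the view on this page, the following sketch loads the schema above into an in-memory SQLite database and runs the equivalent of "rows where issue = 1223031600 sorted by updated_at descending". The ids, user ids, and timestamps are the ones shown in the rows above; other columns are left NULL for brevity. The referenced `users` and `issues` tables are not created, which works here because SQLite does not enforce foreign keys unless asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [html_url] TEXT,
       [issue_url] TEXT,
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [user] INTEGER REFERENCES [users]([id]),
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [issue] INTEGER REFERENCES [issues]([id])
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    CREATE INDEX [idx_issue_comments_user]
        ON [issue_comments] ([user]);
    """
)

# (id, user, updated_at, issue) taken from the three rows on this page.
comments = [
    (1115419268, 5635139, "2022-05-02T22:09:40Z", 1223031600),
    (1116344892, 8419421, "2022-05-03T17:13:02Z", 1223031600),
    (1116350454, 5635139, "2022-05-03T17:19:23Z", 1223031600),
]
conn.executemany(
    "INSERT INTO issue_comments ([id], [user], [updated_at], [issue]) "
    "VALUES (?, ?, ?, ?)",
    comments,
)

# ISO 8601 timestamps stored as TEXT sort correctly as strings.
result = conn.execute(
    "SELECT [id], [updated_at] FROM issue_comments "
    "WHERE [issue] = ? ORDER BY [updated_at] DESC",
    (1223031600,),
).fetchall()
for row in result:
    print(row)
```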
Powered by Datasette · Queries took 12.568ms · About: xarray-datasette