
issue_comments


3 rows where author_association = "MEMBER" and issue = 651101286 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
654688921 https://github.com/pydata/xarray/issues/4203#issuecomment-654688921 https://api.github.com/repos/pydata/xarray/issues/4203 MDEyOklzc3VlQ29tbWVudDY1NDY4ODkyMQ== keewis 14808389 2020-07-07T08:30:35Z 2020-07-07T15:38:11Z MEMBER

that's only the short repr, the values are not modified:

```python
In [5]: da.lat
Out[5]:
<xarray.DataArray 'lat' (lat: 16100)>
array([37.49944, 37.5004 , 37.50135, ..., 43.1014 , 43.10143, 43.10144])
Coordinates:
  * lat      (lat) float64 37.5 37.5 37.5 37.5 37.5 ... 43.1 43.1 43.1 43.1 43.1
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  .to_xarray(): a 9Mb dataframe requires 30Gb ram  651101286
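The point in the comment above — that the short repr only rounds values for display — can be illustrated with plain NumPy. This is a minimal sketch, with the sample `lat` values taken from the quoted output:

```python
import numpy as np

# A few of the lat coordinate values quoted in the comment above
lat = np.array([37.49944, 37.5004, 37.50135])

# Formatting with reduced precision rounds for display only...
print(np.array2string(lat, precision=1))  # [37.5 37.5 37.5]

# ...while the underlying values keep full precision
print(lat[0])  # 37.49944
```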
654210829 https://github.com/pydata/xarray/issues/4203#issuecomment-654210829 https://api.github.com/repos/pydata/xarray/issues/4203 MDEyOklzc3VlQ29tbWVudDY1NDIxMDgyOQ== keewis 14808389 2020-07-06T12:42:43Z 2020-07-06T12:42:43Z MEMBER

thanks, that helps. First of all (unless I did something wrong with the read_csv call), there's an Unnamed: 0 column that has to be removed.

Other than that, your data seems to be quite sparse, so that's an ideal fit for sparse:

```python
In [38]: %%time
    ...: df = pd.read_csv("/tmp/data.csv")
    ...: a = df.drop("Unnamed: 0", axis=1).set_index(["lat", "lon"])
    ...: a = a.stack()
    ...: a.index.names = ["lat", "lon", "time"]
    ...: a = a.sort_index()
    ...: a.name = "T"
    ...: xr.DataArray.from_series(a, sparse=True)
    ...:
CPU times: user 606 ms, sys: 63.9 ms, total: 670 ms
Wall time: 670 ms
Out[38]:
<xarray.DataArray 'T' (lat: 16100, lon: 29959, time: 31)>
<COO: shape=(16100, 29959, 31), dtype=float64, nnz=1003191, fill_value=nan>
Coordinates:
  * lat      (lat) float64 37.5 37.5 37.5 37.5 37.5 ... 43.1 43.1 43.1 43.1 43.1
  * lon      (lon) float64 96.46 96.46 96.46 96.47 ... 102.6 102.6 102.6 102.6
  * time     (time) object '2011-01-01 00:00:00' ... '2011-01-31 00:00:00'
```

{
    "total_count": 3,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  .to_xarray(): a 9Mb dataframe requires 30Gb ram  651101286
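A back-of-the-envelope estimate using the shape and nnz reported in the Out[38] above shows why the dense result of a plain `.to_xarray()` blows up in memory while the COO-backed sparse version stays small. This is a hedged sketch: the byte counts are approximations of NumPy/sparse storage, not measured figures from the issue:

```python
import numpy as np

# Shape and nonzero count reported in the comment above
shape = (16100, 29959, 31)
nnz = 1_003_191

# Dense float64 array: one 8-byte value per cell
dense_bytes = int(np.prod(shape)) * 8

# COO storage: roughly one 8-byte value plus one 8-byte index
# per dimension per nonzero (an approximation of sparse.COO's layout)
coo_bytes = nnz * (8 + 8 * len(shape))

print(f"dense:  {dense_bytes / 1e9:.1f} GB")  # ~119.6 GB
print(f"sparse: {coo_bytes / 1e6:.1f} MB")    # ~32.1 MB
```

The densified array is several orders of magnitude larger than the underlying data, which is why `from_series(..., sparse=True)` finishes in well under a second here.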
653920377 https://github.com/pydata/xarray/issues/4203#issuecomment-653920377 https://api.github.com/repos/pydata/xarray/issues/4203 MDEyOklzc3VlQ29tbWVudDY1MzkyMDM3Nw== max-sixty 5635139 2020-07-05T18:08:46Z 2020-07-05T18:08:46Z MEMBER

Please could you fill out the issue template, including a reproducible example? A CSV could be OK if you include the reproduction steps.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  .to_xarray(): a 9Mb dataframe requires 30Gb ram  651101286

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 14.847ms · About: xarray-datasette