issue_comments

6 rows where issue = 651101286 sorted by updated_at descending

user 3

  • Drfengze 3
  • keewis 2
  • max-sixty 1

author_association 2

  • MEMBER 3
  • NONE 3

issue 1

  • .to_xarray(): a 9Mb dataframe requires 30Gb ram · 6
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
654988562 https://github.com/pydata/xarray/issues/4203#issuecomment-654988562 https://api.github.com/repos/pydata/xarray/issues/4203 MDEyOklzc3VlQ29tbWVudDY1NDk4ODU2Mg== Drfengze 15280721 2020-07-07T16:46:38Z 2020-07-08T04:39:21Z NONE

> that's only the short repr, the values are not modified:
>
>     In [5]: da.lat
>     Out[5]:
>     <xarray.DataArray 'lat' (lat: 16100)>
>     array([37.49944, 37.5004 , 37.50135, ..., 43.1014 , 43.10143, 43.10144])
>     Coordinates:
>       * lat      (lat) float64 37.5 37.5 37.5 37.5 37.5 ... 43.1 43.1 43.1 43.1 43.1

Thanks for the help! I found that sparse grids are not easy to plot, so I changed my code to follow the Colab code, which is similar to the 'rasm' example in xarray. It would be helpful if the tutorial showed how to create example datasets like this one (beyond the toy weather example).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  .to_xarray(): a 9Mb dataframe requires 30Gb ram  651101286
654688921 https://github.com/pydata/xarray/issues/4203#issuecomment-654688921 https://api.github.com/repos/pydata/xarray/issues/4203 MDEyOklzc3VlQ29tbWVudDY1NDY4ODkyMQ== keewis 14808389 2020-07-07T08:30:35Z 2020-07-07T15:38:11Z MEMBER

that's only the short repr, the values are not modified:

    In [5]: da.lat
    Out[5]:
    <xarray.DataArray 'lat' (lat: 16100)>
    array([37.49944, 37.5004 , 37.50135, ..., 43.1014 , 43.10143, 43.10144])
    Coordinates:
      * lat      (lat) float64 37.5 37.5 37.5 37.5 37.5 ... 43.1 43.1 43.1 43.1 43.1

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  .to_xarray(): a 9Mb dataframe requires 30Gb ram  651101286
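
The point about the short repr can be illustrated with plain Python floats. This is only an analogy under stated assumptions, not xarray's actual formatting code: the three values are the first entries of the array printed above, and the 4-significant-digit format is chosen here just to reproduce the abbreviated `37.5` display.

```python
# Stdlib-only analogy: abbreviating a number for display does not
# change the value that is stored.
lat_values = [37.49944, 37.5004, 37.50135]  # first values of the array above

# Build display strings with 4 significant digits (an illustrative choice).
short = [format(v, ".4g") for v in lat_values]

print(short)       # abbreviated strings, for display only
print(lat_values)  # the stored floats keep their full precision
```

The abbreviated strings are identical for all three entries, while the underlying floats still differ, which is the situation described in the comment above.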
654568514 https://github.com/pydata/xarray/issues/4203#issuecomment-654568514 https://api.github.com/repos/pydata/xarray/issues/4203 MDEyOklzc3VlQ29tbWVudDY1NDU2ODUxNA== Drfengze 15280721 2020-07-07T02:56:22Z 2020-07-07T02:56:22Z NONE

> thanks, that helps. First of all (unless I did something wrong with the read_csv call), there's an Unnamed: 0 column that has to be removed.
>
> Other than that, your data seems to be quite sparse so that's an ideal fit for sparse:
>
>     In [38]: %%time
>         ...: df = pd.read_csv("/tmp/data.csv")
>         ...: a = df.drop("Unnamed: 0", axis=1).set_index(["lat", "lon"])
>         ...: a = a.stack()
>         ...: a.index.names = ["lat", "lon", "time"]
>         ...: a = a.sort_index()
>         ...: a.name = "T"
>         ...: xr.DataArray.from_series(a, sparse=True)
>         ...:
>         ...:
>     CPU times: user 606 ms, sys: 63.9 ms, total: 670 ms
>     Wall time: 670 ms
>     Out[38]:
>     <xarray.DataArray 'T' (lat: 16100, lon: 29959, time: 31)>
>     <COO: shape=(16100, 29959, 31), dtype=float64, nnz=1003191, fill_value=nan>
>     Coordinates:
>       * lat      (lat) float64 37.5 37.5 37.5 37.5 37.5 ... 43.1 43.1 43.1 43.1 43.1
>       * lon      (lon) float64 96.46 96.46 96.46 96.47 ... 102.6 102.6 102.6 102.6
>       * time     (time) object '2011-01-01 00:00:00' ... '2011-01-31 00:00:00'

Thanks for your code! I noticed that only two decimal places are kept for lat and lon. Does that mean the data was resampled? The data is a grid with 0.005 degree resolution; can I keep that resolution in the result?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  .to_xarray(): a 9Mb dataframe requires 30Gb ram  651101286
654210829 https://github.com/pydata/xarray/issues/4203#issuecomment-654210829 https://api.github.com/repos/pydata/xarray/issues/4203 MDEyOklzc3VlQ29tbWVudDY1NDIxMDgyOQ== keewis 14808389 2020-07-06T12:42:43Z 2020-07-06T12:42:43Z MEMBER

thanks, that helps. First of all (unless I did something wrong with the read_csv call), there's an Unnamed: 0 column that has to be removed.

Other than that, your data seems to be quite sparse so that's an ideal fit for sparse:

    In [38]: %%time
        ...: df = pd.read_csv("/tmp/data.csv")
        ...: a = df.drop("Unnamed: 0", axis=1).set_index(["lat", "lon"])
        ...: a = a.stack()
        ...: a.index.names = ["lat", "lon", "time"]
        ...: a = a.sort_index()
        ...: a.name = "T"
        ...: xr.DataArray.from_series(a, sparse=True)
        ...:
        ...:
    CPU times: user 606 ms, sys: 63.9 ms, total: 670 ms
    Wall time: 670 ms
    Out[38]:
    <xarray.DataArray 'T' (lat: 16100, lon: 29959, time: 31)>
    <COO: shape=(16100, 29959, 31), dtype=float64, nnz=1003191, fill_value=nan>
    Coordinates:
      * lat      (lat) float64 37.5 37.5 37.5 37.5 37.5 ... 43.1 43.1 43.1 43.1 43.1
      * lon      (lon) float64 96.46 96.46 96.46 96.47 ... 102.6 102.6 102.6 102.6
      * time     (time) object '2011-01-01 00:00:00' ... '2011-01-31 00:00:00'

{
    "total_count": 3,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  .to_xarray(): a 9Mb dataframe requires 30Gb ram  651101286
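
A rough back-of-the-envelope check of why a sparse representation suits this data, using only the shape and `nnz` reported in the repr above. The per-element cost assumed for the COO layout (one 8-byte integer index per dimension plus the 8-byte value) is an assumption for illustration; the actual index dtype used by the sparse library may differ.

```python
# Shape and number of stored values, copied from the repr above.
shape = (16100, 29959, 31)
nnz = 1003191

itemsize = 8  # bytes per float64 value

# A dense array allocates one value for every (lat, lon, time) combination.
dense_elements = shape[0] * shape[1] * shape[2]
dense_bytes = dense_elements * itemsize

# Assumption: COO keeps one 8-byte index per dimension plus the value
# for each stored element.
coo_bytes = nnz * (len(shape) * 8 + itemsize)

# Fraction of grid cells that actually hold a value.
density = nnz / dense_elements

print(f"dense elements: {dense_elements:,}")
print(f"dense size:     {dense_bytes / 1e9:.1f} GB")
print(f"COO size:       {coo_bytes / 1e6:.1f} MB")
print(f"density:        {density:.2e}")
```

Under these assumptions the dense array is several thousand times larger than the sparse one, because fewer than one grid cell in ten thousand holds a value.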
654182532 https://github.com/pydata/xarray/issues/4203#issuecomment-654182532 https://api.github.com/repos/pydata/xarray/issues/4203 MDEyOklzc3VlQ29tbWVudDY1NDE4MjUzMg== Drfengze 15280721 2020-07-06T11:41:43Z 2020-07-06T11:41:43Z NONE

> Please could you fill out the issue template, including a reproducible example? A CSV could be OK if you include the reproduction steps.

Thank you, updated.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  .to_xarray(): a 9Mb dataframe requires 30Gb ram  651101286
653920377 https://github.com/pydata/xarray/issues/4203#issuecomment-653920377 https://api.github.com/repos/pydata/xarray/issues/4203 MDEyOklzc3VlQ29tbWVudDY1MzkyMDM3Nw== max-sixty 5635139 2020-07-05T18:08:46Z 2020-07-05T18:08:46Z MEMBER

Please could you fill out the issue template, including a reproducible example? A CSV could be OK if you include the reproduction steps.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  .to_xarray(): a 9Mb dataframe requires 30Gb ram  651101286

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
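
The view described at the top of this page ("rows where issue = 651101286 sorted by updated_at descending") can be sketched against the schema above with Python's built-in `sqlite3` module. This is a minimal sketch, not the exact SQL Datasette runs: the in-memory database and the query text are assumptions, and only three of the six rows are inserted, with ids, user ids, and timestamps copied from the table above.

```python
import sqlite3

# In-memory database using the schema shown above. SQLite does not enforce
# foreign keys by default, so the referenced tables are not needed here.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [html_url] TEXT,
       [issue_url] TEXT,
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [user] INTEGER REFERENCES [users]([id]),
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [issue] INTEGER REFERENCES [issues]([id])
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    CREATE INDEX [idx_issue_comments_user]
        ON [issue_comments] ([user]);
    """
)

# Three of the rows listed above: (id, user, created_at, updated_at,
# author_association, issue).
rows = [
    (653920377, 5635139, "2020-07-05T18:08:46Z", "2020-07-05T18:08:46Z", "MEMBER", 651101286),
    (654988562, 15280721, "2020-07-07T16:46:38Z", "2020-07-08T04:39:21Z", "NONE", 651101286),
    (654688921, 14808389, "2020-07-07T08:30:35Z", "2020-07-07T15:38:11Z", "MEMBER", 651101286),
]
conn.executemany(
    "INSERT INTO [issue_comments] "
    "([id], [user], [created_at], [updated_at], [author_association], [issue]) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# Filter by issue and sort by updated_at, newest first. ISO 8601 timestamps
# in the same format sort correctly as plain text.
result = conn.execute(
    "SELECT [id], [updated_at] FROM [issue_comments] "
    "WHERE [issue] = ? ORDER BY [updated_at] DESC",
    (651101286,),
).fetchall()

for comment_id, updated_at in result:
    print(comment_id, updated_at)
```

The `idx_issue_comments_issue` index is what lets a filter on `issue` avoid scanning the whole table.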
Powered by Datasette · Queries took 10.703ms · About: xarray-datasette