issue_comments
15 comments on issue 323703742, "From pandas to xarray without blowing up memory" (pydata/xarray #2139), sorted by updated_at descending.
Format of each entry: author (association) | created_at | comment id, then the comment URL and body. Reactions are listed only when nonzero.
max-sixty (MEMBER) | 2020-10-14T19:34:53Z | 708616198
https://github.com/pydata/xarray/issues/2139#issuecomment-708616198
As you wish — if there's a motivating example then that has more weight, and big issues should have ample supply of motivating examples. That said, if you have something ready to go, then happy to take a look at it.
mankoff (CONTRIBUTOR) | 2020-10-14T18:52:38Z | 708594913
https://github.com/pydata/xarray/issues/2139#issuecomment-708594913
The issue is that if you pass in […] This multi-index came from a small 12 MB file: 5000 rows and 40 variables. When I then did […] Now that I've figured all this out, I don't think that any bugs exist in […]
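The blow-up described above (a 12 MB, 5000-row frame exploding on conversion) can be anticipated before calling to_xarray(): the dense result covers the full Cartesian product of the index levels. A minimal sketch of that size check, with invented column and level names:

```python
import numpy as np
import pandas as pd

# Hypothetical frame echoing the comment: 5000 rows with a two-level
# index whose levels do not form a regular grid.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {"var1": rng.random(5000)},
    index=pd.MultiIndex.from_arrays(
        [np.arange(5000), rng.integers(0, 5000, 5000)],
        names=["row", "station"],
    ),
)

# to_xarray() densifies to the Cartesian product of the levels, so the
# result can be vastly larger than the sparse input.
cells = int(np.prod([len(lev) for lev in df.index.levels]))
print(f"rows: {len(df):,}  dense cells: {cells:,}  "
      f"~{cells * 8 / 1e9:.2f} GB per float64 variable")
```

Running this kind of check first tells you whether the conversion will fit in memory at all.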
max-sixty (MEMBER) | 2020-10-14T18:23:16Z | 708579401
https://github.com/pydata/xarray/issues/2139#issuecomment-708579401
Great! Post here / a new issue if something does come up!
mankoff (CONTRIBUTOR) | 2020-10-14T16:23:36Z | 708513119
https://github.com/pydata/xarray/issues/2139#issuecomment-708513119
@max-sixty Sorry for posting this here. This memory blow-up was a byproduct of another bug that it took me a few more hours to track down. This other bug is in Pandas, not xarray.
Reactions: hooray 1
max-sixty (MEMBER) | 2020-10-14T16:00:35Z | 708499472
https://github.com/pydata/xarray/issues/2139#issuecomment-708499472
@mankoff Thanks for the issue — do you have a fuller reproduction? I'm happy to take a look at this.
mankoff (CONTRIBUTOR) | 2020-10-14T11:25:03Z | 708339519
https://github.com/pydata/xarray/issues/2139#issuecomment-708339519
Late reply, but if anyone else finds this issue, I was filling memory with: […]
ghost (NONE) | 2018-05-16T18:37:24Z | 389622523
https://github.com/pydata/xarray/issues/2139#issuecomment-389622523
Does that sound like it will play well with GeoViews if I want widgets for the categorical vars?
ghost (NONE) | 2018-05-16T18:36:17Z | 389622155
https://github.com/pydata/xarray/issues/2139#issuecomment-389622155
Ok. Looks like the way forward is a netCDF file for each level of my categorical variables. Will give it a shot.
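The per-level plan in the comment above can be sketched by grouping on the categorical column, converting each small group on its own, and recombining. In this in-memory sketch (column names time, cat1, and var1 are invented) the pieces are concatenated directly; with real data each piece would instead be written with .to_netcdf() and reopened via xr.open_mfdataset():

```python
import numpy as np
import pandas as pd
import xarray as xr

# Toy stand-in for the big frame: one categorical column plus a time grid.
df = pd.DataFrame({
    "time": np.tile(pd.date_range("2020-01-01", periods=3), 2),
    "cat1": np.repeat(["a", "b"], 3),
    "var1": np.arange(6.0),
})

# Convert each category separately, then stack along a new "cat1" dimension.
pieces = [
    g.drop(columns="cat1").set_index("time").to_xarray().expand_dims(cat1=[key])
    for key, g in df.groupby("cat1")
]
ds = xr.concat(pieces, dim="cat1")
print(dict(ds.sizes))  # {'cat1': 2, 'time': 3}
```

Because each group is converted independently, peak memory is bounded by the largest group rather than by the whole table.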
shoyer (MEMBER) | 2018-05-16T18:31:35Z | 389620638
https://github.com/pydata/xarray/issues/2139#issuecomment-389620638
MetaCSV looks interesting but I haven't used it myself. My guess would be that it just wraps pandas/xarray for processing data, so I think it's unlikely to give a performance boost. It's more about a declarative way to specify how to load a CSV into pandas/xarray.
ghost (NONE) | 2018-05-16T18:24:02Z | 389618279
https://github.com/pydata/xarray/issues/2139#issuecomment-389618279
@shoyer Thank you. Does metacsv look likely to work, to you? It has attracted almost no attention, so I wonder if it will exhaust memory. I'm kind of surprised this path (csv -> xarray) isn't better fleshed out, as I would have expected it to be very common, perhaps the most common, especially for "found data."
shoyer (MEMBER) | 2018-05-16T17:20:03Z | 389598338
https://github.com/pydata/xarray/issues/2139#issuecomment-389598338
If you don't want the full Cartesian product, you need to ensure that the index only contains the variables you want to expand into a grid, e.g., time, lat and lon. If the problem is only running out of memory (which is indeed likely with 1e9 rows), then you'll need to think about a more clever way to convert the data. One good option might be to group over subsets of the data (using dask or another parallel processing library like spark or beam), and write a bunch of smaller netCDF files which you then open with xarray's open_mfdataset.
Reactions: +1 1
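The first point above can be made concrete: when only the true grid variables sit in the index, to_xarray() densifies only that grid, and categorical columns ride along as data variables. A minimal sketch, with names invented for illustration:

```python
import numpy as np
import pandas as pd

# Only time/lat/lon go in the index; categorical columns stay as data
# variables, so the dense result is just the time x lat x lon grid.
times = pd.date_range("2020-01-01", periods=4)
lats, lons = [10.0, 20.0], [30.0, 40.0]
grid = pd.MultiIndex.from_product([times, lats, lons],
                                  names=["time", "lat", "lon"])
df = pd.DataFrame(
    {"var1": np.arange(len(grid), dtype=float), "cat1": "a"},
    index=grid,
)

ds = df.to_xarray()  # dims: time=4, lat=2, lon=2; no Cartesian blow-up
print(dict(ds.sizes))
```

Had cat1 been part of the index as well, the result would be expanded across every category level too, which is exactly the blow-up discussed earlier in the thread.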
ghost (NONE) | 2018-05-16T17:13:11Z | 389596244
https://github.com/pydata/xarray/issues/2139#issuecomment-389596244
This looks potentially helpful: http://metacsv.readthedocs.io/en/latest/
ghost (NONE) | 2018-05-16T17:01:37Z | 389592602
https://github.com/pydata/xarray/issues/2139#issuecomment-389592602
PS: I started with Dask but haven't found a way to go from Dask to xarray.
ghost (NONE) | 2018-05-16T17:00:24Z | 389592243
https://github.com/pydata/xarray/issues/2139#issuecomment-389592243
Hi @jhamman. The original data is literally just a flat CSV file, i.e. lat,lon,epoch,cat1,cat2,var1,var2,...,var50, with 1 billion rows. I'm looking to xarray for GeoViews, which I think would benefit from having the data properly grouped/indexed by its categories.
jhamman (MEMBER) | 2018-05-16T16:55:27Z | 389590507
https://github.com/pydata/xarray/issues/2139#issuecomment-389590507
@brianmingus - any chance you can provide a reproducible example with some dummy data?