issue_comments
2 rows where author_association = "NONE", issue = 152040420 and user = 6079398 sorted by updated_at descending
| id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 216092859 | https://github.com/pydata/xarray/issues/838#issuecomment-216092859 | https://api.github.com/repos/pydata/xarray/issues/838 | MDEyOklzc3VlQ29tbWVudDIxNjA5Mjg1OQ== | mogismog 6079398 | 2016-05-02T02:07:47Z | 2016-05-02T02:07:47Z | NONE | Redeeming myself (only a little bit) from my previous message here: @akrherz I was messing around with this a bit, and the following seems to work OK. It gets rid of unnecessary dimensions, concatenates string arrays, and turns the dataset into a pandas DataFrame:

```
In [1]: import xarray as xr

In [2]: ds = xr.open_dataset('20160430_1600.nc', decode_cf=True, mask_and_scale=False, decode_times=False)  # xarray has issues decoding the times, so you'll have to do this in pandas

In [3]: vars_to_drop = [k for k in ds.variables.iterkeys() if ('recNum' not in ds[k].dims)]

In [4]: ds = ds.drop(vars_to_drop)

In [5]: df = ds.to_dataframe()

In [6]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 6277 entries, 0 to 6276
Data columns (total 93 columns):
invTime            6277 non-null int32
prevRecord         6277 non-null int32
isOverflow         6277 non-null int32
secondsStage1_2    6277 non-null int32
secondsStage3      6277 non-null int32
providerId         6277 non-null object
stationId          6277 non-null object
handbook5Id        6277 non-null object
~snip~
```

A bit hacky, but it works. |
{
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
MADIS netCDF to Pandas Dataframe: ValueError: iterator is too large 152040420 | |
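The core of the workaround above is the list comprehension in `In [3]`: keep only variables that carry the per-record `recNum` dimension and drop everything else. A minimal sketch of that selection logic, using a hypothetical variable-to-dimensions mapping in place of a real MADIS file (the names below are illustrative, not read from data; note that the comment's `.iterkeys()` is Python 2 — in Python 3 you iterate the mapping directly):

```python
# Hypothetical stand-in for ds.variables: variable name -> tuple of dims.
var_dims = {
    "invTime":   ("recNum",),                 # per-record scalar: keep
    "stationId": ("recNum", "maxStaIdLen"),   # per-record string: keep
    "staticIds": ("maxStaticIds",),           # no 'recNum' dim: drop
}

# Same selection rule as the comment's In [3], minus the Python-2 .iterkeys()
vars_to_drop = [name for name, dims in var_dims.items() if "recNum" not in dims]
```

After this filter, every surviving variable shares the `recNum` dimension, which is what lets `to_dataframe()` produce one row per record.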
| 216090426 | https://github.com/pydata/xarray/issues/838#issuecomment-216090426 | https://api.github.com/repos/pydata/xarray/issues/838 | MDEyOklzc3VlQ29tbWVudDIxNjA5MDQyNg== | mogismog 6079398 | 2016-05-02T01:34:38Z | 2016-05-02T01:34:38Z | NONE | @shoyer: You're right that MADIS netCDF files are (imo) poorly formatted. There is also the issue of … Unfortunately, this does mean I have to do a lot of "manual cleaning" of the netCDF file before exporting it as a DataFrame, but it is easy to write a set of functions to accomplish this. That said, I can't c/p the exact code (for work-related reasons). I'm not sure how helpful this is, but when working with MADIS netCDF data, I more or less do the following as a workaround:
1. Open up the MADIS netCDF file, fix the … Though reading over it, that is kind of a draw-the-owl-esque response. :/ |
{
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
MADIS netCDF to Pandas Dataframe: ValueError: iterator is too large 152040420 |
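One piece of the "manual cleaning" both comments touch on is the time handling: the first comment opens the file with `decode_times=False` and defers decoding to pandas. A minimal sketch of that step, assuming the MADIS time variable holds raw seconds since the Unix epoch (the sample values below are made up to match the `20160430_1600.nc` filename, not taken from real data):

```python
import pandas as pd

# Hypothetical raw time values as read with decode_times=False:
# seconds since 1970-01-01 00:00:00 UTC (assumed MADIS convention).
raw_seconds = pd.Series([1462032000, 1462032300])

# Decode in pandas rather than xarray, as the first comment suggests.
times = pd.to_datetime(raw_seconds, unit="s")
```

The resulting column is `datetime64[ns]`, so it sorts and resamples like any other pandas timestamp index.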
CREATE TABLE [issue_comments] (
[html_url] TEXT,
[issue_url] TEXT,
[id] INTEGER PRIMARY KEY,
[node_id] TEXT,
[user] INTEGER REFERENCES [users]([id]),
[created_at] TEXT,
[updated_at] TEXT,
[author_association] TEXT,
[body] TEXT,
[reactions] TEXT,
[performed_via_github_app] TEXT,
[issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
ON [issue_comments] ([user]);
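The filtered view at the top of this page ("2 rows where author_association = \"NONE\", issue = 152040420 and user = 6079398 sorted by updated_at descending") corresponds to a straightforward query against the schema above. A self-contained sketch using an in-memory SQLite database with sample rows (only the columns relevant to the filter are populated; the `REFERENCES` clauses are omitted for brevity):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE issue_comments (
    html_url TEXT, issue_url TEXT, id INTEGER PRIMARY KEY, node_id TEXT,
    user INTEGER, created_at TEXT, updated_at TEXT, author_association TEXT,
    body TEXT, reactions TEXT, performed_via_github_app TEXT, issue INTEGER
);
CREATE INDEX idx_issue_comments_issue ON issue_comments (issue);
CREATE INDEX idx_issue_comments_user ON issue_comments (user);
""")

# The two comments shown on this page (id, user, updated_at, assoc, issue).
rows = [
    (216092859, 6079398, "2016-05-02T02:07:47Z", "NONE", 152040420),
    (216090426, 6079398, "2016-05-02T01:34:38Z", "NONE", 152040420),
]
conn.executemany(
    "INSERT INTO issue_comments (id, user, updated_at, author_association, issue) "
    "VALUES (?, ?, ?, ?, ?)", rows)

# The page's filter: author_association = 'NONE', issue, user,
# sorted by updated_at descending (ISO-8601 strings sort chronologically).
ids = [r[0] for r in conn.execute(
    "SELECT id FROM issue_comments "
    "WHERE author_association = 'NONE' AND issue = ? AND user = ? "
    "ORDER BY updated_at DESC", (152040420, 6079398))]
```

Both `WHERE` predicates are served by the `idx_issue_comments_issue` and `idx_issue_comments_user` indexes defined in the schema.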