issue_comments
3 rows where author_association = "NONE" and user = 8419421 sorted by updated_at descending
id: 1116344892
html_url: https://github.com/pydata/xarray/issues/6561#issuecomment-1116344892
issue_url: https://api.github.com/repos/pydata/xarray/issues/6561
node_id: IC_kwDOAMm_X85CihI8
user: sgdecker (8419421)
created_at: 2022-05-03T17:13:02Z
updated_at: 2022-05-03T17:13:02Z
author_association: NONE
issue: Excessive memory consumption by to_dataframe() (1223031600)
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
body:

Thanks for the feedback and explanation. It seems the poorly constructed netCDF file is fundamentally to blame for triggering this behavior. A warning is a good idea, though.
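The behavior this issue describes is that `Dataset.to_dataframe()` broadcasts every variable to the cartesian product of all dimensions, so a file whose variables share few dimensions expands enormously when converted. A minimal sketch of the effect (the dataset here is invented for illustration):

```
import numpy as np
import xarray as xr

# Two variables that share no dimensions, mimicking a "poorly
# constructed" netCDF file: 2000 values on disk in total.
ds = xr.Dataset(
    {
        "a": (("x",), np.zeros(1000)),
        "b": (("y",), np.zeros(1000)),
    }
)

# to_dataframe() indexes the result by the cartesian product of all
# dimensions, so each column is broadcast to 1000 * 1000 rows.
df = ds.to_dataframe()
print(df.shape)  # (1000000, 2)
```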
id: 1065536538
html_url: https://github.com/pydata/xarray/issues/4043#issuecomment-1065536538
issue_url: https://api.github.com/repos/pydata/xarray/issues/4043
node_id: IC_kwDOAMm_X84_gswa
user: sgdecker (8419421)
created_at: 2022-03-11T21:16:59Z
updated_at: 2022-03-11T21:16:59Z
author_association: NONE
issue: Opendap access failure error (614144170)
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
body:

I believe I am experiencing a similar issue, although with code that I thought was smart enough to chunk the data request into smaller pieces:

```
import numpy as np
import xarray as xr
from dask.diagnostics import ProgressBar
import intake

wrf_url = ('https://rda.ucar.edu/thredds/catalog/files/g/ds612.0/'
           'PGW3D/2006/catalog.xml')

catalog_u = intake.open_thredds_merged(wrf_url, path=['_U_2006060'])
catalog_v = intake.open_thredds_merged(wrf_url, path=['_V_2006060'])
ds_u = catalog_u.to_dask()
ds_u['U'] = ds_u.U.chunk("auto")
ds_v = catalog_v.to_dask()
ds_v['V'] = ds_v.V.chunk("auto")
ds = xr.merge((ds_u, ds_v))

def unstagger(ds, var, coord, new_coord):
    # Average adjacent values along the staggered dimension to put the
    # variable on the unstaggered grid.
    var1 = ds[var].isel({coord: slice(None, -1)})
    var2 = ds[var].isel({coord: slice(1, None)})
    return ((var1 + var2) / 2).rename({coord: new_coord})

with ProgressBar():
    ds['U_unstaggered'] = unstagger(ds, 'U', 'west_east_stag', 'west_east')
    ds['V_unstaggered'] = unstagger(ds, 'V', 'south_north_stag', 'south_north')
    ds['speed'] = np.hypot(ds.U_unstaggered, ds.V_unstaggered)
    ds.speed.isel(bottom_top=10).sel(Time='2006-06-07T18:00').plot()
```

This throws an error because, according to the RDA help folks, a request for an entire variable is made, which far exceeds their server's 500 MB request limit:

Here's the error:
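For context on the failure above: each dask chunk read over OPeNDAP becomes a server request, and rechunking after the dataset is opened does not guarantee the backend reads arrive in chunk-sized pieces, which is consistent with the server seeing one request for the whole variable. A sketch of one common workaround, passing explicit chunks at open time; the direct URL, variable name, and chunk sizes below are assumptions, not the dataset's documented access pattern:

```
import xarray as xr

# Hypothetical direct OPeNDAP URL for one ds612.0 file; the real paths
# are resolved through the THREDDS catalog in the code above.
url = ('https://rda.ucar.edu/thredds/dodsC/files/g/ds612.0/'
       'PGW3D/2006/wrf3d_d01_PGW_U_20060601.nc')

# Chunk sizes are assumptions: choose them so one chunk
# (elements * itemsize) stays well under the 500 MB server cap.
ds = xr.open_dataset(url, chunks={'Time': 1, 'bottom_top': 10})

# Each dask task now issues one bounded OPeNDAP request instead of a
# single request for the entire variable.
u_slice = ds['U'].isel(Time=0, bottom_top=10).compute()
```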
id: 613068333
html_url: https://github.com/pydata/xarray/issues/2534#issuecomment-613068333
issue_url: https://api.github.com/repos/pydata/xarray/issues/2534
node_id: MDEyOklzc3VlQ29tbWVudDYxMzA2ODMzMw==
user: sgdecker (8419421)
created_at: 2020-04-13T19:57:08Z
updated_at: 2020-04-13T19:57:08Z
author_association: NONE
issue: to_dataframe() excessive memory usage (376370028)
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
body:

Here is another example:

Output is:

If I'm doing the math right, xarray is trying to allocate roughly 35 GB even though this NetCDF file is only on the order of 50 MB in size.

Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.1 | packaged by conda-forge | (default, Jan 5 2020, 20:58:18) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.31-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.utf8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.3

xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.1.2
cartopy: 0.17.0
seaborn: None
numbagg: None
setuptools: 45.1.0.post20200119
pip: 19.3.1
conda: None
pytest: 5.3.4
IPython: 7.11.1
sphinx: None
```
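For reference, the allocation `to_dataframe()` attempts can be estimated as n_variables × prod(dim_sizes) × itemsize, since each data variable becomes one fully broadcast column. A sketch of that arithmetic; the dimension sizes below are invented to land near 35 GB and are not the shapes of the actual file in this comment:

```
import math
import numpy as np
import xarray as xr

def to_dataframe_bytes(ds: xr.Dataset) -> int:
    # to_dataframe() broadcasts every data variable to the cartesian
    # product of all dimensions, one column per variable.
    n_rows = math.prod(ds.sizes.values())
    return sum(n_rows * v.dtype.itemsize for v in ds.data_vars.values())

# Hypothetical example: three float64 variables on disjoint dimensions
# hold only ~30 KB on disk, yet broadcast to 2000 * 730 * 1000 rows.
ds = xr.Dataset(
    {
        "t": (("station",), np.zeros(2000)),
        "p": (("time",), np.zeros(730)),
        "q": (("level",), np.zeros(1000)),
    }
)
print(to_dataframe_bytes(ds) / 1e9)  # ~35.0 GB
```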
```
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```
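Given this schema, the filtered view at the top of the page ("3 rows where author_association = "NONE" and user = 8419421 sorted by updated_at descending") corresponds to a single query. A sketch using Python's sqlite3 module; the database file name is an assumption:

```
import sqlite3

# Hypothetical file name: use whichever SQLite file holds this schema.
conn = sqlite3.connect("github.db")

rows = conn.execute(
    """
    SELECT id, updated_at, body
    FROM issue_comments
    WHERE author_association = 'NONE' AND user = 8419421
    ORDER BY updated_at DESC
    """
).fetchall()

for comment_id, updated_at, body in rows:
    print(comment_id, updated_at, body[:60])
```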