issue_comments
4 rows where author_association = "MEMBER" and issue = 479190812 sorted by updated_at descending
Issue: open_mfdataset memory leak, very simple case. v0.12 · 4 comments
Columns: id, html_url, issue_url, node_id, user, created_at, updated_at, author_association, body, reactions, performed_via_github_app, issue
id: 520182257
html_url: https://github.com/pydata/xarray/issues/3200#issuecomment-520182257
issue_url: https://api.github.com/repos/pydata/xarray/issues/3200
node_id: MDEyOklzc3VlQ29tbWVudDUyMDE4MjI1Nw==
user: shoyer 1217238
created_at: 2019-08-10T21:53:39Z
updated_at: 2019-08-10T21:53:39Z
author_association: MEMBER
body: Also, if you're having memory issues, I would definitely recommend upgrading to a newer version of xarray. There was a recent fix that helps ensure that files get automatically closed when they are garbage collected, even if you don't call `close()` explicitly.
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
performed_via_github_app:
issue: open_mfdataset memory leak, very simple case. v0.12 (479190812)
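For context, the deterministic alternative to relying on garbage collection is to scope the open explicitly. A minimal sketch using `open_dataset` as a context manager; the file name is a placeholder:

```python
import xarray as xr

# The file handle is released when the `with` block exits, without waiting
# for the dataset object to be garbage collected.
with xr.open_dataset("testfile_0.nc") as ds:  # placeholder file name
    mean_value = float(ds.to_array().mean())

print(mean_value)
```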
id: 520182139
html_url: https://github.com/pydata/xarray/issues/3200#issuecomment-520182139
issue_url: https://api.github.com/repos/pydata/xarray/issues/3200
node_id: MDEyOklzc3VlQ29tbWVudDUyMDE4MjEzOQ==
user: shoyer 1217238
created_at: 2019-08-10T21:51:25Z
updated_at: 2019-08-10T21:52:24Z
author_association: MEMBER
body:
Thanks for the profiling script. I ran a few permutations of this. Here are some plots: [memory-usage plots omitted]

So in conclusion, it looks like there are memory leaks:

1. when using netCDF4-Python (I was also able to confirm these without using xarray at all, just using netCDF4 directly)
2. when using xarray's `open_mfdataset`

(1) looks like by far the bigger issue, which you can work around by switching to scipy or h5netcdf to read your files.

(2) is an issue for xarray. We do do some caching, specifically with our backend file manager, but the issues only seem to appear when using `open_mfdataset`.

Note: I modified your script to set xarray's file cache size to 1, which helps smooth out the memory usage:

```python
import glob

import numpy as np
import xarray as xr
from memory_profiler import profile

# the modification mentioned above: keep at most one file open in xarray's cache
xr.set_options(file_cache_maxsize=1)


def CreateTestFiles():
    # create a bunch of files
    xlen = int(1e2)
    ylen = int(1e2)
    xdim = np.arange(xlen)
    ydim = np.arange(ylen)
    # ... (rest of the file-writing code omitted)


@profile
def ReadFiles():
    # for i in range(100):
    #     ds = xr.open_dataset('testfile_{}.nc'.format(i), engine='netcdf4')
    #     ds.close()
    ds = xr.open_mfdataset(glob.glob('testfile_*'), engine='h5netcdf', concat_dim='time')
    ds.close()


if __name__ == '__main__':
    # write out files for testing
    CreateTestFiles()
    ReadFiles()
```
reactions: { "total_count": 2, "+1": 1, "-1": 0, "laugh": 0, "hooray": 1, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
performed_via_github_app:
issue: open_mfdataset memory leak, very simple case. v0.12 (479190812)
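A minimal sketch of the workaround in (1) combined with the small file cache. It assumes h5netcdf is installed and reuses the hypothetical `testfile_*` names from the script above; on recent xarray versions, `concat_dim` also requires `combine="nested"`:

```python
import glob

import xarray as xr

# Keep xarray's internal file cache tiny so handles are recycled promptly.
xr.set_options(file_cache_maxsize=1)

# Read through the h5netcdf backend instead of netCDF4-python.
paths = sorted(glob.glob("testfile_*.nc"))  # hypothetical test files
ds = xr.open_mfdataset(paths, engine="h5netcdf", combine="nested", concat_dim="time")
print(ds.sizes)
ds.close()
```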
id: 520136799
html_url: https://github.com/pydata/xarray/issues/3200#issuecomment-520136799
issue_url: https://api.github.com/repos/pydata/xarray/issues/3200
node_id: MDEyOklzc3VlQ29tbWVudDUyMDEzNjc5OQ==
user: crusaderky 6213168
created_at: 2019-08-10T10:10:11Z
updated_at: 2019-08-10T10:11:18Z
author_association: MEMBER
body: Oh, but first and foremost: CPython memory management is designed so that, when PyMem_Free() is invoked, CPython will hold on to the memory and not invoke the underlying free(), hoping to reuse it on the next PyMem_Alloc(). An increase in RAM usage from 160 to 200MB could very well be explained by this. Try increasing the number of loops in your test 100-fold and see if you get a 100-fold increase in memory usage too (from 160MB to 1.2GB). If yes, it's a real leak; if it remains much more contained, it's normal CPython behaviour.
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
performed_via_github_app:
issue: open_mfdataset memory leak, very simple case. v0.12 (479190812)
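A minimal sketch of that scaling check, assuming psutil is installed; `read_once()` is a hypothetical stand-in for one iteration of the reporter's read loop:

```python
import gc
import os

import psutil


def rss_mb():
    # Resident set size of the current process, in MB.
    return psutil.Process(os.getpid()).memory_info().rss / 1e6


def growth_after(n_loops, work):
    # Run `work` n_loops times and report how much RSS grew.
    gc.collect()
    before = rss_mb()
    for _ in range(n_loops):
        work()
    gc.collect()
    return rss_mb() - before


def read_once():
    # Hypothetical stand-in: open and close the test files once.
    pass


# A real leak scales roughly with the iteration count; memory merely retained
# by CPython's allocator for reuse stays roughly flat.
print("growth after 100 loops:   %.1f MB" % growth_after(100, read_once))
print("growth after 10000 loops: %.1f MB" % growth_after(10000, read_once))
```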
id: 520136482
html_url: https://github.com/pydata/xarray/issues/3200#issuecomment-520136482
issue_url: https://api.github.com/repos/pydata/xarray/issues/3200
node_id: MDEyOklzc3VlQ29tbWVudDUyMDEzNjQ4Mg==
user: crusaderky 6213168
created_at: 2019-08-10T10:06:07Z
updated_at: 2019-08-10T10:06:07Z
author_association: MEMBER
body: Hi, xarray doesn't have any global objects that I know of that can cause the leak - I'm willing to bet on the underlying libraries.
reactions: { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
performed_via_github_app:
issue: open_mfdataset memory leak, very simple case. v0.12 (479190812)
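One way to test that bet is to exercise the underlying reader without xarray in the loop at all. A minimal sketch using netCDF4-python and memory_profiler, with hypothetical test-file names:

```python
import glob

import netCDF4
from memory_profiler import profile


@profile
def open_close_netcdf4_only():
    # Open and close the files with netCDF4-python alone; if memory still
    # grows here, the leak sits below xarray.
    for path in sorted(glob.glob("testfile_*.nc")):  # hypothetical test files
        nc = netCDF4.Dataset(path, mode="r")
        list(nc.variables)  # touch the metadata
        nc.close()


if __name__ == "__main__":
    open_close_netcdf4_only()
```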
CREATE TABLE [issue_comments] (
    [html_url] TEXT,
    [issue_url] TEXT,
    [id] INTEGER PRIMARY KEY,
    [node_id] TEXT,
    [user] INTEGER REFERENCES [users]([id]),
    [created_at] TEXT,
    [updated_at] TEXT,
    [author_association] TEXT,
    [body] TEXT,
    [reactions] TEXT,
    [performed_via_github_app] TEXT,
    [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
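The filter behind this page (author_association = "MEMBER" and issue = 479190812, sorted by updated_at descending) can be reproduced against a SQLite copy of the data. A minimal sketch; the database file name `xarray-issues.db` is an assumption:

```python
import sqlite3

# "xarray-issues.db" is a placeholder name for a SQLite file that contains
# the issue_comments table defined above.
conn = sqlite3.connect("xarray-issues.db")
conn.row_factory = sqlite3.Row

rows = conn.execute(
    """
    SELECT id, user, created_at, updated_at, body
    FROM issue_comments
    WHERE author_association = 'MEMBER' AND issue = ?
    ORDER BY updated_at DESC
    """,
    (479190812,),
).fetchall()

for row in rows:
    print(row["id"], row["updated_at"], row["body"][:60])

conn.close()
```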