html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3200#issuecomment-530800751,https://api.github.com/repos/pydata/xarray/issues/3200,530800751,MDEyOklzc3VlQ29tbWVudDUzMDgwMDc1MQ==,1262767,2019-09-12T12:24:12Z,2019-09-12T12:36:02Z,NONE,"I have observed a similar memory leak (config: see below). It occurs with both `engine='netcdf4'` and `engine='h5netcdf'`.

Example for loading a 1.2 GB netCDF file: the large allocation (2.6 GB) is only released by a `del ds` on the object; `ds.close()` has no effect. A ""minor"" leak (~4 MB) remains for every `open_dataset` call. See the output using the `memory_profiler` package:

```python
Line #    Mem usage    Increment   Line Contents
================================================
    31    168.9 MiB    168.9 MiB   @profile
    32                             def load_and_unload_ds():
    33    173.0 MiB      4.2 MiB       ds = xr.open_dataset(LFS_DATA_DIR + '/dist2coast_1deg_merged.nc')
    34   2645.4 MiB   2472.4 MiB       ds.load()
    35   2645.4 MiB      0.0 MiB       ds.close()
    36    173.5 MiB      0.0 MiB       del ds
```

- using `open_dataset(file, engine='h5netcdf')` makes **no** difference for the large leak; the minor leak is even larger (~9 MB).
- the memory leak persists if an additional `chunks` parameter is passed to `open_dataset`.

**Output of `xr.show_versions()`**
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.7 | packaged by conda-forge | (default, Jul 2 2019, 02:18:42) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-62-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.6.2

xarray: 0.12.3
pandas: 0.25.1
numpy: 1.16.4
scipy: 1.2.1
netCDF4: 1.5.1.2
pydap: None
h5netcdf: 0.7.4
h5py: 2.9.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.3.0
distributed: 2.3.2
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: None
numbagg: None
setuptools: 41.0.1
pip: 19.2.3
conda: None
pytest: 5.0.1
IPython: 7.7.0
sphinx: None
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,479190812
https://github.com/pydata/xarray/issues/3200#issuecomment-520571376,https://api.github.com/repos/pydata/xarray/issues/3200,520571376,MDEyOklzc3VlQ29tbWVudDUyMDU3MTM3Ng==,19933988,2019-08-12T19:56:09Z,2019-08-12T19:56:09Z,NONE,"Awesome, thanks @shoyer and @crusaderky for looking into this. I've tested it with the h5netcdf engine and the leak is mostly mitigated... for the simple case at least. Unfortunately, the actual model files I'm working with do not appear to be compatible with h5py (I believe this is related to https://github.com/h5py/h5py/issues/719). But that's another problem entirely!

@crusaderky, I will hopefully get to trying your suggestions 3) and 4). As for your last point, I haven't tested explicitly, but yes, I believe it does continue to grow linearly with more iterations.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,479190812
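The first comment measures the leak line-by-line with `memory_profiler`'s `@profile` decorator. A lighter-weight cross-check for the "does repeated use retain memory?" question, using only the stdlib `tracemalloc` module, can be sketched as follows. This is a generic illustration, not code from the thread: `leaky` and `clean` are hypothetical stand-ins for a call like `open_dataset(...)` that does or does not retain memory.

```python
import tracemalloc

def measure_leak(fn, iterations=5):
    """Return net bytes still allocated after calling fn() repeatedly."""
    tracemalloc.start()
    fn()  # warm-up call: excludes one-time allocations (caches, lazy imports)
    before, _ = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        fn()
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before

leaked = []

def leaky():
    # retains ~10 kB per call, like the "minor" per-open_dataset leak above
    leaked.append(bytearray(10_000))

def clean():
    # allocation is freed when the function returns
    bytearray(10_000)

print("leaky retains:", measure_leak(leaky), "bytes")  # roughly 50 kB
print("clean retains:", measure_leak(clean), "bytes")  # roughly 0
```

A near-zero result for the real call would indicate the memory is merely cached and reclaimed, while linear growth per iteration (as reported in the thread) points to a genuine leak.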