id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
479190812,MDU6SXNzdWU0NzkxOTA4MTI=,3200,"open_mfdataset memory leak, very simple case. v0.12",19933988,open,0,,,7,2019-08-09T22:38:39Z,2023-02-03T22:58:32Z,,NONE,,,,"#### MCVE Code Sample

```python
import glob
import os

import numpy as np
import xarray as xr
from memory_profiler import profile


def CreateTestFiles():
    # create a bunch of files
    os.makedirs('testfiles', exist_ok=True)
    xlen = int(1e2)
    ylen = int(1e2)
    xdim = np.arange(xlen)
    ydim = np.arange(ylen)

    nfiles = 100
    for i in range(nfiles):
        data = np.random.rand(xlen, ylen, 1)
        # the length-1 time coordinate is passed as a sequence, not a scalar
        datafile = xr.DataArray(data, coords=[xdim, ydim, [i]], dims=['x', 'y', 'time'])
        datafile.to_netcdf('testfiles/datafile_{}.nc'.format(i))


@profile
def ReadFiles():
    xr.open_mfdataset(glob.glob('testfiles/*'), concat_dim='time')


if __name__ == '__main__':
    # write out files for testing
    CreateTestFiles()
    # loop through the file read step
    for i in range(100):
        ReadFiles()
```

#### usage:

mprof run simplest_case.py
mprof plot

(mprof is the command-line tool of memory_profiler, a Python memory profiling library)

#### Problem Description

dask version 1.1.4
xarray version 0.12
python 3.7.3

There appears to be a persistent memory leak in open_mfdataset. I'm creating a model calibration script that runs for ~1000 iterations, opening and closing the same set of files (dimensions are the same, but the data is different) with each iteration. I eventually run out of memory because of the leak. This simple case captures the same behavior. Closing the files with .close() does not fix the problem.

Is there a workaround for this? I've perused some of the issues but cannot tell if this has been resolved.

![Figure_1](https://user-images.githubusercontent.com/19933988/62812349-8b008680-bac2-11e9-9896-f089f7ba2deb.png)

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-693.17.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.0
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: 1.5.5
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.1.4
distributed: 1.26.0
matplotlib: 3.0.2
cartopy: 0.17.0
seaborn: None
setuptools: 41.0.1
pip: 19.1.1
conda: None
pytest: None
IPython: 7.3.0
sphinx: None
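
#### Possible diagnostic (sketch)

One way to narrow this down might be to check whether the growth is in Python objects or in native memory held by the underlying C libraries. The sketch below is only an illustration: measure_growth and workload are hypothetical names, not part of xarray or memory_profiler. It samples the standard-library tracemalloc counter after each call. tracemalloc only sees allocations made through Python's allocators, so flat numbers here combined with rising memory in the mprof plot would suggest the growth is outside the Python heap.

```python
import gc
import tracemalloc


def measure_growth(func, iterations=5):
    # Return the traced Python heap size (in bytes) after each call to func.
    tracemalloc.start()
    sizes = []
    try:
        for _ in range(iterations):
            func()
            gc.collect()  # collect unreachable cycles before sampling
            current, _peak = tracemalloc.get_traced_memory()
            sizes.append(current)
    finally:
        tracemalloc.stop()
    return sizes


if __name__ == '__main__':
    # Stand-in workload (assumption); swap in the file-reading function under test.
    def workload():
        return [0] * 1000

    print(measure_growth(workload))
```

A steadily increasing list would point at Python-level objects being retained between iterations; the absolute values depend on the interpreter and platform.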
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3200/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue