
issues: 479190812


  • id: 479190812
  • node_id: MDU6SXNzdWU0NzkxOTA4MTI=
  • number: 3200
  • title: open_mfdataset memory leak, very simple case. v0.12
  • user: 19933988
  • state: open
  • locked: 0
  • comments: 7
  • created_at: 2019-08-09T22:38:39Z
  • updated_at: 2023-02-03T22:58:32Z
  • author_association: NONE
  • reactions: 0
  • repo: 13221727
  • type: issue

MCVE Code Sample

```python
import glob
import os
import numpy as np
import xarray as xr
from memory_profiler import profile

def CreateTestFiles():
    # create a bunch of files
    xlen = int(1e2)
    ylen = int(1e2)
    xdim = np.arange(xlen)
    ydim = np.arange(ylen)
    os.makedirs('testfiles', exist_ok=True)  # make sure the output directory exists
    nfiles = 100
    for i in range(nfiles):
        data = np.random.rand(xlen, ylen, 1)
        datafile = xr.DataArray(data, coords=[xdim, ydim, [i]], dims=['x', 'y', 'time'])
        datafile.to_netcdf('testfiles/datafile_{}.nc'.format(i))

@profile
def ReadFiles():
    xr.open_mfdataset(glob.glob('testfiles/*'), concat_dim='time')

if __name__ == '__main__':
    # write out files for testing
    CreateTestFiles()
    # loop thru file read step
    for i in range(100):
        ReadFiles()
```

usage:

```
mprof run simplest_case.py
mprof plot
```

(mprof is a Python memory profiling library)
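
As a cross-check that does not rely on mprof, the resident set size can also be sampled directly with psutil around each call. This is only an illustrative sketch (psutil and the loop below are not part of the original report) and assumes the same `testfiles/` directory produced by the MCVE:

```python
# Illustrative cross-check (not from the original report): sample RSS with
# psutil around each open_mfdataset call; steadily growing RSS suggests a leak.
import glob
import os

import psutil
import xarray as xr

proc = psutil.Process(os.getpid())

for i in range(100):
    ds = xr.open_mfdataset(glob.glob('testfiles/*'), concat_dim='time')
    ds.close()
    print('iteration {}: RSS = {:.1f} MB'.format(i, proc.memory_info().rss / 1e6))
```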

Problem Description

dask version 1.1.4
xarray version 0.12
python 3.7.3

There appears to be a persistent memory leak in open_mfdataset. I'm creating a model calibration script that runs for ~1000 iterations, opening and closing the same set of files (dimensions are the same, but the data is different) with each iteration. I eventually run out of memory because of the leak. This simple case captures the same behavior. Closing the files with .close() does not fix the problem.

Is there a workaround for this? I've perused some of the existing issues but cannot tell whether this has been resolved.
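
One possible mitigation, offered here only as a sketch and not confirmed to fix the leak described above: open the multi-file dataset as a context manager so every underlying file is closed on exit, and shrink xarray's global file-handle cache with `xr.set_options(file_cache_maxsize=...)`. Whether either step actually releases the leaked memory is an assumption.

```python
# Mitigation sketch (assumption: the growth is tied to cached file handles).
import gc
import glob

import xarray as xr

xr.set_options(file_cache_maxsize=1)  # shrink the LRU cache of open file handles

for _ in range(100):
    # the context manager closes every file in the multi-file dataset on exit
    with xr.open_mfdataset(glob.glob('testfiles/*'), concat_dim='time') as ds:
        pass  # per-iteration work would go here
    gc.collect()  # prompt Python to collect reference cycles holding netCDF objects
```

If neither step helps, running each iteration in a separate process (for example via multiprocessing) guarantees that memory is returned to the operating system when the worker exits.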

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-693.17.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.0
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: 1.5.5
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.1.4
distributed: 1.26.0
matplotlib: 3.0.2
cartopy: 0.17.0
seaborn: None
setuptools: 41.0.1
pip: 19.1.1
conda: None
pytest: None
IPython: 7.3.0
sphinx: None
```

Links from other tables

  • 2 rows from issues_id in issues_labels
  • 7 rows from issue in issue_comments