id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 1280507371,I_kwDOAMm_X85MUv3r,6715,`xr.open_rasterio` fails to locate file after being ran 3 times,78166093,closed,0,,,4,2022-06-22T16:39:18Z,2023-02-09T19:38:17Z,2022-07-12T12:32:45Z,NONE,,,,"### What happened? In Docker environments only, `xr.open_rasterio` throws the error below. This only occurs when reading .hdf files with a cumulative total of more than 32 layers. It always fails on the 33rd layer read into memory, regardless of the order of the files or their contents. Note that we use a separate copy of the file for each iteration, and it still fails: ```bash rasterio.errors.RasterioIOError: HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/ pytest-5/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance: No such file or directory ``` ### What did you expect to happen? Until quite recently, this exact method worked fine and returned a list of xr.Dataset objects, as confirmed by past CI/CD test runs. ### Minimal Complete Verifiable Example Clone the repo below, which contains a complete example, and follow the commands in the README to spin up a Docker container and run `pytest`: https://github.com/jamie-sgro/xarray-recreate-bug ### MVCE confirmation - [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [x] Complete example — the example is self-contained, including all data and the text of any traceback. - [x] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [x] New issue — a search of GitHub Issues suggests this is not a duplicate. 
### Relevant log output ```Python rasterio.errors.RasterioIOError: HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance: No such file or directory self = CachingFileManager(, 'HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance', mode='r', kwargs={}) needs_lock = True def _acquire_with_cache_info(self, needs_lock=True): """"""Acquire a file, returning the file and whether it was cached."""""" with self._optional_lock(needs_lock): try: > file = self._cache[self._key] /usr/local/lib/python3.9/site-packages/xarray/backends/file_manager.py:199: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = key = [, ('HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance',), 'r', ()] def __getitem__(self, key: K) -> V: # record recent use of the key by moving it to the front of the list with self._lock: > value = self._cache[key] E KeyError: [, ('HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance',), 'r', ()] /usr/local/lib/python3.9/site-packages/xarray/backends/lru_cache.py:53: KeyError During handling of the above exception, another exception occurred: > ??? rasterio/_base.pyx:261: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? rasterio/_shim.pyx:78: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? 
E rasterio._err.CPLE_OpenFailedError: HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance: No such file or directory rasterio/_err.pyx:216: CPLE_OpenFailedError During handling of the above exception, another exception occurred: self = tmp_path = PosixPath('/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0') def test_can_open_hdf4_closer_to_error_replication(self, tmp_path: Path): """"""A Unique bug discovered June 22nd 2022. In Docker environments only, throws the below error. This only occurs when trying to append the output of open_hdf4 to another object, and so far only fails on the 3rd iteration, regardless of the order of the files and the contents of the files themselves. Note we use a copy of a file for each iteration and it still fails rasterio.errors.RasterioIOError: HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/ pytest-5/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance: No such file or directory """""" filepaths = [ tmp_path / ""file1"", tmp_path / ""file2"", tmp_path / ""file3"", tmp_path / ""file4"", ] shutil.copyfile(FILEPATH, filepaths[0]) shutil.copyfile(FILEPATH, filepaths[1]) shutil.copyfile(FILEPATH, filepaths[2]) shutil.copyfile(FILEPATH, filepaths[3]) rtn = [] for filepath in filepaths: with warnings.catch_warnings(): warnings.simplefilter(""ignore"", category=NotGeoreferencedWarning) with rasterio.open(filepath) as src: layer_names = src.subdatasets layer = [] for x in layer_names: > a = xr.open_rasterio(x) tests/scripts/utils/test_utils.py:72: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ /usr/local/lib/python3.9/site-packages/xarray/backends/rasterio_.py:276: in open_rasterio riods = manager.acquire() /usr/local/lib/python3.9/site-packages/xarray/backends/file_manager.py:181: in acquire file, _ = self._acquire_with_cache_info(needs_lock) 
/usr/local/lib/python3.9/site-packages/xarray/backends/file_manager.py:205: in _acquire_with_cache_info file = self._opener(*self._args, **kwargs) /usr/local/lib/python3.9/site-packages/rasterio/env.py:437: in wrapper return f(*args, **kwds) /usr/local/lib/python3.9/site-packages/rasterio/__init__.py:220: in open s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ > ??? E rasterio.errors.RasterioIOError: HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance: No such file or directory rasterio/_base.pyx:263: RasterioIOError ``` ### Anything else we need to know? - This error did not occur in our CI/CD when I ran the exact same code a month ago. I first discovered it after a Docker rebuild that did not change the source code but may have updated the environment. - This error only occurred in my Docker environment, with xarray 0.18.2 installed. Locally everything worked fine. ### Environment
INSTALLED VERSIONS ------------------ commit: None python: 3.9.2 (default, Feb 28 2021, 17:03:44) [GCC 10.2.1 20210110] python-bits: 64 OS: Linux OS-release: 5.10.25-linuxkit machine: aarch64 processor: byteorder: little LC_ALL: en_GB.UTF-8 LANG: en_GB.UTF-8 LOCALE: ('en_GB', 'UTF-8') libhdf5: 1.12.0 libnetcdf: 4.7.4 xarray: 0.18.2 pandas: 1.4.2 numpy: 1.22.4 scipy: 1.8.1 netCDF4: 1.5.8 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.6.0 nc_time_axis: None PseudoNetCDF: None rasterio: 1.2.10 cfgrib: None iris: None bottleneck: None dask: None distributed: None matplotlib: 3.5.2 cartopy: None seaborn: None numbagg: None pint: None setuptools: 58.1.0 pip: 22.0.4 conda: None pytest: 7.1.2 IPython: None sphinx: 4.5.0
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6715/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue