
issues: 1280507371


  • id: 1280507371
  • node_id: I_kwDOAMm_X85MUv3r
  • number: 6715
  • title: `xr.open_rasterio` fails to locate file after being ran 3 times
  • user: 78166093
  • state: closed
  • locked: 0
  • assignee: (empty)
  • milestone: (empty)
  • comments: 4
  • created_at: 2022-06-22T16:39:18Z
  • updated_at: 2023-02-09T19:38:17Z
  • closed_at: 2022-07-12T12:32:45Z
  • author_association: NONE
  • active_lock_reason, draft, pull_request: (empty)
  • body:

What happened?

In Docker environments only, `xr.open_rasterio` throws the error below. It occurs only when reading .hdf files with a cumulative total of more than 32 layers, and it always fails on the 33rd layer read into memory, regardless of the order of the files or their contents. Note that we use a separate copy of the file for each iteration and it still fails.

    rasterio.errors.RasterioIOError: HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-5/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance: No such file or directory
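
When GDAL reports `No such file or directory` for a subdataset name, one quick check is whether the underlying file is actually present on disk, which separates a genuinely missing file from some other open failure. The sketch below is a hypothetical helper (`split_hdf4_eos_name` and `underlying_file_exists` are not part of rasterio or xarray). It assumes the `HDF4_EOS:EOS_GRID:<path>:<grid>:<field>` layout seen in the message above and a file path that contains no colons.

```python
from pathlib import Path
from typing import NamedTuple


class SubdatasetName(NamedTuple):
    driver: str
    kind: str
    path: Path
    grid: str
    field: str


def split_hdf4_eos_name(name: str) -> SubdatasetName:
    """Split 'HDF4_EOS:EOS_GRID:<path>:<grid>:<field>' into its parts.

    Assumption: the file path itself contains no ':' characters.
    """
    parts = name.split(":", 4)
    if len(parts) != 5 or parts[0] != "HDF4_EOS":
        raise ValueError(f"not an HDF4_EOS subdataset name: {name!r}")
    driver, kind, path, grid, field = parts
    # GDAL may wrap the path in double quotes; strip them if present.
    return SubdatasetName(driver, kind, Path(path.strip('"')), grid, field)


def underlying_file_exists(name: str) -> bool:
    """Return True if the file referenced by the subdataset name is on disk."""
    return split_hdf4_eos_name(name).path.is_file()
```

If `underlying_file_exists` returns `True` for the name in the error message while the open still fails, the message is probably not describing a missing file, and the cause lies elsewhere in the stack.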

What did you expect to happen?

Until quite recently, this exact method worked fine and returned a list of xr.Dataset objects, as confirmed by past CI/CD test runs.

Minimal Complete Verifiable Example

Clone the repository below, which contains a complete example, then follow the commands in the README to spin up a Docker container and run pytest: https://github.com/jamie-sgro/xarray-recreate-bug

MVCE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
rasterio.errors.RasterioIOError: HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance: No such file or directory

self = CachingFileManager(<function open at 0xffffa0f5fd30>, 'HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance', mode='r', kwargs={})
needs_lock = True

def _acquire_with_cache_info(self, needs_lock=True):
    """Acquire a file, returning the file and whether it was cached."""
    with self._optional_lock(needs_lock):
        try:
            file = self._cache[self._key]

/usr/local/lib/python3.9/site-packages/xarray/backends/file_manager.py:199:


self = <xarray.backends.lru_cache.LRUCache object at 0xffffad587e80>
key = [<function open at 0xffffa0f5fd30>, ('HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance',), 'r', ()]

def __getitem__(self, key: K) -> V:
    # record recent use of the key by moving it to the front of the list
    with self._lock:
        value = self._cache[key]

E KeyError: [<function open at 0xffffa0f5fd30>, ('HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance',), 'r', ()]

/usr/local/lib/python3.9/site-packages/xarray/backends/lru_cache.py:53: KeyError

During handling of the above exception, another exception occurred:

???

rasterio/_base.pyx:261:


???

rasterio/_shim.pyx:78:


???
E rasterio._err.CPLE_OpenFailedError: HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance: No such file or directory

rasterio/_err.pyx:216: CPLE_OpenFailedError

During handling of the above exception, another exception occurred:

self = <test_utils.TestUtils object at 0xffff9db81310>
tmp_path = PosixPath('/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0')

def test_can_open_hdf4_closer_to_error_replication(self, tmp_path: Path):
    """A Unique bug discovered June 22nd 2022.
    In Docker environments only, throws the below error. This only occurs
    when trying to append the output of open_hdf4 to another object, and so
    far only fails on the 3rd iteration, regardless of the order of the files
    and the contents of the files themselves. Note we use a copy of a file
    for each iteration and it still fails


    rasterio.errors.RasterioIOError: HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/
    pytest-5/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1
    km 16 days blue reflectance: No such file or directory

    """

    filepaths = [
        tmp_path / "file1",
        tmp_path / "file2",
        tmp_path / "file3",
        tmp_path / "file4",
    ]

    shutil.copyfile(FILEPATH, filepaths[0])
    shutil.copyfile(FILEPATH, filepaths[1])
    shutil.copyfile(FILEPATH, filepaths[2])
    shutil.copyfile(FILEPATH, filepaths[3])

    rtn = []
    for filepath in filepaths:
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", category=NotGeoreferencedWarning)
            with rasterio.open(filepath) as src:
                layer_names = src.subdatasets
        layer = []
        for x in layer_names:
            a = xr.open_rasterio(x)

tests/scripts/utils/test_utils.py:72:


/usr/local/lib/python3.9/site-packages/xarray/backends/rasterio_.py:276: in open_rasterio
    riods = manager.acquire()
/usr/local/lib/python3.9/site-packages/xarray/backends/file_manager.py:181: in acquire
    file, _ = self._acquire_with_cache_info(needs_lock)
/usr/local/lib/python3.9/site-packages/xarray/backends/file_manager.py:205: in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
/usr/local/lib/python3.9/site-packages/rasterio/env.py:437: in wrapper
    return f(*args, **kwds)
/usr/local/lib/python3.9/site-packages/rasterio/__init__.py:220: in open
    s = DatasetReader(path, driver=driver, sharing=sharing, **kwargs)


???
E rasterio.errors.RasterioIOError: HDF4_EOS:EOS_GRID:/tmp/pytest-of-root/pytest-27/test_can_open_hdf4_closer_to_e0/file3:MODIS_Grid_16DAY_1km_VI:1 km 16 days blue reflectance: No such file or directory

rasterio/_base.pyx:263: RasterioIOError
```
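
The first `KeyError` in the traceback is the cache miss itself rather than a separate fault: the file manager looks up its key, misses, and then calls the opener, so any failure inside the opener is reported as happening "during handling" of that `KeyError`. A minimal standard-library sketch of the pattern follows. `TinyFileManager`, `failing_opener`, and `describe_failure` are hypothetical stand-ins for illustration, not xarray's `CachingFileManager`.

```python
class TinyFileManager:
    """Toy cache-then-open manager (illustrative only)."""

    def __init__(self, opener, *args):
        self._opener = opener
        self._args = args
        self._key = (opener, args)
        self._cache = {}

    def acquire(self):
        try:
            return self._cache[self._key]
        except KeyError:
            # Cache miss: open the resource. If the opener raises here,
            # Python chains the new error onto the KeyError being handled.
            handle = self._opener(*self._args)
            self._cache[self._key] = handle
            return handle


def failing_opener(path):
    """Hypothetical opener that always fails, like the open call in the log."""
    raise FileNotFoundError(f"{path}: No such file or directory")


def describe_failure(manager):
    """Return (error type name, implicit context type name) for a failed acquire."""
    try:
        manager.acquire()
    except Exception as err:
        return type(err).__name__, type(err.__context__).__name__
    return None
```

Read this way, the error to investigate is the last one in the chain (the failed open); the `KeyError` only shows that the subdataset had not been opened before.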

Anything else we need to know?

  • This error did not occur in our CI/CD when I ran the exact same code a month ago. I first saw it after a Docker rebuild, which may have updated the environment; the source code itself did not change.
  • This error only occurred in my Docker environment, which has xarray 0.18.2 installed. Locally, everything worked fine.
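
Because the failure appeared after a rebuild without source changes, comparing installed package versions between the working and failing environments may help narrow it down. The sketch below uses only the standard library; `installed_versions` and `diff_versions` are hypothetical helper names, and the package list passed in is an assumption to be adjusted to the project's actual dependencies.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def diff_versions(before, after):
    """Return {name: (before, after)} for packages whose version differs."""
    names = sorted(set(before) | set(after))
    return {
        n: (before.get(n), after.get(n))
        for n in names
        if before.get(n) != after.get(n)
    }
```

Running `installed_versions(["xarray", "rasterio", "numpy"])` in each environment and passing the two results to `diff_versions` lists only the packages that changed. This covers Python distributions only; native libraries bundled in wheels or installed by the system package manager would need a separate check.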

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.2 (default, Feb 28 2021, 17:03:44) [GCC 10.2.1 20210110]
python-bits: 64
OS: Linux
OS-release: 5.10.25-linuxkit
machine: aarch64
processor:
byteorder: little
LC_ALL: en_GB.UTF-8
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 0.18.2
pandas: 1.4.2
numpy: 1.22.4
scipy: 1.8.1
netCDF4: 1.5.8
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.10
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.5.2
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 58.1.0
pip: 22.0.4
conda: None
pytest: 7.1.2
IPython: None
sphinx: 4.5.0

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6715/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}

  • state_reason: completed
  • repo: 13221727
  • type: issue
