issues: 1806386948
id | node_id | number | title | user | state | locked | comments | created_at | updated_at | closed_at | author_association | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1806386948 | I_kwDOAMm_X85rq0cE | 7990 | Random crashes in netcdf when dask client has multiple threads | 40218891 | closed | 0 | 1 | 2023-07-16T01:00:55Z | 2023-08-23T00:18:18Z | 2023-08-23T00:18:17Z | NONE | completed | 13221727 | issue |

### What happened?

The data files can be found here: https://noaadata.apps.nsidc.org/NOAA/G02202_V4/north/monthly/. The example code below crashes randomly: the file being processed when the crash occurs differs between runs. This happens only when the dask client has multiple threads per worker.

### What did you expect to happen?

No response

### Minimal Complete Verifiable Example

```Python
from pathlib import Path

import pandas as pd
from dask.distributed import Client

import xarray as xr

client = Client(n_workers=1, threads_per_worker=4)

DATADIR = Path("/mnt/sdc1/icec/NSIDC")

year = 2020
times = pd.date_range(f"{year}-01-01", f"{year}-12-01", freq="MS", name="time")
paths = [
    DATADIR / "monthly" / f"seaice_conc_monthly_nh_{t.strftime('%Y%m')}_f17_v04r00.nc"
    for t in times
]
for n in range(10):
    ds = xr.open_mfdataset(
        paths,
        combine="nested",
        concat_dim="tdim",
        parallel=True,
        engine="netcdf4",
    )
    del ds


HDF5-DIAG: Error detected in HDF5 (1.14.0) thread 0:
  #000: H5G.c line 442 in H5Gopen2(): unable to synchronously open group
    major: Symbol table
    minor: Unable to create file
  #001: H5G.c line 399 in H5G__open_api_common(): can't set object access arguments
    major: Symbol table
    minor: Can't set value
  #002: H5VLint.c line 2669 in H5VL_setup_acc_args(): invalid location identifier
    major: Invalid arguments to routine
    minor: Inappropriate type
  #003: H5VLint.c line 1787 in H5VL_vol_object(): invalid identifier type to function
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.14.0) thread 0:
  #000: H5G.c line 887 in H5Gclose(): not a group ID
    major: Invalid arguments to routine
    minor: Inappropriate type

2023-07-16 00:35:47,833 - distributed.worker - WARNING - Compute Failed
Key:       open_dataset-09a155bb-5079-406a-83c4-737933c409c7
Function:  execute_task
args:      ((<function apply at 0x7f0001edf520>, <function open_dataset at 0x7effe3e35c60>, ['/mnt/sdc1/icec/NSIDC/monthly/seaice_conc_monthly_nh_202001_f17_v04r00.nc'], (<class 'dict'>, [['engine', 'netcdf4'], ['chunks', (<class 'dict'>, [])]])))
kwargs:    {}
Exception: "OSError(-101, 'NetCDF: HDF error')"

2023-07-16 00:35:47,834 - distributed.worker - WARNING - Compute Failed
Key:       open_dataset-14e239f4-7e16-4891-a350-b55979d4a754
Function:  execute_task
args:      ((<function apply at 0x7f0001edf520>, <function open_dataset at 0x7effe3e35c60>, ['/mnt/sdc1/icec/NSIDC/monthly/seaice_conc_monthly_nh_202011_f17_v04r00.nc'], (<class 'dict'>, [['engine', 'netcdf4'], ['chunks', (<class 'dict'>, [])]])))
kwargs:    {}
Exception: "OSError(-101, 'NetCDF: HDF error')"

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[1], line 19
     14 paths = [
     15     DATADIR / "monthly" / f"seaice_conc_monthly_nh_{t.strftime('%Y%m')}_f17_v04r00.nc"
     16     for t in times
     17 ]
     18 for n in range(10):
---> 19     ds = xr.open_mfdataset(
     20         paths,
     21         combine="nested",
     22         concat_dim="tdim",
     23         parallel=True,
     24         engine="netcdf4",
     25     )
     26     del ds

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/api.py:1050, in open_mfdataset(paths, chunks, concat_dim, compat, preprocess, engine, data_vars, coords, combine, parallel, join, attrs_file, combine_attrs, **kwargs)
   1045     datasets = [preprocess(ds) for ds in datasets]
   1047 if parallel:
   1048     # calling compute here will return the datasets/file_objs lists,
   1049     # the underlying datasets will still be stored as dask arrays
-> 1050     datasets, closers = dask.compute(datasets, closers)
   1052 # Combine all datasets, closing them in case of a ValueError
   1053 try:

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/api.py:570, in open_dataset()
    558 decoders = _resolve_decoders_kwargs(
    559     decode_cf,
    560     open_backend_dataset_parameters=backend.open_dataset_parameters,
   (...)
    566     decode_coords=decode_coords,
    567 )
    569 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 570 backend_ds = backend.open_dataset(
    571     filename_or_obj,
    572     drop_variables=drop_variables,
    573     **decoders,
    574     **kwargs,
    575 )
    576 ds = _dataset_from_backend_dataset(
    577     backend_ds,
    578     filename_or_obj,
   (...)
    588     **kwargs,
    589 )
    590 return ds

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:590, in open_dataset()
    569 def open_dataset(  # type: ignore[override]  # allow LSP violation, not supporting **kwargs
    570     self,
    571     filename_or_obj: str | os.PathLike[Any] | BufferedIOBase | AbstractDataStore,
   (...)
    587     autoclose=False,
    588 ) -> Dataset:
    589     filename_or_obj = _normalize_path(filename_or_obj)
--> 590     store = NetCDF4DataStore.open(
    591         filename_or_obj,
    592         mode=mode,
    593         format=format,
    594         group=group,
    595         clobber=clobber,
    596         diskless=diskless,
    597         persist=persist,
    598         lock=lock,
    599         autoclose=autoclose,
    600     )
    602     store_entrypoint = StoreBackendEntrypoint()
    603     with close_on_error(store):

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:391, in open()
    385 kwargs = dict(
    386     clobber=clobber, diskless=diskless, persist=persist, format=format
    387 )
    388 manager = CachingFileManager(
    389     netCDF4.Dataset, filename, mode=mode, kwargs=kwargs
    390 )
--> 391 return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:338, in __init__()
    336 self._group = group
    337 self._mode = mode
--> 338 self.format = self.ds.data_model
    339 self._filename = self.ds.filepath()
    340 self.is_remote = is_remote_uri(self._filename)

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:400, in ds()
    398 @property
    399 def ds(self):
--> 400     return self._acquire()

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:394, in _acquire()
    393 def _acquire(self, needs_lock=True):
--> 394     with self._manager.acquire_context(needs_lock) as root:
    395         ds = _nc4_require_group(root, self._group, self._mode)
    396     return ds

File ~/mambaforge/envs/icec/lib/python3.10/contextlib.py:135, in __enter__()
    133 del self.args, self.kwds, self.func
    134 try:
--> 135     return next(self.gen)
    136 except StopIteration:
    137     raise RuntimeError("generator didn't yield") from None

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/file_manager.py:199, in acquire_context()
    196 @contextlib.contextmanager
    197 def acquire_context(self, needs_lock=True):
    198     """Context manager for acquiring a file."""
--> 199     file, cached = self._acquire_with_cache_info(needs_lock)
    200     try:
    201         yield file

File ~/mambaforge/envs/icec/lib/python3.10/site-packages/xarray/backends/file_manager.py:217, in _acquire_with_cache_info()
    215     kwargs = kwargs.copy()
    216     kwargs["mode"] = self._mode
--> 217 file = self._opener(*self._args, **kwargs)
    218 if self._mode == "w":
    219     # ensure file doesn't get overridden when opened again
    220     self._mode = "a"

File src/netCDF4/_netCDF4.pyx:2464, in netCDF4._netCDF4.Dataset.__init__()

File src/netCDF4/_netCDF4.pyx:2027, in netCDF4._netCDF4._ensure_nc_success()

OSError: [Errno -101] NetCDF: HDF error: '/mnt/sdc1/icec/NSIDC/monthly/seaice_conc_monthly_nh_202011_f17_v04r00.nc'
```
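For comparison, here is a minimal sketch of the same read with the parallelism moved to separate worker processes, each using a single thread. The report above suggests the crash appears only when a worker has multiple threads, so this is one configuration that should avoid it; the paths and file names are the same local copies assumed in the MVCE.

```Python
# Sketch of a possible workaround, assuming the crash is tied to concurrent
# netCDF/HDF5 access from multiple threads within one worker process:
# keep parallel=True, but use several single-threaded worker processes.
from pathlib import Path

import pandas as pd
from dask.distributed import Client

import xarray as xr

client = Client(n_workers=4, threads_per_worker=1)  # processes, not threads

DATADIR = Path("/mnt/sdc1/icec/NSIDC")  # same local data directory as in the MVCE

year = 2020
times = pd.date_range(f"{year}-01-01", f"{year}-12-01", freq="MS", name="time")
paths = [
    DATADIR / "monthly" / f"seaice_conc_monthly_nh_{t.strftime('%Y%m')}_f17_v04r00.nc"
    for t in times
]

ds = xr.open_mfdataset(
    paths,
    combine="nested",
    concat_dim="tdim",
    parallel=True,
    engine="netcdf4",
)
```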
### MVCE confirmation

### Relevant log output

No response

### Anything else we need to know?

No response

### Environment

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 6.1.38-1-MANJARO
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.0
libnetcdf: 4.9.2
xarray: 2023.6.0
pandas: 2.0.3
numpy: 1.24.4
scipy: 1.11.1
netCDF4: 1.6.4
pydap: None
h5netcdf: None
h5py: 3.9.0
Nio: None
zarr: 2.15.0
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.7.0
distributed: 2023.7.0
matplotlib: 3.7.1
cartopy: 0.21.1
seaborn: None
numbagg: None
fsspec: 2023.6.0
cupy: None
pint: None
sparse: 0.14.0
flox: None
numpy_groupies: None
setuptools: 68.0.0
pip: 23.2
conda: None
pytest: None
mypy: None
IPython: 8.14.0
sphinx: None
```
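The block above is the standard output of xarray's version report. A small sketch of how to regenerate it, plus the underlying C library versions as reported by netCDF4-python (the dunder attribute names are the documented ones, but treat them as an assumption and verify on your install):

```Python
# Print the same "INSTALLED VERSIONS" report shown above.
import xarray as xr

xr.show_versions()

# netCDF4-python also exposes the linked HDF5 and netCDF-C library versions.
import netCDF4

print("libhdf5:", netCDF4.__hdf5libversion__)
print("libnetcdf:", netCDF4.__netcdf4libversion__)
```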
reactions: { "url": "https://api.github.com/repos/pydata/xarray/issues/7990/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
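Since the failing file changes from run to run, one quick way to rule out corrupted inputs is to open every file sequentially, without dask. A minimal sketch, assuming the same local directory and file names as in the MVCE:

```Python
# Sanity check: open each file sequentially with plain netCDF4 (no dask, no
# threads).  If every file opens cleanly, the random failures above point at
# concurrency rather than at the files themselves.
from pathlib import Path

import netCDF4
import pandas as pd

DATADIR = Path("/mnt/sdc1/icec/NSIDC")  # same hypothetical local directory as in the MVCE
times = pd.date_range("2020-01-01", "2020-12-01", freq="MS", name="time")

for t in times:
    path = DATADIR / "monthly" / f"seaice_conc_monthly_nh_{t.strftime('%Y%m')}_f17_v04r00.nc"
    with netCDF4.Dataset(str(path)) as nc:
        print(path.name, "->", len(nc.variables), "variables")
```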