github: issues: 1 row where state = "open", type = "issue" and user = 8291800 sorted by updated

1 row where state = "open", type = "issue" and user = 8291800 sorted by updated_at descending

id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at ▲	closed_at	author_association	active_lock_reason	draft	pull_request	body	reactions	performed_via_github_app	state_reason	repo	type
1056881922	I_kwDOAMm_X84-_r0C	6000	Parallel access to DataArray within `with` statement causes `BlockingIOError`	scottstanie 8291800	open	0			2	2021-11-18T03:06:26Z	2022-01-13T03:08:02Z		CONTRIBUTOR				What happened:

My general usage is:

1. Read one DataArray from an existing dataset within a `with` statement, so the file closes at the end.
2. Run it through some functions, sometimes in parallel.
3. After closing the dataset, append to the dataset in a new DataArray with `.to_netcdf(existing_file, engine="h5netcdf")`.

With the setup below, I get `BlockingIOError: [Errno 11] Unable to create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')`

It's entirely possible (and likely) that it's an issue with some other library that I'm using or that xarray is using, but I thought someone might have an idea why a very similar version of the same script succeeds while the first one fails.

What you expected to happen:

No error, which is what happens for the second version.

Minimal Complete Verifiable Example:

For this version, I'm using pymp, which I'd rather not include in the MCVE, but I've had similar issues just using Python's multiprocessing. I just wanted to post this one first.

```python
import xarray as xr
import numpy as np
import pymp


def dummy_function_parallel(stack):
    out = np.zeros(stack.shape, dtype=np.float32)
    # Also fails:
    # out = pymp.shared.array(stack.shape, dtype=np.float32)
    with pymp.Parallel(4) as p:
        for i in p.range(3):
            out[:, i] = stack[:, i] * 3
    return out


# Example of a function that doesn't cause a failure ever
def dummy_function2(stack):
    return 2 * stack


if __name__ == "__main__":
    x, y, z = np.arange(3), np.arange(3), np.arange(3)
    data = np.random.rand(3, 3, 3)
    da = xr.DataArray(data, dims=["z", "y", "x"], coords={"x": x, "y": y, "z": z})
    da.to_dataset(name="testdata").to_netcdf("testdata.nc", engine="h5netcdf")

    with xr.open_dataset("testdata.nc") as ds:
        da = ds["testdata"]
        newstack = dummy_function_parallel(da.values)
        # This function does work without the parallel stuff
        # newstack = dummy_function2(da.values)

    da_new = xr.DataArray(newstack, coords=da.coords, dims=da.dims)
    da_new.to_dataset(name="new_testdata").to_netcdf("testdata.nc", engine="h5netcdf")
```

Running this causes the following traceback:

```python-traceback
Traceback (most recent call last):
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/xarray/backends/file_manager.py", line 199, in _acquire_with_cache_info
    file = self._cache[self._key]
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/xarray/backends/lru_cache.py", line 53, in __getitem__
    value = self._cache[key]
KeyError: [<class 'h5netcdf.core.File'>, ('/data4/scott/path85/stitched/top_strip/igrams/testdata.nc',), 'a', (('decode_vlen_strings', True), ('invalid_netcdf', None))]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/scott/repos/insar/insar/testxr.py", line 46, in <module>
    da_new.to_dataset(name="new_testdata").to_netcdf("testdata.nc", engine="h5netcdf")
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/xarray/core/dataset.py", line 1900, in to_netcdf
    return to_netcdf(
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/xarray/backends/api.py", line 1060, in to_netcdf
    store = store_open(target, mode, format, group, **kwargs)
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/xarray/backends/h5netcdf_.py", line 178, in open
    return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/xarray/backends/h5netcdf_.py", line 123, in __init__
    self.filename = find_root_and_group(self.ds)[0].filename
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/xarray/backends/h5netcdf_.py", line 189, in ds
    return self._acquire()
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/xarray/backends/h5netcdf_.py", line 181, in _acquire
    with self._manager.acquire_context(needs_lock) as root:
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/xarray/backends/file_manager.py", line 187, in acquire_context
    file, cached = self._acquire_with_cache_info(needs_lock)
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/xarray/backends/file_manager.py", line 205, in _acquire_with_cache_info
    file = self._opener(*self._args, **kwargs)
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/h5netcdf/core.py", line 712, in __init__
    self._h5file = h5py.File(path, mode, **kwargs)
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/h5py/_hl/files.py", line 442, in __init__
    fid = make_fid(name, mode, userblock_size,
  File "/home/scott/miniconda3/envs/mapping/lib/python3.8/site-packages/h5py/_hl/files.py", line 201, in make_fid
    fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 116, in h5py.h5f.create
BlockingIOError: [Errno 11] Unable to create file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')
```

Anything else we need to know?:

The weird part to me: if I change the end of the script so that the function runs after the `with` statement exits (so I'm passing the DataArray reference from a closed dataset), there's never an error. Indeed, that's how I fixed this for my real, longer script.

```python
with xr.open_dataset("testdata.nc") as ds:
    da = ds["testdata"]

# Now after it closed, this parallel function doesn't cause the hangup
newstack = dummy_function_parallel(da.values)
da_new = xr.DataArray(newstack, coords=da.coords, dims=da.dims)
da_new.to_dataset(name="new_testdata").to_netcdf("testdata.nc", engine="h5netcdf")
```

Environment:

Output of `xr.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 (default, Sep 4 2020, 07:30:14) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1062.4.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.7.4

xarray: 0.19.0
pandas: 1.1.0
numpy: 1.19.2
scipy: 1.5.3
netCDF4: 1.5.4
pydap: None
h5netcdf: 0.11.0
h5py: 3.2.1
Nio: None
zarr: 2.8.3
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.6
cfgrib: 0.9.8.5
iris: None
bottleneck: 1.3.2
dask: 2021.01.0
distributed: 2.20.0
matplotlib: 3.3.1
cartopy: 0.19.0.post1
seaborn: None
numbagg: None
pint: 0.17
setuptools: 50.3.2
pip: 21.2.4
conda: 4.8.4
pytest: 6.2.4
IPython: 7.18.1
sphinx: 4.0.2
	{ "url": "https://api.github.com/repos/pydata/xarray/issues/6000/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }			xarray 13221727	issue
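The error text in the issue (`unable to lock file, errno = 11`) resembles what a refused non-blocking advisory file lock looks like from Python. The sketch below is not taken from the issue and does not use xarray, h5py, or pymp. It assumes a POSIX system and assumes that the failing lock behaves like `flock`; the helper name `try_exclusive_lock` and the temporary file are hypothetical. It only illustrates that an exclusive lock request is refused with `BlockingIOError` while another open handle still holds a lock on the same file, and succeeds once that handle is closed.

```python
import errno
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Return None if the lock is granted, else the errno of the refusal.

    Hypothetical helper for illustration only.
    """
    with open(path, "a") as handle:
        try:
            # Non-blocking request: fail immediately instead of waiting.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError as exc:
            return exc.errno
        return None  # the lock is dropped when the handle closes


# Temporary stand-in for the data file; nothing is written to it.
fd, path = tempfile.mkstemp(suffix=".nc")
os.close(fd)

# This handle stands in for a reader that is still open and holds a shared lock.
reader = open(path, "rb")
fcntl.flock(reader, fcntl.LOCK_SH)

refused = try_exclusive_lock(path)  # refused while the reader is open
reader.close()                      # closing the only handle releases its lock
granted = try_exclusive_lock(path)  # granted now

os.remove(path)
print(refused == errno.EAGAIN, granted is None)
```

With `flock`, a lock belongs to the open file description, so a forked child that inherits the descriptor keeps the lock alive until every copy is closed. Whether that is what happens inside the pymp workers here is an assumption, not something the issue confirms.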

CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
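
The filter described in the page heading (`state = "open"`, `type = "issue"`, `user = 8291800`, sorted by `updated_at` descending) can be reproduced against this schema. The following is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database; the table is trimmed to the columns the filter touches, the foreign-key references are dropped, and the two extra rows are hypothetical filler so the filter has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed-down copy of [issues]: only the columns used by the filter and sort.
conn.execute(
    """
    CREATE TABLE [issues] (
       [id] INTEGER PRIMARY KEY,
       [number] INTEGER,
       [title] TEXT,
       [user] INTEGER,
       [state] TEXT,
       [updated_at] TEXT,
       [type] TEXT
    )
    """
)
conn.execute("CREATE INDEX [idx_issues_user] ON [issues] ([user])")

rows = [
    # The row shown on this page.
    (1056881922, 6000,
     "Parallel access to DataArray within `with` statement causes `BlockingIOError`",
     8291800, "open", "2022-01-13T03:08:02Z", "issue"),
    # Hypothetical rows that the filter should exclude.
    (1, 1, "hypothetical closed issue", 8291800, "closed", "2021-01-01T00:00:00Z", "issue"),
    (2, 2, "hypothetical issue by another user", 42, "open", "2022-02-01T00:00:00Z", "issue"),
]
conn.executemany("INSERT INTO [issues] VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
result = conn.execute(
    """
    SELECT [number], [title]
    FROM [issues]
    WHERE [state] = ? AND [type] = ? AND [user] = ?
    ORDER BY [updated_at] DESC
    """,
    ("open", "issue", 8291800),
).fetchall()

print(result)
conn.close()
```

Because `updated_at` is stored as ISO 8601 text in a single time zone, ordering by the raw string gives chronological order without any date parsing.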
