pull_requests: 1715744126

id	node_id	number	state	locked	title	user	body	created_at	updated_at	closed_at	merged_at	merge_commit_sha	assignee	milestone	draft	head	base	author_association	auto_merge	repo	url	merged_by
1715744126	PR_kwDOAMm_X85mRC1-	8717	closed	0	Add lru_cache to named_array.utils.module_available and core.utils.module_available	32731672	Our application creates many small netcdf3 files: https://github.com/equinor/ert/blob/9c2b60099a54eeb5bb40013acef721e30558a86c/src/ert/storage/local_ensemble.py#L593 . A significant amount of time in xarray.backends.common.py:AbstractWriteableDataStore.set_variables is spent in common.py:is_dask_collection, because it checks for the presence of the dask module, which takes about 0.3 ms. This time becomes significant when writing many small files. This PR uses lru_cache to avoid rechecking for the presence of dask, as it should not change for the lifetime of the application. In one stress test we called dataset.py:2201(to_netcdf) 13634 times, which took 82.27 seconds, of which 46.8 seconds were spent in utils.py:1162(module_available). With the change in this PR, the same test spends only 50 s in to_netcdf. Under normal load, a session in our application calls to_netcdf ~1000 times, but 10,000 calls do occur. - [ ] Closes #xxxx - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst`	2024-02-07T14:01:35Z	2024-02-26T11:23:04Z	2024-02-07T16:26:12Z	2024-02-07T16:26:12Z	0f7a0342ce3dea9a011543469372ad782ec4aba2			0	e004bc3e3583f037133d54ddf0f800a306333c52	f33a632bf87ec29dd9346f9b01ad4eec2194f72a	CONTRIBUTOR		13221727	https://github.com/pydata/xarray/pull/8717
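
The caching idea described in the PR body can be illustrated with a minimal sketch. This is not xarray's actual code: the body of `module_available` below, built on `importlib.util.find_spec`, is an assumption made for illustration, and only the function name and the use of `lru_cache` come from the PR description.

```python
import importlib.util
from functools import lru_cache


@lru_cache
def module_available(module: str) -> bool:
    """Report whether a top-level module can be found on the import path.

    Illustrative stand-in, not the xarray implementation. The result is
    cached per module name, so repeated checks skip the import-system lookup.
    """
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module) is not None


# The first call performs the lookup; later calls with the same argument
# are answered from the cache.
module_available("json")
module_available("json")
```

The trade-off is the one the PR states: a cached answer is only correct as long as the set of installed modules does not change during the lifetime of the process. If it can change, calling `module_available.cache_clear()` forces a fresh lookup.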
