id,node_id,number,state,locked,title,user,body,created_at,updated_at,closed_at,merged_at,merge_commit_sha,assignee,milestone,draft,head,base,author_association,auto_merge,repo,url,merged_by
1714393522,PR_kwDOAMm_X85mL5Gy,8716,closed,0,Add lru_cache to module_available,32731672,"Our application creates many small netcdf3 files: https://github.com/equinor/ert/blob/9c2b60099a54eeb5bb40013acef721e30558a86c/src/ert/storage/local_ensemble.py#L593 . A significant time in xarray.backends.common.py:AbstractWriteableDataStore.set_variables is spent on common.py:is_dask_collection as it checks for the presence of the module dask which takes about 0.3 ms. This time becomes significant in the case of many small files. This PR uses lru_cache to avoid rechecking for the presence of dask as it should not change for the lifetime of the application. - [ ] Closes #xxxx - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` ",2024-02-06T20:00:19Z,2024-02-07T14:50:19Z,2024-02-07T14:50:19Z,,77e990f80653b6c16a9095b8ec6dd5fbc520b778,,,0,c0b5ef544c61c49fe4672ff5e24efef60fc999ce,f33a632bf87ec29dd9346f9b01ad4eec2194f72a,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/8716,
1715744126,PR_kwDOAMm_X85mRC1-,8717,closed,0,Add lru_cache to named_array.utils.module_available and core.utils.module_available,32731672,"Our application creates many small netcdf3 files: https://github.com/equinor/ert/blob/9c2b60099a54eeb5bb40013acef721e30558a86c/src/ert/storage/local_ensemble.py#L593 . A significant time in xarray.backends.common.py:AbstractWriteableDataStore.set_variables is spent on common.py:is_dask_collection as it checks for the presence of the module dask which takes about 0.3 ms. This time becomes significant in the case of many small files. This PR uses lru_cache to avoid rechecking for the presence of dask as it should not change for the lifetime of the application.
In one stress test we called dataset.py:2201(to_netcdf) 13634 times, which took 82.27 seconds, of which 46.8 seconds was spent on utils.py:1162(module_available). With the change in this PR, the same test spends only 50s on to_netcdf. Under normal load, a session in our application calls to_netcdf about 1,000 times, but 10,000 calls do happen. - [ ] Closes #xxxx - [ ] Tests added - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` ",2024-02-07T14:01:35Z,2024-02-26T11:23:04Z,2024-02-07T16:26:12Z,2024-02-07T16:26:12Z,0f7a0342ce3dea9a011543469372ad782ec4aba2,,,0,e004bc3e3583f037133d54ddf0f800a306333c52,f33a632bf87ec29dd9346f9b01ad4eec2194f72a,CONTRIBUTOR,,13221727,https://github.com/pydata/xarray/pull/8717,