pull_requests: 1715744126

id: 1715744126
node_id: PR_kwDOAMm_X85mRC1-
number: 8717
state: closed
locked: 0
title: Add lru_cache to named_array.utils.module_available and core.utils.module_available
user: 32731672
body:
  Our application creates many small netcdf3 files: https://github.com/equinor/ert/blob/9c2b60099a54eeb5bb40013acef721e30558a86c/src/ert/storage/local_ensemble.py#L593 .

  A significant amount of time in xarray.backends.common.py:AbstractWriteableDataStore.set_variables is spent in common.py:is_dask_collection, because it checks for the presence of the dask module, which takes about 0.3 ms. This becomes significant when writing many small files.

  This PR uses lru_cache to avoid rechecking for the presence of dask, as it should not change for the lifetime of the application.

  In one stress test we called dataset.py:2201(to_netcdf) 13634 times, which took 82.27 seconds, of which 46.8 seconds was spent in utils.py:1162(module_available). With the change in this PR, the same test spends only 50 s in to_netcdf.

  Generally, under normal load, a session in our application will call to_netcdf ~1000 times, but 10 000 calls do happen.

  - [ ] Closes #xxxx
  - [ ] Tests added
  - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
  - [ ] New functions/methods are listed in `api.rst`
created_at: 2024-02-07T14:01:35Z
updated_at: 2024-02-26T11:23:04Z
closed_at: 2024-02-07T16:26:12Z
merged_at: 2024-02-07T16:26:12Z
merge_commit_sha: 0f7a0342ce3dea9a011543469372ad782ec4aba2
assignee:
milestone:
draft: 0
head: e004bc3e3583f037133d54ddf0f800a306333c52
base: f33a632bf87ec29dd9346f9b01ad4eec2194f72a
author_association: CONTRIBUTOR
auto_merge:
repo: 13221727
url: https://github.com/pydata/xarray/pull/8717
merged_by:
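
The caching idea described in the PR body can be sketched roughly as follows. This is a simplified stand-in, not xarray's actual code: the single-argument signature and the use of importlib.util.find_spec are assumptions made for illustration, and the real module_available may take further parameters.

```python
import importlib.util
from functools import lru_cache


@lru_cache(maxsize=None)
def module_available(module: str) -> bool:
    """Return True if a top-level `module` can be found on the import path.

    Hypothetical, simplified version: the result is cached per module name,
    so the comparatively slow import-system lookup runs only once per name.
    """
    return importlib.util.find_spec(module) is not None


def is_dask_available() -> bool:
    # Illustrative helper (not from xarray): after the first call this is
    # a cache lookup instead of a fresh search of the import path.
    return module_available("dask")
```

The trade-off is that a cached answer is never refreshed: a module installed after the first check is not noticed until module_available.cache_clear() is called or the process restarts, which matches the PR's assumption that availability does not change during the application's lifetime.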
