html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2857#issuecomment-1010713645,https://api.github.com/repos/pydata/xarray/issues/2857,1010713645,IC_kwDOAMm_X848PkQt,5821660,2022-01-12T07:15:39Z,2022-01-12T07:15:39Z,MEMBER,"This issue is fixed to some extent since `h5netcdf 0.12.0`. `h5netcdf` does not reach the timings of netCDF4 engine, but the improvement is quite significant.

| Number of datasets in file | netCDF4 write (ms) | h5netcdf <= 0.11.0 write (ms) | h5netcdf >= 0.12.0 write (ms) |
|-----|------|-----|-----|
| 1 | 2 | 7 | 7 |
| 250 | 104 | 1710 | 164 |

The issue can be closed. Ping @aldanor. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-999410802,https://api.github.com/repos/pydata/xarray/issues/2857,999410802,IC_kwDOAMm_X847kcxy,5821660,2021-12-22T09:11:05Z,2021-12-22T09:11:05Z,MEMBER,"FYI: `h5netcdf` has just merged a refactor of the dimension scale handling, which greatly improves the performance here. It will be released in the next version (0.13.0). See https://github.com/h5netcdf/h5netcdf/pull/112 I'll come back if the release is out, so we can close this issue.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-825579825,https://api.github.com/repos/pydata/xarray/issues/2857,825579825,MDEyOklzc3VlQ29tbWVudDgyNTU3OTgyNQ==,5821660,2021-04-23T11:01:04Z,2021-04-23T11:01:04Z,MEMBER,@aldanor Could you please have a look into https://github.com/h5netcdf/h5netcdf/pull/101 for a fix. 
Any comments are very much appreciated.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-807344131,https://api.github.com/repos/pydata/xarray/issues/2857,807344131,MDEyOklzc3VlQ29tbWVudDgwNzM0NDEzMQ==,5821660,2021-03-25T19:34:55Z,2021-03-25T19:34:55Z,MEMBER,@shoyer Could we move the entire issue? Or just open another one over at 'h5netcdf' and reference this one? ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806982015,https://api.github.com/repos/pydata/xarray/issues/2857,806982015,MDEyOklzc3VlQ29tbWVudDgwNjk4MjAxNQ==,5821660,2021-03-25T15:48:35Z,2021-03-25T15:48:35Z,MEMBER,"OK, we might check if that depends on the data size or on the number of groups, or both. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806853536,https://api.github.com/repos/pydata/xarray/issues/2857,806853536,MDEyOklzc3VlQ29tbWVudDgwNjg1MzUzNg==,5821660,2021-03-25T14:29:24Z,2021-03-25T14:29:24Z,MEMBER,"> I wonder if it would help to use the same underlying `h5py.File` or `h5netcdf.File` when appending. This should somehow be possible. I'll try to create some proof of concept script bypassing `to_netcdf`, when I find the time. If there are other ideas or solutions, please comment here. Thanks @aldanor for intensive testing and minimal example. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806825379,https://api.github.com/repos/pydata/xarray/issues/2857,806825379,MDEyOklzc3VlQ29tbWVudDgwNjgyNTM3OQ==,5821660,2021-03-25T14:11:43Z,2021-03-25T14:11:43Z,MEMBER,"From my understanding, part of the problem is with the use of `CachingFileManager`. Every call to `to_netcdf(filename....)` reopens this particular file (with all the downsides) and wraps it in `CachingFileManager` again. I wonder if it would help to use the same underlying `h5py.File` or `h5netcdf.File` when appending. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806759522,https://api.github.com/repos/pydata/xarray/issues/2857,806759522,MDEyOklzc3VlQ29tbWVudDgwNjc1OTUyMg==,5821660,2021-03-25T13:39:02Z,2021-03-25T13:39:02Z,MEMBER,"@aldanor If I change your example to using `engine=netcdf4`, the times increase too, but not to the extent of the `h5netcdf` case. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806741704,https://api.github.com/repos/pydata/xarray/issues/2857,806741704,MDEyOklzc3VlQ29tbWVudDgwNjc0MTcwNA==,5821660,2021-03-25T13:27:43Z,2021-03-25T13:27:43Z,MEMBER,"@aldanor Thanks, that's what I expected (that the new version doesn't change the behaviour you are showing). I think your assessment of the situation is correct. It looks like `to_netcdf` is re-reading the whole file when in append-mode. Or better said, the underlying machinery re-reads the complete file. Would it be possible to use engine=`netcdf4` just to see if this isn't affected? 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885
https://github.com/pydata/xarray/issues/2857#issuecomment-806697600,https://api.github.com/repos/pydata/xarray/issues/2857,806697600,MDEyOklzc3VlQ29tbWVudDgwNjY5NzYwMA==,5821660,2021-03-25T12:59:11Z,2021-03-25T12:59:11Z,MEMBER,"@aldanor Which `h5netcdf`-version are you using? There have been changes to the `_lookup_dimensions`-function (which should not change behaviour). I'd try to check this out, could you help with a minimal script to reproduce?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,427410885