
issue_comments


10 rows where author_association = "MEMBER", issue = 427410885 and user = 5821660 sorted by updated_at descending



id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1010713645 https://github.com/pydata/xarray/issues/2857#issuecomment-1010713645 https://api.github.com/repos/pydata/xarray/issues/2857 IC_kwDOAMm_X848PkQt kmuehlbauer 5821660 2022-01-12T07:15:39Z 2022-01-12T07:15:39Z MEMBER

This issue is fixed to some extent since h5netcdf 0.12.0.

h5netcdf does not reach the timings of the netCDF4 engine, but the improvement is quite significant.

| Number of datasets in file | netCDF4 write (ms) | h5netcdf <= 0.11.0 write (ms) | h5netcdf >= 0.12.0 write (ms) |
|-----|------|-----|-----|
| 1 | 2 | 7 | 7 |
| 250 | 104 | 1710 | 164 |

The issue can be closed.

Ping @aldanor.
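A rough benchmark in the spirit of the table above can be sketched as follows (the table's numbers are from the issue; this script's group names, sizes, and iteration count are illustrative). Each loop iteration appends one small dataset as a new group, so the per-write cost can be observed as the file grows:

```python
# Sketch: time append-mode writes as the number of groups in the file grows.
# Requires xarray, numpy and h5netcdf; all names here are illustrative.
import os
import tempfile
import time

import numpy as np
import xarray as xr

path = os.path.join(tempfile.mkdtemp(), "many_groups.nc")
ds = xr.Dataset({"data": ("x", np.zeros(10))})

timings = []
for i in range(20):
    t0 = time.perf_counter()
    # first write creates the file, subsequent writes append a new group
    ds.to_netcdf(path, group=f"group_{i:03d}",
                 mode="w" if i == 0 else "a", engine="h5netcdf")
    timings.append(time.perf_counter() - t0)

# With h5netcdf <= 0.11.0 the per-write cost grows with the group count
# (quadratic in total); with >= 0.12.0 it stays roughly flat.
print(f"first write: {timings[0] * 1e3:.1f} ms, "
      f"last write: {timings[-1] * 1e3:.1f} ms")
```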

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf) 427410885
999410802 https://github.com/pydata/xarray/issues/2857#issuecomment-999410802 https://api.github.com/repos/pydata/xarray/issues/2857 IC_kwDOAMm_X847kcxy kmuehlbauer 5821660 2021-12-22T09:11:05Z 2021-12-22T09:11:05Z MEMBER

FYI: h5netcdf has just merged a refactor of the dimension scale handling, which greatly improves the performance here. It will be released in the next version (0.13.0).

See https://github.com/h5netcdf/h5netcdf/pull/112

I'll come back once the release is out, so we can close this issue.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf) 427410885
825579825 https://github.com/pydata/xarray/issues/2857#issuecomment-825579825 https://api.github.com/repos/pydata/xarray/issues/2857 MDEyOklzc3VlQ29tbWVudDgyNTU3OTgyNQ== kmuehlbauer 5821660 2021-04-23T11:01:04Z 2021-04-23T11:01:04Z MEMBER

@aldanor Could you please have a look at https://github.com/h5netcdf/h5netcdf/pull/101 for a fix? Any comments are very much appreciated.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf) 427410885
807344131 https://github.com/pydata/xarray/issues/2857#issuecomment-807344131 https://api.github.com/repos/pydata/xarray/issues/2857 MDEyOklzc3VlQ29tbWVudDgwNzM0NDEzMQ== kmuehlbauer 5821660 2021-03-25T19:34:55Z 2021-03-25T19:34:55Z MEMBER

@shoyer Could we move the entire issue? Or just open another one over at 'h5netcdf' and reference this one?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf) 427410885
806982015 https://github.com/pydata/xarray/issues/2857#issuecomment-806982015 https://api.github.com/repos/pydata/xarray/issues/2857 MDEyOklzc3VlQ29tbWVudDgwNjk4MjAxNQ== kmuehlbauer 5821660 2021-03-25T15:48:35Z 2021-03-25T15:48:35Z MEMBER

OK, we might check whether that depends on the data size, on the number of groups, or both.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf) 427410885
806853536 https://github.com/pydata/xarray/issues/2857#issuecomment-806853536 https://api.github.com/repos/pydata/xarray/issues/2857 MDEyOklzc3VlQ29tbWVudDgwNjg1MzUzNg== kmuehlbauer 5821660 2021-03-25T14:29:24Z 2021-03-25T14:29:24Z MEMBER

I wonder if it would help to use the same underlying h5py.File or h5netcdf.File when appending.

This should somehow be possible. I'll try to create a proof-of-concept script bypassing to_netcdf when I find the time. If there are other ideas or solutions, please comment here. Thanks @aldanor for the intensive testing and the minimal example.
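Such a proof of concept might look like the sketch below: keep a single h5netcdf.File open and write all groups through it, instead of letting to_netcdf reopen the file once per dataset. Group and variable names, the dimension size, and the group count are illustrative, not taken from the issue:

```python
# Sketch: write many groups through one open h5netcdf.File handle,
# bypassing to_netcdf's per-call reopen. Names/sizes are illustrative.
import os
import tempfile

import numpy as np
import h5netcdf

path = os.path.join(tempfile.mkdtemp(), "appended.nc")

with h5netcdf.File(path, "w") as f:
    for i in range(250):
        grp = f.create_group(f"group_{i:03d}")
        grp.dimensions["x"] = 10  # fixed-size dimension of length 10
        var = grp.create_variable("data", ("x",), dtype="f8")
        var[:] = np.zeros(10)
```

Because the file handle stays open, the dimension-scale bookkeeping is not redone from scratch for every dataset, which is exactly the cost that grows with the number of groups in the append-via-to_netcdf pattern.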

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf) 427410885
806825379 https://github.com/pydata/xarray/issues/2857#issuecomment-806825379 https://api.github.com/repos/pydata/xarray/issues/2857 MDEyOklzc3VlQ29tbWVudDgwNjgyNTM3OQ== kmuehlbauer 5821660 2021-03-25T14:11:43Z 2021-03-25T14:11:43Z MEMBER

From my understanding, part of the problem is the use of CachingFileManager. Every call to to_netcdf(filename, ...) reopens this particular file (with all the downsides) and wraps it in CachingFileManager again. I wonder if it would help to use the same underlying h5py.File or h5netcdf.File when appending.
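A hedged sketch of that pattern, assuming xarray still exports CachingFileManager from xarray.backends (as current versions do): two separate to_netcdf calls on the same path build two independent managers, so the second acquire() is a fresh h5netcdf.File open rather than a reuse of the first handle:

```python
# Sketch: two independent CachingFileManager instances for the same path,
# mirroring what two to_netcdf() calls do internally. Illustrative only.
import os
import tempfile

import h5netcdf
from xarray.backends import CachingFileManager

path = os.path.join(tempfile.mkdtemp(), "demo.nc")

manager_w = CachingFileManager(h5netcdf.File, path, mode="w")
manager_w.acquire()   # opens the file for writing
manager_w.close()     # each manager closes its own handle

manager_a = CachingFileManager(h5netcdf.File, path, mode="a")
manager_a.acquire()   # a second, from-scratch open in append mode
manager_a.close()
```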

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf) 427410885
806759522 https://github.com/pydata/xarray/issues/2857#issuecomment-806759522 https://api.github.com/repos/pydata/xarray/issues/2857 MDEyOklzc3VlQ29tbWVudDgwNjc1OTUyMg== kmuehlbauer 5821660 2021-03-25T13:39:02Z 2021-03-25T13:39:02Z MEMBER

@aldanor If I change your example to use engine=netcdf4, the times increase too, but not to the extent of the h5netcdf case.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf) 427410885
806741704 https://github.com/pydata/xarray/issues/2857#issuecomment-806741704 https://api.github.com/repos/pydata/xarray/issues/2857 MDEyOklzc3VlQ29tbWVudDgwNjc0MTcwNA== kmuehlbauer 5821660 2021-03-25T13:27:43Z 2021-03-25T13:27:43Z MEMBER

@aldanor Thanks, that's what I expected (that the new version doesn't change the behaviour you are showing).

I think your assessment of the situation is correct. It looks like to_netcdf is re-reading the whole file when in append mode. Or, better said, the underlying machinery re-reads the complete file. Would it be possible to use engine=netcdf4, just to see whether that engine is affected as well?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf) 427410885
806697600 https://github.com/pydata/xarray/issues/2857#issuecomment-806697600 https://api.github.com/repos/pydata/xarray/issues/2857 MDEyOklzc3VlQ29tbWVudDgwNjY5NzYwMA== kmuehlbauer 5821660 2021-03-25T12:59:11Z 2021-03-25T12:59:11Z MEMBER

@aldanor Which h5netcdf version are you using? There have been changes to the _lookup_dimensions function (which should not change behaviour). I'd like to check this out; could you help with a minimal script to reproduce?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Quadratic slowdown when saving multiple datasets to the same h5 file (h5netcdf) 427410885


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · About: xarray-datasette