github: issue_comments: 1 row where issue = 435535284 and user = 60338532 sorted by updated_at descending

1 row where issue = 435535284 and user = 60338532 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	performed_via_github_app	issue
773820054	https://github.com/pydata/xarray/issues/2912#issuecomment-773820054	https://api.github.com/repos/pydata/xarray/issues/2912	MDEyOklzc3VlQ29tbWVudDc3MzgyMDA1NA==	bhanu-magotra 60338532	2021-02-05T06:20:40Z	2021-02-05T06:56:05Z	NONE	I am trying to perform a fairly simple operation on a dataset: editing variable and global attributes on individual netCDF files of 3.5 GB each. The files load instantly with `xr.open_dataset`, but `dataset.to_netcdf()` is too slow when exporting after the modifications. I have tried: 1. Without rechunking or invoking dask. 2. Varying chunk sizes, followed by: 3. Using `load()` before `to_netcdf`. 4. Using `persist()` or `compute()` before `to_netcdf`. I am working on an HPC with 10 distributed workers. In all cases, the time taken is more than 15 minutes per file. Is this expected? What else can I try to speed up this process, apart from further parallelizing the single-file operations using dask delayed?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		Writing a netCDF file is unexpectedly slow 435535284

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
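
The filter in the page title (issue = 435535284 and user = 60338532, sorted by updated_at descending) can be reproduced against this schema with Python's standard `sqlite3` module. This is a minimal sketch under stated assumptions, not how the page itself runs the query. It uses an in-memory database and a trimmed copy of the table with only the columns the query needs and no foreign-key clauses. The two extra rows and the helper name `comments_for` are hypothetical, added only to show that the filter excludes non-matching rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed copy of the schema above: only the columns this query touches,
# with the REFERENCES clauses dropped (an assumption made for this sketch).
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [issue] INTEGER
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    CREATE INDEX [idx_issue_comments_user]
        ON [issue_comments] ([user]);
    """
)

rows = [
    # The row shown on this page.
    (773820054, 60338532, "2021-02-05T06:56:05Z", 435535284),
    # Hypothetical rows: same user on another issue, another user on the same issue.
    (1, 60338532, "2020-01-01T00:00:00Z", 999),
    (2, 111, "2020-06-01T00:00:00Z", 435535284),
]
conn.executemany(
    "INSERT INTO [issue_comments] ([id], [user], [updated_at], [issue]) "
    "VALUES (?, ?, ?, ?)",
    rows,
)


def comments_for(conn, issue, user):
    """Return (id, updated_at) for one user's comments on one issue, newest first."""
    return conn.execute(
        "SELECT [id], [updated_at] FROM [issue_comments] "
        "WHERE [issue] = ? AND [user] = ? "
        "ORDER BY [updated_at] DESC",
        (issue, user),
    ).fetchall()


result = comments_for(conn, 435535284, 60338532)
print(result)
```

Because the timestamps are ISO 8601 strings in UTC, sorting the `TEXT` column lexicographically matches chronological order, so `ORDER BY [updated_at] DESC` returns the newest comment first.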