
issue_comments: 773820054


html_url: https://github.com/pydata/xarray/issues/2912#issuecomment-773820054
issue_url: https://api.github.com/repos/pydata/xarray/issues/2912
id: 773820054
node_id: MDEyOklzc3VlQ29tbWVudDc3MzgyMDA1NA==
user: 60338532
created_at: 2021-02-05T06:20:40Z
updated_at: 2021-02-05T06:56:05Z
author_association: NONE

body:

I am trying to perform a fairly simple operation on a dataset: editing variable and global attributes on individual netCDF files of 3.5 GB each. The files open instantly with xr.open_dataset, but dataset.to_netcdf() is too slow to export after the modifications. I have tried:

1. No rechunking and no dask invocation.
2. Varying chunk sizes, combined with:
3. Using load() before to_netcdf.
4. Using persist() or compute() before to_netcdf.

I am working on an HPC with 10 distributed workers. In all cases, the time taken is more than 15 minutes per file. Is this expected? What else can I try to speed this up, apart from further parallelizing the single-file operations with dask.delayed?
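For reference, a minimal sketch of the workflow being described. The file name, variable name, and chunk sizes here are hypothetical placeholders, not values from the original report:

```python
import xarray as xr

# Lazy open via dask; this returns almost immediately because no data
# are read yet. The chunk size is an illustrative guess.
ds = xr.open_dataset("input.nc", chunks={"time": 1000})

# Attribute edits are metadata-only and essentially free.
ds.attrs["history"] = "attributes edited"
ds["tas"].attrs["long_name"] = "near-surface air temperature"  # "tas" is a placeholder

# Variants tried in the list above: materialize the data before writing.
ds = ds.load()   # or: ds = ds.persist()  /  ds = ds.compute()

# The write is where the full 3.5 GB is actually read and re-encoded.
ds.to_netcdf("output.nc")
```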

reactions:

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 435535284