issue_comments: 1450841385

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/7522#issuecomment-1450841385	https://api.github.com/repos/pydata/xarray/issues/7522	1450841385	IC_kwDOAMm_X85WehUp	39069044	2023-03-01T21:01:48Z	2023-03-01T21:01:48Z	CONTRIBUTOR	Yeah, that seems to be it. Dask's write neatly packs all the needed metadata at the beginning of the file, since we can scale this up to a many-GB file with dozens of variables and still read it in ~100ms. xarray, by contrast, writes the metadata in a less organized way, so we have to go seeking in the middle of the byte range. `cache_type="first"` does provide some improvement, but it is still not as good as on the dask-written file. FWIW, I inspected the actual bytes of the dask-written and xarray-written files: they are identical for a single variable, but diverge when multiple variables are written. So the important differences are probably associated with this step: It does set up the whole set of variables as an initialisation stage before writing any data - I don't know if xarray does this.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		1581046647
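
For anyone wanting to repeat the byte-level comparison mentioned in the comment, a minimal sketch using only the Python standard library might look like the following. The helper name `first_divergence` is hypothetical and is not part of xarray, dask, or the original thread. It only reports the offset at which two files first differ; it says nothing about which part of the file layout sits at that offset.

```python
def first_divergence(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset where two files first differ, or None if identical.

    Hypothetical helper for illustration; reads both files in fixed-size chunks
    so that multi-GB files do not need to fit in memory.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a == b:
                if not a:
                    # Both files hit EOF together with no mismatch.
                    return None
                offset += len(a)
                continue
            # The chunks differ: find the first mismatching byte.
            for i, (x, y) in enumerate(zip(a, b)):
                if x != y:
                    return offset + i
            # One chunk is a strict prefix of the other (one file is shorter).
            return offset + min(len(a), len(b))
```

Calling it on the two outputs, for example `first_divergence("dask_written.nc", "xarray_written.nc")` (both file names are placeholders), would return `None` for the single-variable case described above and an integer offset once the files diverge.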