
Comment 980840643 on pydata/xarray issue #6033
https://github.com/pydata/xarray/issues/6033#issuecomment-980840643
Created: 2021-11-28T04:57:48Z · Author association: NONE

@max-sixty Okay, yeah, that's the problem: it's re-downloading the data every time the values are accessed. Apparently this is the default behavior because zarr is a chunked format.

Adding cache=True:

- Fixes the problem in open_dataset
- Throws an error in open_zarr
- Doesn't have any noticeable effect in open_mfdataset
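For reference, a minimal sketch of the first and third cases above (the store URL is a placeholder, and the open_mfdataset workaround of calling .load() to pull everything into memory once is my addition, not something from the comment):

```python
import xarray as xr

# Hypothetical S3 zarr archive; replace with a real store URL.
STORE = "s3://example-bucket/archive.zarr"


def open_cached(store: str) -> xr.Dataset:
    """Open a single zarr store with in-memory caching.

    cache=True asks xarray to keep variable values in memory after the
    first read, so repeated access does not re-download from S3.
    """
    return xr.open_dataset(store, engine="zarr", cache=True)


def open_all_in_memory(paths) -> xr.Dataset:
    """Workaround for open_mfdataset, where cache=True reportedly has no
    visible effect: eagerly load the combined dataset into memory once."""
    return xr.open_mfdataset(paths).load()
```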

My data archive can't normally be read usefully without open_mfdataset, and it's small enough to fit comfortably in memory, so this behavior isn't ideal.

I guess I had assumed that the data would be stored on disk temporarily even if it wasn't kept in memory, so it's an unexpected limitation that the only choices are to cache it in memory or to re-read it from S3 on every access. It also seems odd that the default caching logic only considers whether the data is chunked, not how small it is, how slow the store is to access, or whether the data is being accessed repeatedly.
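The kind of policy being asked for could, in principle, weigh size and access cost rather than chunkedness alone. A toy sketch of such a heuristic (not xarray code; all thresholds are made up for illustration):

```python
def should_cache(nbytes: int, read_latency_s: float, access_count: int,
                 memory_budget: int = 1 << 30) -> bool:
    """Toy heuristic: cache when the data fits a memory budget and the
    cost of re-reading (latency times expected accesses) is non-trivial.

    A small dataset on a slow remote store that is accessed repeatedly
    favours caching; a dataset larger than the budget never does.
    """
    if nbytes > memory_budget:
        return False
    return read_latency_s * access_count > 0.1


# 50 MB on slow S3, accessed ten times: worth caching.
print(should_cache(50_000_000, read_latency_s=0.5, access_count=10))
# 8 GiB exceeds the 1 GiB default budget: stream it instead.
print(should_cache(8 << 30, read_latency_s=0.5, access_count=10))
```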
