
issue_comments: 263437709


html_url: https://github.com/pydata/xarray/issues/463#issuecomment-263437709
issue_url: https://api.github.com/repos/pydata/xarray/issues/463
id: 263437709
node_id: MDEyOklzc3VlQ29tbWVudDI2MzQzNzcwOQ==
user: 1217238
created_at: 2016-11-29T00:19:53Z
updated_at: 2016-11-29T00:19:53Z
author_association: MEMBER
issue: 94328498

> If I understand correctly, incorporating the LRU cache could help with this problem, assuming the time series were sliced into small chunks for access, correct? We would still run into problems, however, if there were, say, 10^6 files and we wanted to get a time series spanning these files, right?

The LRU cache solution proposed in https://github.com/pydata/xarray/issues/798 would work in either case. It would just have poor performance when accessing a small piece of each of 10^6 files, both when building the graph (because xarray needs to open each file to read its metadata) and when doing the actual computation (again, because so many files need to be opened). If you only need a small amount of data from many files, you probably want to reshape your data to minimize the necessary file access no matter what, whether you do that reshaping with PyReshaper or with xarray/dask.array/dask-distributed.
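For concreteness, here is a minimal sketch of the kind of LRU file-handle cache discussed in #798, written against the plain netCDF4 library. The names `MAX_OPEN_FILES` and `open_cached` are illustrative, not xarray's actual API; the point is just to bound the number of simultaneously open files while reusing handles for recently accessed ones:

```python
# Minimal sketch of an LRU cache for netCDF file handles, in the spirit of
# pydata/xarray#798. Assumes the netCDF4 package is installed; MAX_OPEN_FILES
# and open_cached are hypothetical names, not part of xarray's API.
from collections import OrderedDict

import netCDF4

MAX_OPEN_FILES = 128  # upper bound on simultaneously open files

_handles = OrderedDict()  # path -> open netCDF4.Dataset, least recently used first


def open_cached(path):
    """Return an open handle for `path`, evicting the least recently
    used handle once the cache is full."""
    if path in _handles:
        _handles.move_to_end(path)  # mark as most recently used
        return _handles[path]
    if len(_handles) >= MAX_OPEN_FILES:
        _, oldest = _handles.popitem(last=False)  # evict the LRU handle
        oldest.close()
    handle = netCDF4.Dataset(path, mode="r")
    _handles[path] = handle
    return handle
```

Note that a cache like this only amortizes repeated access to the same files: pulling one small slice from each of 10^6 distinct files still costs ~10^6 open/close cycles, which is exactly the overhead described above.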
