html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/463#issuecomment-263467311,https://api.github.com/repos/pydata/xarray/issues/463,263467311,MDEyOklzc3VlQ29tbWVudDI2MzQ2NzMxMQ==,306380,2016-11-29T03:35:43Z,2016-11-29T03:35:43Z,MEMBER,"@shoyer is it ever feasible to read the first NetCDF file in a sequence and assume that all of the others are identical, except that a datetime dimension increments by one day per file?

On Mon, Nov 28, 2016 at 7:19 PM, Stephan Hoyer wrote:

> If I understand correctly, incorporation of the LRU cache could help with
> this problem, assuming time series were sliced into small chunks for access,
> correct? We would still run into problems, however, if there were, say, 10^6
> files and we wanted to get a time series spanning these files, right?
>
> The LRU cache solution proposed in #798 would work in either case. It just
> would have poor performance when accessing a small piece of each of 10^6
> files, both to build the graph (because xarray needs to open each file to
> read the metadata) and to do the actual computation (again, because of the
> need to open so many files). If you only need a small amount of data from
> many files, you probably want to reshape your data to minimize the amount
> of necessary file access no matter what, whether you do that reshaping with
> PyReshaper or xarray/dask.array/dask-distributed.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,94328498
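For reference, a minimal sketch of the pattern @mrocklin is asking about might look like the following, assuming every daily file shares the first file's schema exactly. This is not xarray's implementation: the file list, the start date, and the `read_var` helper are all hypothetical, and only the first file is opened eagerly to learn variable shapes and dtypes.

```python
import dask
import dask.array as da
import pandas as pd
import xarray as xr

# Hypothetical daily files; the names and count are illustrative only.
paths = [f"data/day_{i:04d}.nc" for i in range(1000)]

# Open only the first file to learn the schema (variables, shapes, dtypes).
template = xr.open_dataset(paths[0])

def read_var(path, name):
    # Deferred read: runs only when dask computes the corresponding chunk,
    # so no file besides the first is touched while building the graph.
    with xr.open_dataset(path) as ds:
        return ds[name].values

# Assume the time coordinate rather than reading it from each file:
# one day per file, starting from a hypothetical first date.
time = pd.date_range("2016-01-01", periods=len(paths), freq="D")

data_vars = {}
for name, var in template.data_vars.items():
    lazy_chunks = [
        da.from_delayed(
            dask.delayed(read_var)(path, name),
            shape=var.shape,   # assumed identical across all files
            dtype=var.dtype,
        )
        for path in paths
    ]
    # Stack the per-file arrays along a new leading "time" dimension.
    data_vars[name] = (("time",) + var.dims, da.stack(lazy_chunks))

combined = xr.Dataset(data_vars, coords={"time": time, **template.coords})
```

The trade-off matches Hoyer's point above: this avoids opening 10^6 files to build the graph, but any file whose shape or dtype silently differs from the template will only fail at compute time.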