
Comment 305506896 on pydata/xarray#798 by user 306380 (MEMBER), 2017-06-01T14:17:11Z
https://github.com/pydata/xarray/issues/798#issuecomment-305506896

@shoyer regarding per-file locking: this probably only matters if we are writing as well, yes?
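
For concreteness, per-file locking on the write path could be as simple as the following (a minimal sketch; `file_locks` and `write_variable` are illustrative names, not existing xarray API):

```python
import threading
from collections import defaultdict

# One lock per path: readers skip locking entirely, and writers
# only contend with other writers of the same file.
file_locks = defaultdict(threading.Lock)

def write_variable(path, do_write):
    with file_locks[path]:
        do_write(path)
```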

Here is a small implementation of a generic file-open cache. I haven't yet decided on an eviction policy, but either LRU or random (filtered by closeable files) should work OK; see the LRU sketch after the code block.

```python
from collections import defaultdict
from contextlib import contextmanager
import threading

import h5py


class OpenCache(object):
    def __init__(self, maxsize=100):
        self.refcount = defaultdict(lambda: 0)
        self.maxsize = maxsize
        self.cache = {}
        self.lock = threading.Lock()

    @contextmanager
    def open(self, myopen, fn, mode='r'):
        assert 'r' in mode
        key = (myopen, fn, mode)
        with self.lock:
            try:
                file = self.cache[key]
            except KeyError:
                file = myopen(fn, mode=mode)
                self.cache[key] = file

            self.refcount[key] += 1

            if len(self.cache) > self.maxsize:
                # Clear old files intelligently; as a placeholder,
                # close files that no one is currently using.
                for k in list(self.cache):
                    if self.refcount[k] == 0:
                        self.cache.pop(k).close()
                        if len(self.cache) <= self.maxsize:
                            break

        try:
            yield file
        finally:
            with self.lock:
                self.refcount[key] -= 1


cache = OpenCache()

with cache.open(h5py.File, 'myfile.hdf5') as f:
    x = f['/data/x']
    y = x[:1000, :1000]
```
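
Since the eviction policy is still open, here is a minimal sketch of what the LRU variant could look like (the `evict_lru` helper and the `OrderedDict` recency bookkeeping are illustrative assumptions, not part of the implementation above):

```python
from collections import OrderedDict

def evict_lru(cache, refcount, maxsize):
    # `cache` is an OrderedDict kept oldest-first: call
    # cache.move_to_end(key) on every cache hit so that iteration
    # order reflects recency. Must be called under the cache lock.
    for key in list(cache):
        if len(cache) <= maxsize:
            break
        if refcount[key] == 0:  # only close files nobody is reading
            cache.pop(key).close()
```

With that in place, `OpenCache.__init__` would use `self.cache = OrderedDict()` and `open` would call `self.cache.move_to_end(key)` on a cache hit.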

Is this still useful?

I'm curious to hear from users like @pwolfram and @rabernat, who may be running into the many-files problem, about what the current pain points are.
