issue_comments: 116165986


Comment: https://github.com/pydata/xarray/issues/444#issuecomment-116165986
Issue: https://api.github.com/repos/pydata/xarray/issues/444
Author: user 1217238 (MEMBER)
Created: 2015-06-27T23:40:29Z

Of course, concurrent access to HDF5 files works fine on my laptop, using Anaconda's build of HDF5 (version 1.8.14). I have no idea what special flags they invoked when building it :).

That said, I have been unable to produce any benchmarks that show improved performance from simply doing multithreaded reads without any computation (e.g., `%time xray.open_dataset(..., chunks=...).load()`). Even when I'm reading multiple independent chunks that are compressed on disk, CPU seems to be pegged at 100% whether I use netCDF4-python or h5py (via h5netcdf) to read the data. For non-compressed data, reads appear to be limited by disk speed, so CPU is not the bottleneck there either.
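
Concretely, the pattern I'm timing looks something like this sketch (the file path and chunking are made-up stand-ins):

```python
# Hypothetical benchmark: time a multithreaded read with no computation.
import time

import xray  # the package's name at the time of this comment; now `import xarray`

start = time.time()
# chunks=... backs each variable with a dask array; .load() then reads
# every chunk eagerly, potentially from several threads at once.
ds = xray.open_dataset('data.nc', chunks={'time': 100})
ds.load()
print('read took %.2fs' % (time.time() - start))
```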

Given these considerations, it seems like we should use a lock when reading data into xray with dask. @mrocklin, we could just use `lock=True` with `da.from_array`, right? If we can find use cases for multi-threaded reads, we could also add an optional `lock` argument to `open_dataset`/`open_mfdataset`.
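
For reference, the `lock=True` idea would look roughly like this (the file and variable names are made up):

```python
import dask.array as da
import netCDF4

nc = netCDF4.Dataset('data.nc')    # hypothetical file
var = nc.variables['temperature']  # hypothetical variable

# lock=True tells dask to guard every chunk read with a single shared
# lock, so only one thread touches the netCDF/HDF5 layer at a time.
arr = da.from_array(var, chunks=(100,) + var.shape[1:], lock=True)

# Reads are serialized by the lock; the computation itself can still
# run in parallel across threads.
result = arr.mean().compute()
```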
