issue_comments: 392666250

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/2190#issuecomment-392666250	https://api.github.com/repos/pydata/xarray/issues/2190	392666250	MDEyOklzc3VlQ29tbWVudDM5MjY2NjI1MA==	6404167	2018-05-29T06:27:52Z	2018-05-29T06:35:02Z	CONTRIBUTOR	@shoyer Thanks for your answer. Too bad. Maybe this could be documented in the 'dask' chapter? Or maybe xarray could even raise a warning when open_dataset is used with `lock=False` on a netCDF4 file? Unfortunately there seems to be some conflicting information floating around, which is hard to spot for a non-expert like me. It might of course just be that xarray doesn't support it (yet). I think MPI-style opening is a whole different beast, right? For example: python-netcdf4 supports parallel reads in threads: https://github.com/Unidata/netcdf4-python/issues/536 python-netcdf4 MPI parallel write/read: https://github.com/Unidata/netcdf4-python/blob/master/examples/mpi_example.py http://unidata.github.io/netcdf4-python/#section13 Using h5py directly (not supported by xarray, I think): http://docs.h5py.org/en/latest/mpi.html This seems to suggest that multiple reads are fine: https://github.com/dask/dask/issues/3074#issuecomment-359030028 "You might have better luck using dask-distributed multiple processes, but then you'll encounter other bottlenecks with data transfer." I'll do some more experiments, thanks for this suggestion. I am not bound to netCDF4 (although I need the compression, so no netCDF3 unfortunately), so would moving to Zarr help improve IO performance? I'd really like to keep using xarray, thanks for this awesome library! Even with the disk IO performance hit, it's still more than worth using.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		327064908