issue_comments: 635099870


html_url: https://github.com/pydata/xarray/issues/4100#issuecomment-635099870
issue_url: https://api.github.com/repos/pydata/xarray/issues/4100
id: 635099870
node_id: MDEyOklzc3VlQ29tbWVudDYzNTA5OTg3MA==
user: 1217238
created_at: 2020-05-28T04:55:01Z
updated_at: 2020-05-28T04:55:01Z
author_association: MEMBER

Thanks for the clear report!

I know we use backend-specific locks by default when opening netCDF files, so I was initially puzzled by this. But now that I've looked back over the implementation, this makes sense.

We currently only guarantee thread safety when reading data *after* files have been opened. For example, you could write something like:

```python
import threading
import xarray as xr

dataset = xr.open_dataset(SAVED_FILE_NAME, engine="netcdf4")
threads = [
    threading.Thread(target=lambda: do_something_with_xarray(dataset))
    for _ in range(N_THREADS)
]
```

For many use-cases (e.g., in dask), this is a sufficient form of parallelism, because xarray's file opening is lazy and only needs to read metadata, not array values.

It would indeed be nice if `open_dataset()` itself were thread safe. Mostly I think this could be achieved by making use of the existing `lock` attribute found on `NetCDF4DataStore` and most other `DataStore` classes.
