html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/793#issuecomment-199547343,https://api.github.com/repos/pydata/xarray/issues/793,199547343,MDEyOklzc3VlQ29tbWVudDE5OTU0NzM0Mw==,1217238,2016-03-22T00:01:52Z,2016-03-22T00:01:52Z,MEMBER,"This should be pretty easy -- we'll just need to add `lock=threading.Lock()` to this line: https://github.com/pydata/xarray/blob/v0.7.2/xarray/backends/common.py#L165 The only subtlety is that this needs to be done in a way that depends on the version of dask, because the keyword argument is new -- something like `if dask.__version__ > '0.8.1'`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140291221
https://github.com/pydata/xarray/issues/793#issuecomment-196924992,https://api.github.com/repos/pydata/xarray/issues/793,196924992,MDEyOklzc3VlQ29tbWVudDE5NjkyNDk5Mg==,1217238,2016-03-15T17:04:57Z,2016-03-15T17:27:29Z,MEMBER,"I did a little digging into this and I'm pretty sure the issue here is that HDF5 [cannot do multi-threading](https://www.hdfgroup.org/hdf5-quest.html#gconc) -- at all. Moreover, many HDF5 builds are not thread safe. Right now, we use a single shared lock for all _reads_ with xarray, but for writes we rely on dask.array.store, which [only uses different locks for each array it writes](https://github.com/dask/dask/blob/0.8.1/dask/array/core.py#L1968). Because @pwolfram's HDF5 file includes multiple variables, each of these gets written with their own thread lock -- which means we end up writing to the same file simultaneously from multiple threads. So what we could really use here is a `lock` argument to `dask.array.store` (like `dask.array.from_array`) that lets us insist on using a shared lock when we're writing HDF5 files. Also, we may need to share that same lock between reading and writing data -- I'm not 100% sure. But at the very least we definitely need a lock to stop HDF5 from trying to do multi-threaded writes, whether that's to the same or different files.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140291221
https://github.com/pydata/xarray/issues/793#issuecomment-196935638,https://api.github.com/repos/pydata/xarray/issues/793,196935638,MDEyOklzc3VlQ29tbWVudDE5NjkzNTYzOA==,306380,2016-03-15T17:26:41Z,2016-03-15T17:26:41Z,MEMBER,"https://github.com/dask/dask/pull/1053","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140291221
https://github.com/pydata/xarray/issues/793#issuecomment-195811381,https://api.github.com/repos/pydata/xarray/issues/793,195811381,MDEyOklzc3VlQ29tbWVudDE5NTgxMTM4MQ==,306380,2016-03-12T21:32:56Z,2016-03-12T21:32:56Z,MEMBER,"To be clear, we ran into the `NetCDF: HDF error` error when having multiple threads in the same process open-read-close many different files. I don't think there was any concurrent access of the same file. The problem went away when we switched to using processes rather than threads.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140291221
https://github.com/pydata/xarray/issues/793#issuecomment-195637636,https://api.github.com/repos/pydata/xarray/issues/793,195637636,MDEyOklzc3VlQ29tbWVudDE5NTYzNzYzNg==,1217238,2016-03-12T02:19:18Z,2016-03-12T02:19:18Z,MEMBER,"I'm pretty sure we now have a thread lock around all writes to NetCDF files, but it's possible that isn't aggressive enough (maybe we can't safely read and write a different file at the same time?). If your script works with synchronous execution I'll take another look.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140291221
https://github.com/pydata/xarray/issues/793#issuecomment-195573297,https://api.github.com/repos/pydata/xarray/issues/793,195573297,MDEyOklzc3VlQ29tbWVudDE5NTU3MzI5Nw==,306380,2016-03-11T22:13:28Z,2016-03-11T22:13:28Z,MEMBER,"Yes, my apologies for the typo.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140291221
https://github.com/pydata/xarray/issues/793#issuecomment-195562924,https://api.github.com/repos/pydata/xarray/issues/793,195562924,MDEyOklzc3VlQ29tbWVudDE5NTU2MjkyNA==,306380,2016-03-11T21:29:46Z,2016-03-11T21:29:46Z,MEMBER,"Sure. I'm not proposing any particular approach. I'm just supporting your previous idea that maybe the problem is having too many open file handles. It would be good to check this before diving into threading or concurrency issues.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140291221
https://github.com/pydata/xarray/issues/793#issuecomment-195557013,https://api.github.com/repos/pydata/xarray/issues/793,195557013,MDEyOklzc3VlQ29tbWVudDE5NTU1NzAxMw==,306380,2016-03-11T21:16:41Z,2016-03-11T21:16:41Z,MEMBER,"1024 might be a common open file handle limit. Some things to try to isolate the issue: 1. Try this with `dask.set_globals(get=dask.async.get_sync)` to turn off threading 2. Try just opening all of the files and see if the NetCDF error presents itself under normal operation","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,140291221