html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4406#issuecomment-988359778,https://api.github.com/repos/pydata/xarray/issues/4406,988359778,IC_kwDOAMm_X8466Sxi,13684161,2021-12-08T00:05:24Z,2021-12-08T00:06:22Z,NONE,"I am having a similar issue. I am using the latest versions of dask, xarray, distributed, fsspec, and gcsfs. I use the h5netcdf backend because it is the only one that works with fsspec's binary streams when reading from the cloud.
My workflow consists of:
1. Start a dask client with 1 process per CPU and 2 threads each, because reading from the cloud does not scale up with threads alone.
2. Open 12 months of climate data (hourly sampled) using xarray.open_mfdataset.
3. Use reasonable dask chunks in the open call.
4. Take the monthly average across the time axis and write to a local NetCDF file.
5. Repeat steps 2-4 for different years (a rough sketch of the workflow is below).
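A rough sketch of steps 1-4 (bucket path, year, worker count, and chunk size are made-up placeholders):
```python
import gcsfs
import xarray as xr
from dask.distributed import Client

# Step 1: one process per CPU, two threads each (8 CPUs assumed here)
client = Client(n_workers=8, threads_per_worker=2)

# Steps 2-3: open the 12 monthly files for one year as binary streams
fs = gcsfs.GCSFileSystem()
files = [fs.open(p, 'rb') for p in fs.glob('my-bucket/climate/2020-*.nc')]
ds = xr.open_mfdataset(files, engine='h5netcdf', combine='by_coords',
                       chunks={'time': 500})

# Step 4: monthly mean over the time axis, written to a local NetCDF file
ds.resample(time='1MS').mean().to_netcdf('monthly_means_2020.nc')
```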
It is hit or miss: it hangs towards the middle or end of a year, and the next time I run it, it doesn't.
Once it hangs and I interrupt it, the traceback shows it stuck awaiting a threading lock.
Any ideas how to avoid this?
Things I tried:
1. Use processes only, 1 thread per worker
2. lock=True, lock=False on open_mfdataset
3. Different dask multiprocessing start methods: spawn and forkserver
4. Different (but recent) versions of all the libraries","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301
https://github.com/pydata/xarray/issues/4406#issuecomment-785432974,https://api.github.com/repos/pydata/xarray/issues/4406,785432974,MDEyOklzc3VlQ29tbWVudDc4NTQzMjk3NA==,1828519,2021-02-24T22:42:15Z,2021-02-24T22:42:15Z,CONTRIBUTOR,"I'm having a similar issue to what is described here, but I'm seeing it even when I'm not rewriting an output file (although that is an option in my code). I have a delayed function that calls `to_netcdf` and seems to run into a race condition where I get the same deadlock as the original poster. It seems highly dependent on the number of dask tasks and the number of workers. I *think* I've gotten around it for now by having my delayed function return the Dataset it is working on and then calling `to_netcdf` later. My problem is that I have cases where I might not want to write the file, so my delayed function returns `None`. To handle this I need to pre-compute my delayed functions before calling `to_netcdf`, since I don't think there is a way to pass something to `to_netcdf` so that it doesn't create a file.
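Something like this sketch of the workaround (paths, the processing step, and the write-or-not condition are hypothetical):
```python
import dask
import xarray as xr

@dask.delayed
def process(path):
    ds = xr.open_dataset(path).load()
    # ... actual processing here ...
    return ds if ds.attrs.get('write_me', True) else None  # may decide not to write

# Pre-compute everything first, then write serially outside the delayed functions
results = dask.compute(*[process(p) for p in ['a.nc', 'b.nc']])
for i, ds in enumerate(results):
    if ds is not None:
        ds.to_netcdf(f'out_{i}.nc')
```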
With the original code it happened quite a bit, but it was part of a much larger application so I can't really get an MWE together. Just wanted to mention it here as another data point (to_netcdf inside a Delayed function may not work 100% of the time).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301
https://github.com/pydata/xarray/issues/4406#issuecomment-691756409,https://api.github.com/repos/pydata/xarray/issues/4406,691756409,MDEyOklzc3VlQ29tbWVudDY5MTc1NjQwOQ==,11863789,2020-09-14T01:03:54Z,2020-09-14T01:03:54Z,NONE,"Good point! Yes, after a bit of trial and error, this is what I did. Is there any limitation when overwriting an existing NetCDF file that hasn't been opened by xarray?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301
https://github.com/pydata/xarray/issues/4406#issuecomment-691717920,https://api.github.com/repos/pydata/xarray/issues/4406,691717920,MDEyOklzc3VlQ29tbWVudDY5MTcxNzkyMA==,1217238,2020-09-13T20:01:57Z,2020-09-13T20:01:57Z,MEMBER,"> was using xarray to manipulate NetCDF and re-write to it
We should probably document this more clearly, but opening and then rewriting the _same_ file in xarray without closing the original file is not something xarray supports.
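A minimal sketch of a pattern that is supported (load fully into memory, close the file handle, then overwrite; the path and manipulation are hypothetical):
```python
import xarray as xr

path = 'data.nc'
ds = xr.open_dataset(path)
ds.load()   # pull all data into memory
ds.close()  # release the underlying file handle

ds['var'] = ds['var'] * 2  # hypothetical manipulation
ds.to_netcdf(path)         # now safe to overwrite the same path
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301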
https://github.com/pydata/xarray/issues/4406#issuecomment-691670151,https://api.github.com/repos/pydata/xarray/issues/4406,691670151,MDEyOklzc3VlQ29tbWVudDY5MTY3MDE1MQ==,11863789,2020-09-13T13:15:30Z,2020-09-13T13:15:30Z,NONE,"For my case, I saw this happen only when I started to run xarray scripts with cron, about a month ago. I would run it once every six hours, and every day or so I would see a NetCDF file locked up. I ended up changing the workflow somewhat so I don't do this anymore (I was using xarray to manipulate a NetCDF file and rewrite to it), but this confused me for quite a while.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301
https://github.com/pydata/xarray/issues/4406#issuecomment-691096342,https://api.github.com/repos/pydata/xarray/issues/4406,691096342,MDEyOklzc3VlQ29tbWVudDY5MTA5NjM0Mg==,16692099,2020-09-11T13:30:29Z,2020-09-11T13:30:29Z,NONE,"> Did this work reliably in the past? If so, any clues about specific versions of dask and/or netCDF that cause the issue would be helpful.
I am working on my first project using Dask arrays in conjunction with `xarray`, so I cannot tell whether previous combinations of versions worked. I tried downgrading `dask` to v2.20, but the issue is still there.
> This is just using Dask's threaded scheduler, right? I don't recall any changes there recently.
Without the option `chunks={'time': 200}`, the previous snippet seems to work very reliably.
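For reference, a sketch of the two variants (the file pattern and the reduction are stand-ins for the actual snippet upthread):
```python
import xarray as xr

# Hangs intermittently with explicit dask chunking:
ds = xr.open_mfdataset('data/*.nc', chunks={'time': 200})
ds.mean('time').to_netcdf('out.nc')

# Seems to work reliably without it (each file becomes a single chunk):
ds = xr.open_mfdataset('data/*.nc')
ds.mean('time').to_netcdf('out.nc')
```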
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301
https://github.com/pydata/xarray/issues/4406#issuecomment-691083939,https://api.github.com/repos/pydata/xarray/issues/4406,691083939,MDEyOklzc3VlQ29tbWVudDY5MTA4MzkzOQ==,1312546,2020-09-11T13:07:00Z,2020-09-11T13:07:00Z,MEMBER,"> @TomAugspurger do you know off-hand if there have been any recent changes in Dask's scheduler that could have caused this?
This is just using Dask's threaded scheduler, right? I don't recall any changes there recently.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301
https://github.com/pydata/xarray/issues/4406#issuecomment-690922236,https://api.github.com/repos/pydata/xarray/issues/4406,690922236,MDEyOklzc3VlQ29tbWVudDY5MDkyMjIzNg==,6948919,2020-09-11T07:18:03Z,2020-09-11T07:18:03Z,NONE,"> Did this work reliably in the past? If so, any clues about specific versions of dask and/or netCDF that cause the issue would be helpful.
>
> @TomAugspurger do you know off-hand if there have been any recent changes in Dask's scheduler that could have caused this?
I am new to the whole xarray/dask thing, but a month ago it was working without issues. I recently reinstalled Python and don't know if the versions differ from the previous ones.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301
https://github.com/pydata/xarray/issues/4406#issuecomment-690912480,https://api.github.com/repos/pydata/xarray/issues/4406,690912480,MDEyOklzc3VlQ29tbWVudDY5MDkxMjQ4MA==,1217238,2020-09-11T06:53:58Z,2020-09-11T06:53:58Z,MEMBER,"Did this work reliably in the past? If so, any clues about specific versions of dask and/or netCDF that cause the issue would be helpful.
@TomAugspurger do you know off-hand if there have been any recent changes in Dask's scheduler that could have caused this?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301
https://github.com/pydata/xarray/issues/4406#issuecomment-690340047,https://api.github.com/repos/pydata/xarray/issues/4406,690340047,MDEyOklzc3VlQ29tbWVudDY5MDM0MDA0Nw==,6948919,2020-09-10T14:48:38Z,2020-09-10T14:48:38Z,NONE,"Using:
- xarray=0.16.0
- dask=2.25.0
- netcdf4=1.5.4
I am experiencing the same when trying to write a NetCDF file using **to_netcdf()** on files opened via xr.open_mfdataset with lock=None (which is the default).
Then I tried opening the files with **lock=False** and it worked like a charm. The issue has been gone 100% of the time.
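A sketch of both open calls (the file pattern and reduction are hypothetical; xarray 0.16 still accepts the `lock` keyword):
```python
import xarray as xr

# Default lock=None lets xarray pick a lock; this is where writes hung:
ds = xr.open_mfdataset('data/*.nc')
ds.mean('time').to_netcdf('out.nc')

# Workaround that avoided the hang here, at the cost of unlocked HDF5 access:
ds = xr.open_mfdataset('data/*.nc', lock=False)
ds.mean('time').to_netcdf('out.nc')
```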
**BUT**
Now I am facing a different issue. It seems that **HDF5 IS NOT thread safe**, since I encounter **NetCDF: HDF error** while applying a different function to NetCDF files which were previously processed by other functions with **lock=False**.
The script just terminates without even reaching any calculation step in the code. It seems like lock=False works the opposite way and leaves the file in a **corrupted** state?
This is the **BIGGEST** issue and needs to be resolved ASAP.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301
https://github.com/pydata/xarray/issues/4406#issuecomment-688101357,https://api.github.com/repos/pydata/xarray/issues/4406,688101357,MDEyOklzc3VlQ29tbWVudDY4ODEwMTM1Nw==,11863789,2020-09-07T07:28:37Z,2020-09-07T07:46:39Z,NONE,"I also seem to have a similar issue, running under a docker/linux environment. It doesn't happen every time, maybe once out of 4~5 times. I wonder if this is related to the NetCDF/HDF5 file locking issue (https://support.nesi.org.nz/hc/en-gb/articles/360000902955-NetCDF-HDF5-file-locking).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,694112301