issue_comments
7 rows where author_association = "NONE" and issue = 694112301 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
988359778 | https://github.com/pydata/xarray/issues/4406#issuecomment-988359778 | https://api.github.com/repos/pydata/xarray/issues/4406 | IC_kwDOAMm_X8466Sxi | tasansal 13684161 | 2021-12-08T00:05:24Z | 2021-12-08T00:06:22Z | NONE | I am having a similar issue as well, using the latest versions of dask, xarray, distributed, fsspec, and gcsfs. I use the h5netcdf backend because it is the only one that works with fsspec's binary streams when reading from the cloud. My workflow consists of:
1. Start a dask client with 1 process per CPU and 2 threads each, because reading from the cloud doesn't scale with threads alone.
2. Open 12x monthly climate data (hourly sampled) using xarray.open_mfdataset.
3. Use reasonable dask chunks in the open function.
4. Take the monthly average across the time axis and write to a local NetCDF file.
5. Repeat 2-4 for different years.
It is hit or miss: it hangs towards the middle or end of a year, and the next time I run it, it doesn't. Once it hangs and I hit stop, the traceback shows it stuck awaiting a threading lock. Any ideas how to avoid this? Things I tried:
1. Use processes only, 1 thread per worker.
2. lock=True and lock=False on open_mfdataset.
3. Dask scheduler as spawn and forkserver.
4. Different (but recent) versions of all the libraries. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Threading Lock issue with to_netcdf and Dask arrays 694112301 | |
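For context, a rough sketch of the workflow described in the comment above is shown below. The bucket path, chunk sizes, and worker count are illustrative assumptions rather than the commenter's actual settings; the files are opened through fsspec/gcsfs file-like objects because the h5netcdf engine accepts them.

```python
# Hedged sketch of the workflow above. The bucket path, chunk sizes, and
# worker count are assumptions for illustration only.
import fsspec
import xarray as xr
from dask.distributed import Client

if __name__ == "__main__":
    # 1. One worker process per CPU (8 assumed here), two threads each.
    client = Client(n_workers=8, threads_per_worker=2)

    # 2./3. Lazily open 12 monthly files with dask chunks; the h5netcdf
    #       engine accepts the file-like objects that fsspec/gcsfs provide.
    remote_files = fsspec.open_files("gs://some-bucket/climate/2020-*.nc")  # assumed path
    ds = xr.open_mfdataset(
        [f.open() for f in remote_files],
        engine="h5netcdf",
        combine="by_coords",
        chunks={"time": 24},
    )

    # 4. Monthly average over the time axis, written to a local NetCDF file.
    #    This to_netcdf call is where the reported hang occurs.
    monthly = ds.resample(time="1MS").mean()
    monthly.to_netcdf("monthly_means_2020.nc")
```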
691756409 | https://github.com/pydata/xarray/issues/4406#issuecomment-691756409 | https://api.github.com/repos/pydata/xarray/issues/4406 | MDEyOklzc3VlQ29tbWVudDY5MTc1NjQwOQ== | hansukyang 11863789 | 2020-09-14T01:03:54Z | 2020-09-14T01:03:54Z | NONE | Good point! Yes, after a bit of trial and error, this is what I did. Is there any limitation when overwriting an existing NetCDF file that hasn't been opened by xarray? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Threading Lock issue with to_netcdf and Dask arrays 694112301 | |
691670151 | https://github.com/pydata/xarray/issues/4406#issuecomment-691670151 | https://api.github.com/repos/pydata/xarray/issues/4406 | MDEyOklzc3VlQ29tbWVudDY5MTY3MDE1MQ== | hansukyang 11863789 | 2020-09-13T13:15:30Z | 2020-09-13T13:15:30Z | NONE | In my case, I saw this happen only when I started running xarray scripts with cron, about a month ago. I would run them once every six hours, and every day or so I would see a NetCDF file locked up. I ended up changing the workflow somewhat so I don't do this any more (I was using xarray to manipulate a NetCDF file and re-write to it), but this confused me for quite a while. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Threading Lock issue with to_netcdf and Dask arrays 694112301 | |
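The re-write-in-place workflow mentioned in the two comments above commonly runs into the problem that to_netcdf targets a file xarray still holds open. One common pattern, sketched below with a placeholder path and a trivial modification (not necessarily what the commenter ended up doing), is to load the data into memory and close the source before overwriting the same path.

```python
# Hedged sketch of a read-modify-rewrite pattern: load into memory and close
# the source before overwriting the same path. "data.nc" and the attribute
# change are placeholders, not the commenter's actual script.
import xarray as xr

path = "data.nc"

with xr.open_dataset(path) as ds:
    # .load() pulls the data into memory so nothing lazy still points at the file.
    updated = ds.assign_attrs(processed="yes").load()

# The file handle is closed once the `with` block exits, so overwriting the
# same path no longer conflicts with an open read handle.
updated.to_netcdf(path, mode="w")
```

An alternative with the same effect is writing to a temporary file and renaming it over the original, which also avoids leaving a partially written file behind if the write fails.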
691096342 | https://github.com/pydata/xarray/issues/4406#issuecomment-691096342 | https://api.github.com/repos/pydata/xarray/issues/4406 | MDEyOklzc3VlQ29tbWVudDY5MTA5NjM0Mg== | bilelomrani1 16692099 | 2020-09-11T13:30:29Z | 2020-09-11T13:30:29Z | NONE |
I am working on my first project using Dask arrays in conjunction with
Without the option |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Threading Lock issue with to_netcdf and Dask arrays 694112301 | |
690922236 | https://github.com/pydata/xarray/issues/4406#issuecomment-690922236 | https://api.github.com/repos/pydata/xarray/issues/4406 | MDEyOklzc3VlQ29tbWVudDY5MDkyMjIzNg== | bekatd 6948919 | 2020-09-11T07:18:03Z | 2020-09-11T07:18:03Z | NONE |
I am new to the xarray/dask thing, but a month ago it was working without issues. I recently reinstalled Python and don't know if the versions differ from the previous ones. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Threading Lock issue with to_netcdf and Dask arrays 694112301 | |
690340047 | https://github.com/pydata/xarray/issues/4406#issuecomment-690340047 | https://api.github.com/repos/pydata/xarray/issues/4406 | MDEyOklzc3VlQ29tbWVudDY5MDM0MDA0Nw== | bekatd 6948919 | 2020-09-10T14:48:38Z | 2020-09-10T14:48:38Z | NONE | Using:
I am experiencing the same when trying to write a NetCDF file using to_netcdf() on files opened via xr.open_mfdataset with lock=None (which is the default). Then I tried opening the files with lock=False and it worked like a charm; the issue has been gone 100% of the time. BUT now I am facing a different issue. It seems that HDF5 IS NOT thread safe, since I encounter a NetCDF: HDF error while applying a different function to NetCDF files which were previously processed by other functions with lock=False. The script just terminates without even reaching any calculation step in the code. It seems like lock=False works the opposite way and the file is left in a corrupted state? This is the BIGGEST issue and needs to be resolved ASAP. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Threading Lock issue with to_netcdf and Dask arrays 694112301 | |
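For reference, the lock experiment described in the comment above corresponds roughly to the snippet below. The file pattern and the final computation are placeholders, and whether open_mfdataset still accepts the lock keyword depends on the xarray version in use.

```python
# Hedged sketch of the lock experiment described above; "input_*.nc" and the
# final computation are placeholders.
import xarray as xr

# Default behaviour: xarray chooses an appropriate lock (lock=None).
ds_default = xr.open_mfdataset("input_*.nc", combine="by_coords")

# The commenter's workaround: disable the per-file lock entirely. This can
# avoid the hang, but HDF5 itself is not thread safe, which matches the
# "NetCDF: HDF error" reported afterwards.
ds_unlocked = xr.open_mfdataset("input_*.nc", combine="by_coords", lock=False)

ds_unlocked.mean("time").to_netcdf("output.nc")
```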
688101357 | https://github.com/pydata/xarray/issues/4406#issuecomment-688101357 | https://api.github.com/repos/pydata/xarray/issues/4406 | MDEyOklzc3VlQ29tbWVudDY4ODEwMTM1Nw== | hansukyang 11863789 | 2020-09-07T07:28:37Z | 2020-09-07T07:46:39Z | NONE | I seem to also have a similar issue, running in a Docker/Linux environment. It doesn't happen always, maybe once out of 4~5 times. Wondering if this is related to the NetCDF/HDF5 file locking issue (https://support.nesi.org.nz/hc/en-gb/articles/360000902955-NetCDF-HDF5-file-locking). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Threading Lock issue with to_netcdf and Dask arrays 694112301 |
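The linked NeSI article discusses HDF5's file locking behaviour; the usual way to disable it is the HDF5_USE_FILE_LOCKING environment variable, set before any HDF5-backed library is imported. Whether this resolves the hang reported in this issue is not established here; the snippet below is only a sketch, with a placeholder path.

```python
# Hedged sketch: disabling HDF5 file locking as described in the linked NeSI
# article. The variable must be set before any HDF5-backed library (h5py,
# netCDF4, and therefore xarray's netcdf engines) is imported.
import os

os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"

import xarray as xr  # imported after setting the variable on purpose

ds = xr.open_dataset("shared_volume/data.nc")  # placeholder path
print(ds)
```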
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);