
issue_comments


3 rows where author_association = "NONE" and issue = 567678992 sorted by updated_at descending




id: 776160305
html_url: https://github.com/pydata/xarray/issues/3781#issuecomment-776160305
issue_url: https://api.github.com/repos/pydata/xarray/issues/3781
node_id: MDEyOklzc3VlQ29tbWVudDc3NjE2MDMwNQ==
user: tsupinie (885575)
created_at: 2021-02-09T18:51:13Z
updated_at: 2021-02-09T18:51:13Z
author_association: NONE
body:

@lvankampenhout, I ran into your problem. The OP's issue seems to actually be in to_netcdf(), but I think yours (ours) is in Dask's lazy loading and therefore unrelated.

In short, ds will have some Dask arrays whose contents don't actually get loaded until you call to_netcdf(). By default, Dask loads in parallel, and the default Dask parallel scheduler chokes when you do your own parallelism on top. In my case, I was able to get around it by doing

```python
ds.load(scheduler='sync')
```

at some point. If it's outside do_work(), I think you can skip the scheduler='sync' part, but inside do_work(), it's required. This bypasses the parallelism in Dask, which is probably what you want if you're doing your own parallelism.
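
For concreteness, here is a minimal sketch of the workaround (the file names are placeholders, not from the thread; Dataset.load() passes its keyword arguments on to dask.compute, so scheduler='sync' selects Dask's single-threaded scheduler):

```python
import xarray as xr

# Placeholder input file; chunks={} opens the variables as lazy dask arrays.
ds = xr.open_dataset("air.sig995.1960.nc", chunks={})

# Materialize the lazy arrays in the calling thread only: scheduler='sync'
# is forwarded to dask.compute and bypasses the default parallel scheduler.
ds.load(scheduler='sync')

# The data is now in memory, so to_netcdf() no longer invokes Dask.
ds.to_netcdf("copy.nc")
```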

reactions:
{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: to_netcdf() doesn't work with multiprocessing scheduler (567678992)

id: 713165400
html_url: https://github.com/pydata/xarray/issues/3781#issuecomment-713165400
issue_url: https://api.github.com/repos/pydata/xarray/issues/3781
node_id: MDEyOklzc3VlQ29tbWVudDcxMzE2NTQwMA==
user: Chrismarsh (630436)
created_at: 2020-10-20T22:01:24Z
updated_at: 2020-10-20T22:01:24Z
author_association: NONE
body:

I am also hitting the problem described by @bcbnz.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: to_netcdf() doesn't work with multiprocessing scheduler (567678992)

id: 702348129
html_url: https://github.com/pydata/xarray/issues/3781#issuecomment-702348129
issue_url: https://api.github.com/repos/pydata/xarray/issues/3781
node_id: MDEyOklzc3VlQ29tbWVudDcwMjM0ODEyOQ==
user: lvankampenhout (7933853)
created_at: 2020-10-01T19:24:48Z
updated_at: 2020-10-01T20:00:27Z
author_association: NONE
body:

I think I ran into a similar problem when combining dask-chunked Datasets (originating from open_mfdataset) with Python's native multiprocessing package. I get no error message; the headers of the files are created, but then the script hangs indefinitely. The use case is combining and resampling variables into ~1000 different NetCDF files, work that I want to distribute over different processes using multiprocessing.

MCVE Code Sample

```python
import xarray as xr
from multiprocessing import Pool
import os

if (False):
    """ Load data without using dask """
    ds = xr.open_dataset("http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/surface/air.sig995.1960.nc")
else:
    """ Load data using dask """
    ds = xr.open_dataset("http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/surface/air.sig995.1960.nc", chunks={})

print(ds.nbytes / 1e6, 'MB')

print('chunks', ds.air.chunks)  # chunks is empty without dask

outdir = '/glade/scratch/lvank'  # change this to some temporary directory on your system

def do_work(n):
    print(n)
    ds.to_netcdf(os.path.join(outdir, f'{n}.nc'))

tasks = range(10)

with Pool(processes=2) as pool:
    pool.map(do_work, tasks)

print('done')
```

Expected Output

The NetCDF copies in outdir, named 0.nc to 9.nc, should be created in both cases (with and without Dask).

Problem Description

In the case with Dask, i.e. when the if-statement evaluates to False, the files are not created and the program hangs.
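
For illustration, a hypothetical patched do_work applying the workaround from the newer comment above (this page is sorted newest-first); ds and outdir refer to the globals defined in the MCVE:

```python
import os

def do_work(n):
    # Workaround sketch: load the dask-backed dataset synchronously in the
    # forked worker, so Dask's default parallel scheduler (apparently the
    # source of the hang) is never entered.
    ds.load(scheduler='sync')
    ds.to_netcdf(os.path.join(outdir, f'{n}.nc'))
```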

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 (default, Sep 4 2020, 07:30:14) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1127.13.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.7.3
xarray: 0.16.1
pandas: 1.1.1
numpy: 1.19.1
scipy: 1.5.2
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.27.0
distributed: 2.28.0
matplotlib: 3.3.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20200925
pip: 20.2.2
conda: None
pytest: None
IPython: 7.18.1
sphinx: None
```
reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  to_netcdf() doesn't work with multiprocessing scheduler 567678992

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
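
As a usage sketch, the filter at the top of this page ("3 rows where author_association = "NONE" and issue = 567678992 sorted by updated_at descending") corresponds to a query along these lines; the database file name is an assumption:

```python
import sqlite3

# Hypothetical local copy of the database behind this Datasette instance.
conn = sqlite3.connect("github.db")

# Reproduce the page's filter and sort order against the schema above.
rows = conn.execute(
    """
    SELECT id, user, created_at, updated_at
    FROM issue_comments
    WHERE author_association = 'NONE' AND issue = 567678992
    ORDER BY updated_at DESC
    """
).fetchall()

for comment_id, user_id, created_at, updated_at in rows:
    print(comment_id, user_id, created_at, updated_at)
```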