issue_comments
1 row where issue = 567678992 and user = 7933853 sorted by updated_at descending
issue 1
- to_netcdf() doesn't work with multiprocessing scheduler
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
702348129 | https://github.com/pydata/xarray/issues/3781#issuecomment-702348129 | https://api.github.com/repos/pydata/xarray/issues/3781 | MDEyOklzc3VlQ29tbWVudDcwMjM0ODEyOQ== | lvankampenhout 7933853 | 2020-10-01T19:24:48Z | 2020-10-01T20:00:27Z | NONE | I think I ran into a similar problem when combining dask-chunked DataSets (originating from
MCVE Code Sample
```python
import xarray as xr
from multiprocessing import Pool
import os

if (False):
    """ Load data without using dask """
    ds = xr.open_dataset("http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/surface/air.sig995.1960.nc")
else:
    """ Load data using dask """
    ds = xr.open_dataset("http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/surface/air.sig995.1960.nc", chunks={})

print(ds.nbytes / 1e6, 'MB')
print('chunks', ds.air.chunks)  # chunks is empty without dask

outdir = '/glade/scratch/lvank'  # change this to some temporary directory on your system

def do_work(n):
    print(n)
    ds.to_netcdf(os.path.join(outdir, f'{n}.nc'))

tasks = range(10)
with Pool(processes=2) as pool:
    pool.map(do_work, tasks)

print('done')
```
Expected Output
The NetCDF copies in
Problem Description
In the case with Dask, when the if-statement evaluates to
Output of xr.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 (default, Sep 4 2020, 07:30:14)
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.10.0-1127.13.1.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.7.3
xarray: 0.16.1
pandas: 1.1.1
numpy: 1.19.1
scipy: 1.5.2
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.2.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.27.0
distributed: 2.28.0
matplotlib: 3.3.1
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 49.6.0.post20200925
pip: 20.2.2
conda: None
pytest: None
IPython: 7.18.1
sphinx: None
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
to_netcdf() doesn't work with multiprocessing scheduler 567678992 |
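The MCVE in the comment hard-codes `outdir` to a site-specific scratch path and asks readers to change it. A minimal sketch of a portable alternative follows, assuming only the Python standard library. The helper `make_output_paths` and the directory prefix are hypothetical names that do not appear in the original comment. The sketch only builds the per-task output paths and does not attempt to reproduce or work around the reported `to_netcdf()` behaviour.

```python
import os
import tempfile


def make_output_paths(outdir, n_tasks):
    """Return one output path per task, named like the MCVE's f'{n}.nc'."""
    return [os.path.join(outdir, f"{n}.nc") for n in range(n_tasks)]


# Hypothetical stand-in for the hard-coded scratch directory in the MCVE:
# a fresh temporary directory that can be created on any system.
outdir = tempfile.mkdtemp(prefix="to_netcdf_mcve_")

# Same task count as the MCVE (tasks = range(10)).
paths = make_output_paths(outdir, 10)
print(paths[0])
```

With this in place, `do_work(n)` in the MCVE could write to `paths[n]` instead of joining against a fixed directory.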
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);