issue_comments
7 rows where issue = 372848074 and user = 1872600 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
510144707 | https://github.com/pydata/xarray/issues/2501#issuecomment-510144707 | https://api.github.com/repos/pydata/xarray/issues/2501 | MDEyOklzc3VlQ29tbWVudDUxMDE0NDcwNw== | rsignell-usgs 1872600 | 2019-07-10T16:59:12Z | 2019-07-11T11:47:02Z | NONE | @TomAugspurger, I sat down here at SciPy with @rabernat and he instantly realized that we needed to drop the [...]. So if I use this code, the [...]. I'm now running into memory issues when I write the zarr data -- but I should raise that as a new issue, right? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset usage and limitations. 372848074 | |
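The zarr-write memory problem mentioned in the comment above is left unresolved in this thread. As an illustration only (not the fix the thread arrived at), one way to cap memory during the write is to push the data out a batch of time steps at a time with `append_dim`; the dataset construction, store path, and batch size below are all assumptions standing in for the real data.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the chunked dataset built by the script later in
# this thread; variable names and sizes are assumptions.
times = pd.date_range('2009-01-01', periods=8760, freq='1h')
ds1 = xr.Dataset(
    {'streamflow': (('time', 'feature_id'), np.zeros((len(times), 100), dtype='float32'))},
    coords={'time': times},
).chunk({'time': 168})

store = 'zarr/2009_demo'   # assumed output path
batch = 168                # one week of hourly steps per write

# Write the first batch, then append the rest so only one batch of dask
# tasks has to be materialized in memory at a time.
ds1.isel(time=slice(0, batch)).to_zarr(store, mode='w', consolidated=True)
for start in range(batch, ds1.sizes['time'], batch):
    ds1.isel(time=slice(start, start + batch)).to_zarr(
        store, append_dim='time', consolidated=True)
```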
509379294 | https://github.com/pydata/xarray/issues/2501#issuecomment-509379294 | https://api.github.com/repos/pydata/xarray/issues/2501 | MDEyOklzc3VlQ29tbWVudDUwOTM3OTI5NA== | rsignell-usgs 1872600 | 2019-07-08T20:28:48Z | 2019-07-08T20:29:20Z | NONE | @TomAugspurger, I thought @rabernat's suggestion of implementing [...] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset usage and limitations. 372848074 | |
509341467 | https://github.com/pydata/xarray/issues/2501#issuecomment-509341467 | https://api.github.com/repos/pydata/xarray/issues/2501 | MDEyOklzc3VlQ29tbWVudDUwOTM0MTQ2Nw== | rsignell-usgs 1872600 | 2019-07-08T18:34:02Z | 2019-07-08T18:34:02Z | NONE | @rabernat, to answer your question, if I open just two files: [...] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset usage and limitations. 372848074 | |
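For the "just two files" check mentioned above, here is a minimal sketch of what such a comparison could look like; the file paths are modeled on the script later in the thread and should be treated as assumptions.

```python
import xarray as xr

# Two consecutive hourly files (paths assumed, following the pattern used
# later in this thread).
files = ['./nc/2009/200901010000.CHRTOUT_DOMAIN1.comp',
         './nc/2009/200901010100.CHRTOUT_DOMAIN1.comp']

# Inspect each file's coordinates on its own: non-dimension coordinates that
# have to be read and compared from every file are what the
# reset_coords(drop=True) preprocess step later in the thread avoids.
for f in files:
    with xr.open_dataset(f) as ds:
        print(f, dict(ds.sizes), list(ds.coords))

# Combining just the two files is also a quick scale test before
# attempting all 8760.
print(xr.open_mfdataset(files))
```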
509340139 | https://github.com/pydata/xarray/issues/2501#issuecomment-509340139 | https://api.github.com/repos/pydata/xarray/issues/2501 | MDEyOklzc3VlQ29tbWVudDUwOTM0MDEzOQ== | rsignell-usgs 1872600 | 2019-07-08T18:30:18Z | 2019-07-08T18:30:18Z | NONE | @TomAugspurger, okay, I just ran the above code again and here's what happens: the [...]. Then, despite the tasks showing on the dashboard being completed, the [...]; then after about 10 more minutes, I get these warnings: [...] and then the errors: [...] Does this help? I'd be happy to screenshare if that would be useful. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset usage and limitations. 372848074 | |
509282831 | https://github.com/pydata/xarray/issues/2501#issuecomment-509282831 | https://api.github.com/repos/pydata/xarray/issues/2501 | MDEyOklzc3VlQ29tbWVudDUwOTI4MjgzMQ== | rsignell-usgs 1872600 | 2019-07-08T15:51:23Z | 2019-07-08T15:51:23Z | NONE | @TomAugspurger, I'm back from vacation now and ready to attack this again. Any updates on your end? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset usage and limitations. 372848074 | |
506475819 | https://github.com/pydata/xarray/issues/2501#issuecomment-506475819 | https://api.github.com/repos/pydata/xarray/issues/2501 | MDEyOklzc3VlQ29tbWVudDUwNjQ3NTgxOQ== | rsignell-usgs 1872600 | 2019-06-27T19:16:28Z | 2019-06-27T19:24:31Z | NONE | I tried this, and either I didn't apply it right, or it didn't work. The memory use kept growing until the process died. My code to process the 8760 netcdf files with [...] was:

```python
import xarray as xr
import pandas as pd
from dask.distributed import Client, progress, LocalCluster

# Start a local dask cluster with default settings.
cluster = LocalCluster()
client = Client(cluster)

# One file per hour for all of 2009 (8760 files).
dates = pd.date_range(start='2009-01-01 00:00', end='2009-12-31 23:00', freq='1h')
files = ['./nc/{}/{}.CHRTOUT_DOMAIN1.comp'.format(date.strftime('%Y'), date.strftime('%Y%m%d%H%M'))
         for date in dates]

# Drop non-dimension coordinates so open_mfdataset doesn't have to
# read and compare them from every file.
def drop_coords(ds):
    return ds.reset_coords(drop=True)

ds = xr.open_mfdataset(files, preprocess=drop_coords, autoclose=True, parallel=True)

# Rechunk: 168 time steps (one week of hourly data) per chunk along time.
ds1 = ds.chunk(chunks={'time': 168, 'feature_id': 209929})

# Disable Blosc threading, as recommended when writing zarr from multiple processes.
import numcodecs
numcodecs.blosc.use_threads = False

ds1.to_zarr('zarr/2009', mode='w', consolidated=True)
```

I transferred the netcdf files from AWS S3 to my local disk to run this, using this command: [...] |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset usage and limitations. 372848074 | |
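One thing the script above leaves at its defaults is the size of the `LocalCluster`. A hedged sketch of bounding the workers explicitly, so that runaway memory shows up as spilling or worker restarts instead of the whole process dying; the worker count, thread count, and memory limit are illustrative assumptions, not values from the thread.

```python
from dask.distributed import Client, LocalCluster

# Explicit worker sizing instead of LocalCluster() defaults.
cluster = LocalCluster(n_workers=8, threads_per_worker=5, memory_limit='8GB')
client = Client(cluster)
print(client.dashboard_link)  # watch per-worker memory while the write runs
```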
497381301 | https://github.com/pydata/xarray/issues/2501#issuecomment-497381301 | https://api.github.com/repos/pydata/xarray/issues/2501 | MDEyOklzc3VlQ29tbWVudDQ5NzM4MTMwMQ== | rsignell-usgs 1872600 | 2019-05-30T15:55:56Z | 2019-05-30T15:58:48Z | NONE | I'm hitting some memory issues with using [...]. Specifically, I'm trying to open 8760 NetCDF files with an 8 node, 40 cpu LocalCluster. When I issue: [...] Then 4 more minutes go by before I get a bunch of errors like: [...] Any suggestions? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
open_mfdataset usage and limitations. 372848074 |
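The exact `open_mfdataset` call behind this report is not preserved in this row; the fuller script in comment 506475819 above shows what it presumably looked like. A scaled-down timing check along these lines (file list taken from that script, subset size an assumption) can help separate the cost of opening the files from the cost of the later zarr write.

```python
import time
import pandas as pd
import xarray as xr

# Same hourly file list as the full script earlier in this table.
dates = pd.date_range(start='2009-01-01 00:00', end='2009-12-31 23:00', freq='1h')
files = ['./nc/{}/{}.CHRTOUT_DOMAIN1.comp'.format(d.strftime('%Y'), d.strftime('%Y%m%d%H%M'))
         for d in dates]

# Time a one-week subset before attempting all 8760 files.
subset = files[:168]
t0 = time.time()
ds = xr.open_mfdataset(subset, parallel=True)
print('opened {} files in {:.1f}s'.format(len(subset), time.time() - t0))
```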
```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
```