issue_comments
5 rows where issue = 334633212 sorted by updated_at descending
issue: to_netcdf(compute=False) can be slow · 5 comments
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue
---|---|---|---|---|---|---|---|---|---|---|---
453866106 | https://github.com/pydata/xarray/issues/2242#issuecomment-453866106 | https://api.github.com/repos/pydata/xarray/issues/2242 | MDEyOklzc3VlQ29tbWVudDQ1Mzg2NjEwNg== | jhamman 2443309 | 2019-01-13T21:13:28Z | 2019-01-13T21:13:28Z | MEMBER | I just reran the example above and things seem to be resolved now. The write step for the two datasets is basically identical. | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | to_netcdf(compute=False) can be slow 334633212
399503156 | https://github.com/pydata/xarray/issues/2242#issuecomment-399503156 | https://api.github.com/repos/pydata/xarray/issues/2242 | MDEyOklzc3VlQ29tbWVudDM5OTUwMzE1Ng== | shoyer 1217238 | 2018-06-22T16:33:11Z | 2018-06-22T16:33:11Z | MEMBER | This autoclose business is really hard to reason about in its current version, as part of the backend class. I'm hoping that refactoring it out into a separate object that we can use with composition instead of inheritance will help (e.g., alongside PickleByReconstructionWrapper). | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | to_netcdf(compute=False) can be slow 334633212
399495668 | https://github.com/pydata/xarray/issues/2242#issuecomment-399495668 | https://api.github.com/repos/pydata/xarray/issues/2242 | MDEyOklzc3VlQ29tbWVudDM5OTQ5NTY2OA== | neishm 1554921 | 2018-06-22T16:10:45Z | 2018-06-22T16:10:45Z | CONTRIBUTOR | True, I would expect some performance hit due to writing chunk-by-chunk; however, that same performance hit is present in both of the test cases. In addition to the snippet @shoyer mentioned, I found that xarray also intentionally uses However, So if the file is already open before getting to If I remove the | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | to_netcdf(compute=False) can be slow 334633212
399320127 | https://github.com/pydata/xarray/issues/2242#issuecomment-399320127 | https://api.github.com/repos/pydata/xarray/issues/2242 | MDEyOklzc3VlQ29tbWVudDM5OTMyMDEyNw== | jhamman 2443309 | 2018-06-22T04:51:54Z | 2018-06-22T04:51:54Z | MEMBER | I think, at least to some extent, the performance hit is to be expected. I don't think we should be opening the file more than once when using the serial or threaded schedulers, so that may be a place where you can find some improvement. There will always be a performance hit when writing dask arrays to netcdf files chunk-by-chunk. For one, there is a threading lock that limits parallel throughput. More importantly, the chunked writes are always going to be slower than larger reads coming directly from numpy arrays. In your example above, the snippet @shoyer mentions should evaluate to | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | to_netcdf(compute=False) can be slow 334633212
399275847 | https://github.com/pydata/xarray/issues/2242#issuecomment-399275847 | https://api.github.com/repos/pydata/xarray/issues/2242 | MDEyOklzc3VlQ29tbWVudDM5OTI3NTg0Nw== | shoyer 1217238 | 2018-06-21T23:37:10Z | 2018-06-21T23:37:10Z | MEMBER | I suspect this can be improved. Looking at the code, it appears that we only intentionally use | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } | | to_netcdf(compute=False) can be slow 334633212
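The comments above revolve around xarray's deferred write path: `to_netcdf(compute=False)` returns a dask delayed object, and the actual chunk-by-chunk write only happens when that object is computed. A minimal sketch of the pattern under discussion, assuming dask and a netCDF backend such as netCDF4 are installed (the file names are placeholders):

```python
import numpy as np
import xarray as xr

# A small dask-backed dataset, chunked so that the write happens chunk-by-chunk.
ds = xr.Dataset(
    {"temperature": (("time", "x"), np.random.rand(100, 200))}
).chunk({"time": 10})

# Eager path: data is computed and written immediately.
ds.to_netcdf("eager.nc")

# Deferred path: returns a dask.delayed.Delayed object; nothing is written yet.
delayed_write = ds.to_netcdf("deferred.nc", compute=False)

# Triggering the computation performs the deferred, chunk-by-chunk write that
# the comments above benchmark against the eager path.
delayed_write.compute()
```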
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
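For reference, the listing at the top of the page ("5 rows where issue = 334633212 sorted by updated_at descending") corresponds to a simple query against this schema. A sketch using Python's sqlite3 module, where `github.db` is a hypothetical SQLite file built from the table definition above:

```python
import sqlite3

# "github.db" is a placeholder; point this at whatever SQLite file holds the schema.
conn = sqlite3.connect("github.db")

# Comments on issue 334633212, newest update first -- the listing shown above.
rows = conn.execute(
    """
    SELECT id, user, created_at, author_association, body
    FROM issue_comments
    WHERE issue = 334633212
    ORDER BY updated_at DESC
    """
).fetchall()

for comment_id, user_id, created_at, association, body in rows:
    print(comment_id, user_id, created_at, association, body[:60])
```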