issue_comments: 399320127

html_url: https://github.com/pydata/xarray/issues/2242#issuecomment-399320127
issue_url: https://api.github.com/repos/pydata/xarray/issues/2242
id: 399320127
node_id: MDEyOklzc3VlQ29tbWVudDM5OTMyMDEyNw==
user: 2443309
created_at: 2018-06-22T04:51:54Z
updated_at: 2018-06-22T04:51:54Z
author_association: MEMBER
body, reactions, performed_via_github_app (empty), issue: values follow in that order

I think the performance hit is, at least to some extent, to be expected. I don't think we should be opening the file more than once when using the serial or threaded schedulers, so that may be a place where you can find some improvement. There will always be a performance hit when writing dask arrays to netCDF files chunk-by-chunk. For one, there is a threading lock that limits parallel throughput. More importantly, the chunked writes are always going to be slower than larger writes coming directly from numpy arrays.
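To make those two costs concrete, here is a minimal stdlib-only sketch. It is not xarray's or netCDF4's actual write path: `LockedStore`, `write_chunked`, and `write_whole` are hypothetical names invented for illustration, and a plain Python list stands in for the file. It only shows that a single lock forces the per-chunk writes to run one at a time, and that the chunked path issues many small write calls where the in-memory path issues one.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class LockedStore:
    """Hypothetical stand-in for a file target guarded by a single lock."""

    def __init__(self, size):
        self.data = [0.0] * size
        self.lock = threading.Lock()
        self.write_calls = 0

    def write(self, start, values):
        # Only one thread can be inside this block at a time, so the
        # writes are serialized no matter how many workers submit them.
        with self.lock:
            self.write_calls += 1
            self.data[start:start + len(values)] = values


def write_chunked(store, values, chunk_size, workers=4):
    """Write `values` chunk by chunk from a thread pool (assumed pattern)."""
    chunks = [
        (start, values[start:start + chunk_size])
        for start in range(0, len(values), chunk_size)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces all submitted writes to complete.
        list(pool.map(lambda chunk: store.write(*chunk), chunks))


def write_whole(store, values):
    """Write everything in one call, as with an in-memory array."""
    store.write(0, values)


values = [float(i) for i in range(1000)]

chunked = LockedStore(len(values))
write_chunked(chunked, values, chunk_size=100)

whole = LockedStore(len(values))
write_whole(whole, values)

print(chunked.write_calls, whole.write_calls)
```

Both stores end up with identical contents. The difference is in how the work is issued: the chunked path pays the lock and per-call overhead once per chunk, so adding threads does not add write throughput.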

In your example above, the snippet @shoyer mentions should evaluate to autoclose=False. However, the profiling you mention seems to indicate the opposite. Perhaps we should start by digging deeper on that point.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  334633212