issue_comments: 417060359

html_url: https://github.com/pydata/xarray/issues/2389#issuecomment-417060359
issue_url: https://api.github.com/repos/pydata/xarray/issues/2389
id: 417060359
node_id: MDEyOklzc3VlQ29tbWVudDQxNzA2MDM1OQ==
user: 1882397
created_at: 2018-08-29T18:37:57Z
updated_at: 2018-08-29T18:40:16Z
author_association: NONE
body:

pangeo-data/pangeo#266 sounds somewhat similar. If you increase the size of the arrays involved here, you also end up with warnings about the size of the graph: https://stackoverflow.com/questions/52039697/how-to-avoid-large-objects-in-task-graph

I haven't tried with #2261 applied, but I can try that tomorrow.

If we interpret the time spent in _thread.lock as the time the main process spends waiting for the workers, then waiting doesn't seem to be the main problem here. We spend 60s in pickle (almost all of the time) and only 7s waiting for locks.

I tried looking at the contents of the graph a bit (write.dask.dicts) and compared them to the graph of the dataset itself (ds.vals.data.dask.dicts). I can't pickle those for some reason (that would be a great way to see where it is spending all that time), but it looks like entries of this form are the main difference:

    (<function dask.array.core.store_chunk(x, out, index, lock, return_stored)>,
     ('stack-6ab3acdaa825862b99d6dbe1c75f0392', 478),
     <xarray.backends.netCDF4_.NetCDF4ArrayWrapper at 0x32fc365c0>,
     (slice(478, 479, None),),
     CombinedLock([<SerializableLock: 0ccceef3-44cd-41ed-947c-f7041ae280c8>,
                   <distributed.lock.Lock object at 0x32fb058d0>]),
     False),

I don't really know how they work, but maybe pickling those NetCDF4ArrayWrapper objects is expensive (i.e. they contain a reference to something they shouldn't)?
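One way to narrow this down might be to pickle each graph entry separately and record its size and time, so that a single failing or oversized entry stands out instead of the whole graph failing at once. The following is only a sketch using the standard library: `profile_pickle` and `toy_graph` are hypothetical names, and the toy graph merely stands in for the real task dict, which I have not tested this against.

```python
import pickle
import threading
import time


def profile_pickle(graph):
    """Return (key, n_bytes, seconds, error) for every task in a dict-like graph."""
    rows = []
    for key, task in graph.items():
        start = time.perf_counter()
        try:
            n_bytes = len(pickle.dumps(task, protocol=pickle.HIGHEST_PROTOCOL))
            error = None
        except Exception as exc:  # report unpicklable tasks instead of aborting
            n_bytes = None
            error = repr(exc)
        rows.append((key, n_bytes, time.perf_counter() - start, error))
    # Largest payloads first; entries that failed to pickle go last.
    rows.sort(key=lambda r: -1 if r[1] is None else r[1], reverse=True)
    return rows


# Hypothetical stand-in for a task graph: key -> (function, *arguments).
toy_graph = {
    ("store", 0): (len, bytes(1_000_000), slice(0, 1)),   # drags a large object along
    ("store", 1): (len, "small", slice(1, 2)),            # cheap to pickle
    ("store", 2): (len, threading.Lock(), slice(2, 3)),   # plain locks cannot be pickled
}

report = profile_pickle(toy_graph)
for key, n_bytes, seconds, error in report:
    print(key, n_bytes, f"{seconds:.4f}s", error)
```

If one kind of entry dominates the byte count, that would support the idea that the wrapper objects carry a reference to something large. Note that the standard `pickle` module may behave differently from whatever serializer the scheduler actually uses, so the numbers are only indicative.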

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 355264812