
issue_comments: 417076301


html_url: https://github.com/pydata/xarray/issues/2389#issuecomment-417076301
issue_url: https://api.github.com/repos/pydata/xarray/issues/2389
id: 417076301
node_id: MDEyOklzc3VlQ29tbWVudDQxNzA3NjMwMQ==
user: 1217238
created_at: 2018-08-29T19:29:56Z
updated_at: 2018-08-29T19:29:56Z
author_association: MEMBER

body:

If I understand the heuristics used by dask's schedulers correctly, a data dependency might actually be a good idea here, because it would encourage colocating write tasks on the same machines. We should probably give this a try.

On Wed, Aug 29, 2018 at 12:15 PM, Matthew Rocklin <notifications@github.com> wrote (in https://github.com/pydata/xarray/issues/2389#issuecomment-417072024):

> > It would be nice if dask had a way to consolidate the serialization of these objects, rather than separately serializing them in each task.
>
> You can make it a separate task (often done by wrapping with dask.delayed) and then use that key within other objects. This does create a data dependency, though, which can make the graph somewhat more complex.
>
> In normal use of Pickle these things are cached and reused. Unfortunately we can't do this because we're sending the tasks to different machines, each of which will need to deserialize independently.
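
A minimal sketch of the pattern described above, assuming `dask.delayed` is used to turn the shared object into its own task so it is serialized once and referenced by key; the `store` dict and `write_chunk` function are hypothetical stand-ins, not xarray's actual writer:

```python
import dask
import dask.array as da

# Hypothetical stand-in for an expensive-to-serialize target
# (e.g. a datastore shared by every write task).
store = {"path": "out.zarr"}

# Wrapping the object with dask.delayed turns it into a single
# task in the graph: it is serialized once, and downstream tasks
# reference it by key instead of embedding their own pickled copy.
shared_store = dask.delayed(store)

@dask.delayed
def write_chunk(store, chunk_id, data):
    # Hypothetical write; a real version would write `data` into
    # the region of `store` identified by `chunk_id`.
    return chunk_id

arr = da.ones((4, 4), chunks=(2, 2))
writes = [
    write_chunk(shared_store, i, block)
    for i, block in enumerate(arr.to_delayed().ravel())
]

# Each write_chunk task now has a data dependency on shared_store,
# which is the graph-complexity trade-off mentioned above.
dask.compute(*writes)
```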
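And a small illustration of the Pickle point above: within a single pickle stream, repeated references to one object are memoized and stored once, but that caching is lost when each task is pickled into its own stream for a different machine. The object and sizes here are made up for illustration:

```python
import pickle

big = b"x" * 10_000                     # shared object, ~10 kB
tasks = [(big, i) for i in range(100)]  # 100 tasks referencing it

# One stream: pickle's memo table stores `big` once; the other
# 99 references become short back-references.
one_stream = len(pickle.dumps(tasks))

# Separate streams (one per task, as when tasks are shipped to
# different machines): every stream re-serializes `big` in full.
many_streams = sum(len(pickle.dumps(t)) for t in tasks)

print(one_stream)    # roughly one copy of `big` plus overhead
print(many_streams)  # roughly 100 copies of `big`
```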


reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 355264812