issues: 1977661256
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1977661256 | I_kwDOAMm_X8514LdI | 8414 | Is there any way of having `.map_blocks` be even more opaque to dask? | 5635139 | closed | 0 | 23 | 2023-11-05T06:56:43Z | 2023-12-12T18:14:57Z | 2023-12-12T18:14:57Z | MEMBER | Is your feature request related to a problem?
Currently I have a workload which does something a bit like:
(the actual calc is a bit more complicated! And while I don't have an MVCE of the full calc, I pasted a task graph below) Dask, while very impressive in many ways, handles this extremely badly, because it attempts to load the whole of
Describe the solution you'd like
I was hoping to make the internals of this task opaque to dask, so it became a much dumber task runner: just map over the blocks, running the function and writing the result, block by block. I thought I had some success with Is there any way to make the write more opaque too?
Describe alternatives you've considered
I've built a really hacky homegrown thing which does this on a custom scheduler: it just runs the functions and writes with
Additional context
(It's also possible I'm making some basic error, and I do remember it working much better last week, so please feel free to direct me / ask me for more examples, if this doesn't ring true) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8414/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
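
The "much dumber task runner" described in the issue body (map over the blocks, run the function, write the result, block by block) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's homegrown scheduler and not xarray or dask API: `block_slices`, `run_blockwise`, and the in-memory lists standing in for the source and target stores are all hypothetical names invented here.

```python
def block_slices(n_items, block_size):
    """Yield consecutive slices that together cover range(n_items)."""
    for start in range(0, n_items, block_size):
        yield slice(start, min(start + block_size, n_items))


def run_blockwise(source, target, func, block_size):
    """Apply func to each block of source and write the result into target.

    Hypothetical helper: each block is read, computed, and written before the
    next one is touched, so only one block is held at a time.
    """
    written = []
    for sl in block_slices(len(source), block_size):
        block = source[sl]        # read one block
        target[sl] = func(block)  # compute and write it straight away
        written.append(sl)
    return written


# Plain lists stand in for the on-disk source and target (an assumption).
source = list(range(10))
target = [None] * len(source)
done = run_blockwise(source, target, lambda block: [x * 2 for x in block], block_size=4)
```

Because the runner knows nothing about what happens inside `func`, it cannot reorder or fuse work across blocks, which is the opacity the issue asks for, at the cost of any cross-block optimization.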