issues: 1976752481
| id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1976752481 | PR_kwDOAMm_X85ekPdj | 8412 | Minimize duplication in `map_blocks` task graph | 2448579 | closed | 0 | | | 7 | 2023-11-03T18:30:02Z | 2024-01-03T04:10:17Z | 2024-01-03T04:10:15Z | MEMBER | | 0 | pydata/xarray/pulls/8412 | Builds on #8560
cc @max-sixty

```
print(len(cloudpickle.dumps(da.chunk(lat=1, lon=1).map_blocks(lambda x: x))))
# 779354739 -> 47699827

print(len(cloudpickle.dumps(da.chunk(lat=1, lon=1).drop_vars(da.indexes).map_blocks(lambda x: x))))
# 15981508
```

This is a quick attempt. I think we can generalize this to minimize duplication. The downside is that the graphs are not totally embarrassingly parallel any more.
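The trade-off described above can be illustrated with a small, standard-library-only sketch. It is not the code in this PR: the graph layout, the `"index"` key, and the per-task serialization model are assumptions made for illustration. It compares a dask-style dict graph in which every task embeds a shared object with one in which the object is stored once and referenced by key.

```python
import pickle

# Hypothetical stand-in for a large index shared by every block.
shared_index = list(range(10_000))
n_blocks = 50

# Variant A: each task embeds the index directly, so tasks are fully
# independent (embarrassingly parallel) but carry duplicate payloads.
embedded_graph = {("block", i): (len, shared_index) for i in range(n_blocks)}

# Variant B: the index lives under its own key and each task refers to it
# by that key. The payload is stored once, but every block task now
# depends on the "index" task.
shared_graph = {"index": shared_index}
shared_graph.update({("block", i): (len, "index") for i in range(n_blocks)})


def serialized_size(graph):
    """Sum the pickled size of each task, serializing tasks one at a time.

    Assumption: tasks are shipped individually, so an object embedded in
    many tasks is serialized once per task.
    """
    return sum(len(pickle.dumps(task)) for task in graph.values())


embedded_size = serialized_size(embedded_graph)
shared_size = serialized_size(shared_graph)
print(f"embedded: {embedded_size} bytes, shared: {shared_size} bytes")
```

In this toy model the shared-key variant is much smaller because the index is serialized once, at the cost of an extra dependency for every block.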
This PR:
vs main:
|
{
"url": "https://api.github.com/repos/pydata/xarray/issues/8412/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} | | | 13221727 | pull |