issues: 1223270563

id: 1223270563
node_id: PR_kwDOAMm_X843L_J2
number: 6566
title: New inline_array kwarg for open_dataset
user: 35968931
state: closed
locked: 0
assignee:
milestone:
comments: 11
created_at: 2022-05-02T19:39:07Z
updated_at: 2022-05-11T22:12:24Z
closed_at: 2022-05-11T20:26:43Z
author_association: MEMBER
active_lock_reason:
draft: 0
pull_request: pydata/xarray/pulls/6566
body:

Exposes the inline_array kwarg of dask.array.from_array in xr.open_dataset, and ds/da/variable.chunk.

Setting this to True inlines the array into the opening/chunking task, which avoids an extra array object at the start of the task graph. That's useful because a single common task connecting otherwise independent parts of the graph can confuse the graph optimizer.
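To make the difference concrete, here is a minimal sketch that calls dask.array.from_array directly, outside xarray. The LazyArray class is a hypothetical stand-in for a lazily loaded backend array; it is needed because plain NumPy arrays take a separate code path in from_array. The sketch only counts graph keys and is not meant as a benchmark.

```python
import numpy as np
import dask.array as da


class LazyArray:
    """Hypothetical stand-in for a lazily loaded backend array."""

    def __init__(self, data):
        self._data = data
        self.shape = data.shape
        self.dtype = data.dtype
        self.ndim = data.ndim

    def __getitem__(self, key):
        # A real backend would read from disk here.
        return self._data[key]


source = LazyArray(np.arange(16).reshape(4, 4))

# Same source and chunking; only the inline_array flag differs.
not_inlined = da.from_array(source, chunks=(2, 2), inline_array=False)
inlined = da.from_array(source, chunks=(2, 2), inline_array=True)

# Materialize both graphs as plain dicts and count their keys.
n_not_inlined = len(dict(not_inlined.__dask_graph__()))
n_inlined = len(dict(inlined.__dask_graph__()))

print("keys without inlining:", n_not_inlined)
print("keys with inlining:   ", n_inlined)
```

Without inlining, the graph should contain one extra key holding the source object, which every chunk task depends on. With inlining, each chunk task carries the source object itself, so no shared key remains.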

With open_dataset(..., inline_array=False): (task graph image not available)

With open_dataset(..., inline_array=True): (task graph image not available)

In our case (xGCM) this is important because, once the array is inlined, the optimizer understands that all the remaining parts of the graph are embarrassingly parallel, and realizes that it can fuse all our chunk-wise padding tasks into one padding task per chunk.
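The "embarrassingly parallel" point can be illustrated with a toy model in plain Python. This is only a sketch: the graphs are hand-written dask-style dicts, the key names ("original", "chunk") are made up for illustration, and count_components is a hypothetical helper, not anything dask's optimizer actually runs. It simply counts how many independent groups of tasks each graph contains.

```python
from operator import getitem


def build_graph(n_chunks, inline):
    """Return a dask-style dict graph; key names are illustrative only."""
    data = list(range(2 * n_chunks))
    graph = {}
    if not inline:
        # One shared key that every chunk task refers to.
        graph["original"] = data
    for i in range(n_chunks):
        source = data if inline else "original"
        graph[("chunk", i)] = (getitem, source, slice(2 * i, 2 * i + 2))
    return graph


def count_components(graph):
    """Count connected groups of tasks, treating key references as edges."""
    parent = {key: key for key in graph}

    def find(key):
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    for key, task in graph.items():
        if isinstance(task, tuple):
            for arg in task[1:]:
                # Only strings and tuples can be references to other keys.
                if isinstance(arg, (str, tuple)) and arg in graph:
                    parent[find(key)] = find(arg)
    return len({find(key) for key in graph})


shared = build_graph(4, inline=False)
inlined = build_graph(4, inline=True)

print(count_components(shared))   # 1: every chunk hangs off "original"
print(count_components(inlined))  # 4: each chunk is an independent task
```

In the non-inlined graph the single shared key ties all chunks into one connected component, while the inlined graph splits into one component per chunk, which is the structure that lets per-chunk tasks be fused.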

I think this option could help in any case where someone is opening data from a Zarr store (the reason we had this opener task) or a netCDF file.

The value of the kwarg should be kept optional because in theory inlining is a tradeoff between fewer tasks and more memory use, but I think there might be a case for setting the default to be True?

Questions: 1) How should I test this? 2) Should it default to False or True? 3) inline_array or inline? (inline_array doesn't really make sense for open_dataset, which creates multiple arrays)

  • [x] Closes #1895
  • [x] Tests added
  • [x] User visible changes (including notable bug fixes) are documented in whats-new.rst

@rabernat @jbusecke

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/6566/reactions",
    "total_count": 3,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 2,
    "eyes": 0
}
repo: 13221727
type: pull
