issues: 107424151

id: 107424151
node_id: MDU6SXNzdWUxMDc0MjQxNTE=
number: 585
title: Parallel map/apply powered by dask.array
user: 1217238
state: closed
locked: 0
assignee:
milestone: 741199
comments: 11
created_at: 2015-09-20T23:27:55Z
updated_at: 2017-10-13T15:58:30Z
closed_at: 2017-10-09T23:26:06Z
author_association: MEMBER
active_lock_reason:
draft:
pull_request:

Dask is awesome, but it isn't always easy to use for parallel operations. In many cases, especially when wrapping routines from external libraries, it is most straightforward to express operations in terms of a function that expects and returns xray objects loaded into memory.

Dask array has a map_blocks function/method, but its applicability is limited because dask.array doesn't have axis names for unambiguously identifying dimensions. da.atop can handle many of these cases, but it's not the easiest to use. Fortunately, we have sufficient metadata in xray that we could probably parallelize many atop operations automatically by inferring result dimensions and dtypes from applying the function once. See here for more discussion on the dask side: https://github.com/blaze/dask/issues/702
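
To make the "apply the function once" idea concrete, here is a minimal sketch. It assumes the library is imported under its current name, `xarray`, and the helper `infer_result_metadata` is hypothetical, written only for illustration. It runs the user function on a length-1 slice along every dimension and reads the result's dimension names and dtype from that trial call.

```python
import numpy as np
import xarray as xr


def infer_result_metadata(func, obj):
    """Hypothetical helper: call func once on a tiny sample of obj.

    Returns the dimension names and dtype of the result, which is the
    metadata needed to describe the lazy output before computing it.
    """
    # A length-1 slice along every dimension keeps the trial call cheap.
    sample = obj.isel({dim: slice(0, 1) for dim in obj.dims})
    result = func(sample)
    return result.dims, result.dtype


def threshold_and_transpose(arr):
    # Example user function: changes both the dtype and the dimension order.
    return (arr > 5).transpose("x", "time")


data = xr.DataArray(np.arange(12.0).reshape(3, 4), dims=("time", "x"))
dims, dtype = infer_result_metadata(threshold_and_transpose, data)
```

Here `dims` comes back as `("x", "time")` and `dtype` as boolean, without evaluating the function on the full array. The sketch only covers functions whose output dimensions and dtype do not depend on the input values or sizes.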

So I would like to add some convenience methods for automatic parallelization with dask of a function defined on xray objects loaded into memory. In addition to a map_blocks method/function, it would be useful to add some sort of parallel_apply method to groupby objects that works very similarly, by lazily applying a function that takes and returns xray objects loaded into memory.
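
For a sense of what such a map_blocks function could look like in use, the sketch below relies on `xr.map_blocks`, which exists in recent xarray releases. It assumes a recent xarray with dask installed, and it is not necessarily the design this issue had in mind. The function `scale_block` is a hypothetical example that takes and returns an in-memory object for each chunk.

```python
import numpy as np
import xarray as xr


def scale_block(block):
    # Called once per dask chunk with an in-memory DataArray;
    # must return an xarray object.
    return block * 2.0


# Build a small array and split it into two dask chunks along "x".
arr = xr.DataArray(
    np.arange(8.0), dims=("x",), coords={"x": np.arange(8)}
).chunk({"x": 4})

# Nothing is computed here; the call only builds a lazy result.
lazy = xr.map_blocks(scale_block, arr)

# Trigger the computation and load the result into memory.
result = lazy.compute()
```

The result matches applying `scale_block` to the whole array at once, but each chunk can be processed independently by the dask scheduler.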

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/585/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app:
state_reason: completed
repo: 13221727
type: issue
