issue_comments: 478415169

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/2852#issuecomment-478415169 https://api.github.com/repos/pydata/xarray/issues/2852 478415169 MDEyOklzc3VlQ29tbWVudDQ3ODQxNTE2OQ== 1217238 2019-04-01T02:31:58Z 2019-04-01T02:31:58Z MEMBER

The current design of GroupBy.apply() in xarray is entirely ignorant of dask: it simply uses a for loop over the grouped variable to build up a computation with high-level array operations.
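As a rough illustration of that loop-based pattern, here is a minimal sketch in plain NumPy. It is not xarray's actual implementation; the helper name `loop_groupby_apply` and the sample data are hypothetical. It only shows the shape of the approach: find the unique keys, loop over them, and apply the function to each group.

```python
import numpy as np


def loop_groupby_apply(values, keys, func):
    """Hypothetical helper: apply `func` to each group with a plain Python loop.

    This is a simplified illustration of the pattern, not xarray's code.
    """
    # The unique keys must be known before the loop can start, which
    # means the grouping variable has to be fully evaluated first.
    unique_keys = np.unique(keys)
    results = []
    for key in unique_keys:
        mask = keys == key  # one full pass over the keys per group
        results.append(func(values[mask]))
    return unique_keys, np.stack(results)


values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
keys = np.array([0, 1, 0, 1, 2, 2])
group_keys, group_means = loop_groupby_apply(values, keys, np.mean)
```

The loop in this sketch runs once per unique key, so the number of operations built up grows with the number of groups.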

This makes operations that group over large keys stored in dask inefficient. Doing this efficiently is possible (dask.dataframe does so, and might be worth trying in your case), but it's a more challenging distributed computing problem, and xarray's current data model would not know how large a dimension to create for the returned arrays (doing this properly would require supporting arrays with unknown dimension sizes).
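To make the unknown-size point concrete, here is a small NumPy sketch (the helper `count_groups` and the sample keys are made up for illustration). Two key arrays with the same shape produce a different number of groups, so the length of the grouped dimension cannot be derived from array shapes alone; the key values have to be computed first.

```python
import numpy as np


def count_groups(keys):
    """Hypothetical helper: the number of groups depends on the key values."""
    return np.unique(keys).size


# Same shape, different contents, so a different output dimension size.
keys_a = np.array([0, 0, 1, 1, 2, 2])
keys_b = np.array([7, 7, 7, 7, 7, 7])

n_groups_a = count_groups(keys_a)
n_groups_b = count_groups(keys_b)
```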

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  425320466