
issue_comments: 277543644


html_url: https://github.com/pydata/xarray/issues/1058#issuecomment-277543644
issue_url: https://api.github.com/repos/pydata/xarray/issues/1058
id: 277543644
node_id: MDEyOklzc3VlQ29tbWVudDI3NzU0MzY0NA==
user: 6213168
created_at: 2017-02-05T19:44:33Z
updated_at: 2017-02-05T19:44:33Z
author_association: MEMBER
issue: 184722754

Actually, I am very much still facing the problem. The biggest issue now arises when I need to invoke xarray.broadcast. In my use case, I'm broadcasting together

  • a scalar array with numpy backend, shape=(), chunks=None
  • a 1D array with dask backend, shape=(2**19,), chunks=(2**15,)

What broadcast does is transform the scalar array into a numpy array of 2**19 elements. This is actually a view on the original 0D array, so it has negligible RAM requirements. But after pickling and unpickling, it becomes a real 2**19-element array. Add up a few hundred of them, and I am facing GBs of wasted RAM.
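A minimal sketch of the failure mode (a hypothetical reproduction, not code from the original report; the printed sizes assume float64 data and the pre-fix behaviour):

```python
import pickle

import dask.array as da
import xarray as xr

# The two operands described in the list above.
scalar = xr.DataArray(0.0)                       # numpy backend, shape=()
vector = xr.DataArray(da.zeros(2**19, chunks=2**15),
                      dims=['x'])                # dask backend, chunked

b_scalar, b_vector = xr.broadcast(scalar, vector)

# The broadcast scalar is a stride-0 view over a single float64,
# so it costs ~8 bytes despite reporting 2**19 elements.
print(b_scalar.shape)  # (524288,)

# After a pickle round trip, the view is materialised into a real
# array: 2**19 float64 values, ~4 MiB per broadcast scalar.
restored = pickle.loads(pickle.dumps(b_scalar))
print(restored.nbytes)  # 4194304
```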

A solution would be to change broadcast() to convert numpy-backed inputs to dask before broadcasting, and then broadcast directly to the proper chunk size.
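For illustration, the same idea applied from the caller's side (a sketch reusing scalar and vector from above; whether the explicit rechunk is needed depends on how dask chunks the broadcast dimension in a given version):

```python
# Chunk the numpy-backed operand before broadcasting, so the broadcast
# result is a lazy dask graph rather than a materialised numpy view.
b_scalar, b_vector = xr.broadcast(scalar.chunk(), vector)

# Depending on the dask/xarray versions, the broadcast scalar may come
# back as one big chunk; rechunk it to match the other operand.
b_scalar = b_scalar.chunk({'x': 2**15})

# Pickling now serialises the small task graph, not 2**19 floats.
restored = pickle.loads(pickle.dumps(b_scalar))
```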

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}