issues: 172290413

id: 172290413
node_id: MDU6SXNzdWUxNzIyOTA0MTM=
number: 978
title: broadcast() broken on dask backend
user: 6213168
state: closed
locked: 0
comments: 4
created_at: 2016-08-20T20:56:33Z
updated_at: 2016-12-09T20:28:42Z
closed_at: 2016-12-09T20:28:42Z
author_association: MEMBER
assignee, milestone, active_lock_reason, draft, pull_request: (empty)

Remaining columns, shown below in order: body, reactions, performed_via_github_app (empty), state_reason, repo, type

``` python
>>> import xarray
>>> a = xarray.DataArray([1,2]).chunk(1)
>>> a
<xarray.DataArray (dim_0: 2)>
dask.array<xarray-..., shape=(2,), dtype=int64, chunksize=(1,)>
Coordinates:
  * dim_0    (dim_0) int64 0 1
>>> xarray.broadcast(a)
(<xarray.DataArray (dim_0: 2)>
array([1, 2])
Coordinates:
  * dim_0    (dim_0) int64 0 1,)
```

The problem is actually somewhere in the constructor of `DataArray`. In alignment.py:362, we have `return DataArray(data, ...)`, where `data` is a `Variable` with a dask backend, yet the returned `DataArray` object has a numpy backend. As a workaround, changing that line to `return DataArray(data.data, ...)` (thus passing a dask array) fixes the problem.
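A quick way to check whether a given installation is affected is to inspect the type of the array wrapped by the broadcast result. This is only a sketch, assuming `xarray` and `dask` are both installed; the variable names are illustrative, and it does not reproduce the internals of alignment.py.

``` python
import dask.array as da
import xarray

# Same setup as in the report: a two-element array split into two chunks.
a = xarray.DataArray([1, 2]).chunk(1)

# broadcast() returns a tuple with one result per argument.
(b,) = xarray.broadcast(a)

# On an affected version the wrapped array is a numpy.ndarray;
# when the dask backend is preserved it is a dask array.
print(type(b.data))
print("dask backend preserved:", isinstance(b.data, da.Array))
```

If the second line prints `False`, the result was silently converted to numpy as described in this report.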

After that, however, there's a new issue: whenever `broadcast` adds a dimension to an array, it creates it in a single chunk, as opposed to copying the chunking of the other arrays. This can easily cause a host to run out of memory, and it makes the arrays harder to work with afterwards because the chunks won't match.
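To get a feel for why a single chunk along the new dimension matters, the following back-of-the-envelope sketch compares the size of the largest chunk under both layouts. The shapes, chunk sizes, and the helper `largest_chunk_bytes` are hypothetical and chosen only for illustration; the chunk tuples follow the layout of dask's `.chunks` attribute, and the figures describe what one chunk would occupy if it were materialized in memory.

``` python
from math import prod


def largest_chunk_bytes(chunks, itemsize):
    """Bytes needed to hold the largest chunk.

    `chunks` holds one tuple of chunk lengths per dimension,
    in the same layout as dask's `.chunks` attribute.
    """
    return prod(max(sizes) for sizes in chunks) * itemsize


ITEMSIZE = 8  # int64

# Hypothetical case: an array over x (length 1_000, chunks of 100) is
# broadcast against a dimension y of length 1_000_000 that the other
# arrays hold in chunks of 10_000.
single_chunk_y = ((100,) * 10, (1_000_000,))   # new dimension in one chunk
matched_chunk_y = ((100,) * 10, (10_000,) * 100)  # chunking copied from y

print(largest_chunk_bytes(single_chunk_y, ITEMSIZE))   # 800000000
print(largest_chunk_bytes(matched_chunk_y, ITEMSIZE))  # 8000000
```

In this made-up case the single-chunk layout needs a chunk one hundred times larger than the matched layout, which is the kind of gap that can exhaust memory on a worker.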

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/978/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
