issues: 189415576
This data as json
| id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 189415576 | MDU6SXNzdWUxODk0MTU1NzY= | 1121 | Performance degradation: `DataArray` with `dtype=object` of `DataArray` gets very slow indexing | 1310437 | closed | 0 | 3 | 2016-11-15T15:04:29Z | 2016-11-15T17:36:26Z | 2016-11-15T17:36:26Z | CONTRIBUTOR | I did not follow the code deeply, but there clearly seems to be a huge overhead when indexing such arrays. In particular, in the following code ```python import xarray as xr import numpy as np a = xr.DataArray([None for k in range(100)],dims='c') for k in range(a.c.size): a[k] = xr.DataArray(np.random.randn(1000,5),dims=['a','b']) %prun a[0] ``` the indexing operation takes about 1 second on my machine, or 2 seconds when running under the profiler. The profiler output shows lots of functions with a recursive call count exceeding 100'000 (most likely iterating through each row of the contained sub-arrays). However, there is really no reason to iterate through the nested elements. |
{
"url": "https://api.github.com/repos/pydata/xarray/issues/1121/reactions",
"total_count": 0,
"+1": 0,
"-1": 0,
"laugh": 0,
"hooray": 0,
"confused": 0,
"heart": 0,
"rocket": 0,
"eyes": 0
} |
completed | 13221727 | issue |