issues: 189415576
field | value
---|---
id | 189415576
node_id | MDU6SXNzdWUxODk0MTU1NzY=
number | 1121
title | Performance degradation: `DataArray` with `dtype=object` of `DataArray` gets very slow indexing
user | 1310437
state | closed
locked | 0
assignee |
milestone |
comments | 3
created_at | 2016-11-15T15:04:29Z
updated_at | 2016-11-15T17:36:26Z
closed_at | 2016-11-15T17:36:26Z
author_association | CONTRIBUTOR
active_lock_reason |
draft |
pull_request |
body | (see below)
reactions | { "url": "https://api.github.com/repos/pydata/xarray/issues/1121/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }
performed_via_github_app |
state_reason | completed
repo | 13221727
type | issue

body:

I did not follow the code deeply, but there clearly seems to be a huge overhead when indexing such arrays. In particular, in the following code

```python
import xarray as xr
import numpy as np

# Object-dtype DataArray whose elements are themselves DataArrays
a = xr.DataArray([None for k in range(100)], dims='c')
for k in range(a.c.size):
    a[k] = xr.DataArray(np.random.randn(1000, 5), dims=['a', 'b'])

%prun a[0]  # IPython profiling magic
```

the indexing operation takes about 1 second on my machine, or 2 seconds when running under the profiler. The profiler output shows many functions with recursive call counts exceeding 100,000 (most likely from iterating through each row of the contained sub-arrays). However, there is really no reason to iterate through the nested elements.
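To make the "no reason to iterate" point concrete, here is a minimal timing sketch, not from the issue itself: it compares a single-element lookup on a plain NumPy object array against the same lookup on an xarray `DataArray` wrapping that array. The names `np_obj` and `xr_obj` are illustrative, and the construction path is assumed (it fills a NumPy object array first and then wraps it, rather than assigning into the `DataArray` as the report does). A NumPy object-array lookup is a constant-time reference fetch, so any per-sub-array iteration in the profile is overhead in the indexing machinery, not inherent to the data layout.

```python
import timeit

import numpy as np
import xarray as xr

# Build the same nested structure as in the report: 100 elements,
# each a (1000, 5) DataArray, stored in an object-dtype container.
n = 100
np_obj = np.empty(n, dtype=object)
for k in range(n):
    np_obj[k] = xr.DataArray(np.random.randn(1000, 5), dims=['a', 'b'])

# Assumed construction: wrapping the ready-made object array directly.
xr_obj = xr.DataArray(np_obj, dims='c')

# A NumPy object array returns the stored reference in O(1); no iteration
# over the nested sub-arrays is needed for either lookup in principle.
t_np = timeit.timeit(lambda: np_obj[0], number=10) / 10
t_xr = timeit.timeit(lambda: xr_obj[0], number=10) / 10

print(f"numpy object array lookup: {t_np * 1e3:.3f} ms")
print(f"xarray DataArray lookup:   {t_xr * 1e3:.3f} ms")
```

On an xarray version affected by this issue, the second timing should be orders of magnitude slower than the first; since the issue is closed as completed, current releases should show the two lookups at comparable speed.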