html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/3054#issuecomment-520210672,https://api.github.com/repos/pydata/xarray/issues/3054,520210672,MDEyOklzc3VlQ29tbWVudDUyMDIxMDY3Mg==,2552981,2019-08-11T08:36:19Z,2019-08-11T08:36:41Z,CONTRIBUTOR,"@yohai: In short, no. It does not make sense to add a built-in function for iteration if it cannot improve on the low-level functionality.
I'd recommend closing this PR!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,462049420
https://github.com/pydata/xarray/pull/3054#issuecomment-520048334,https://api.github.com/repos/pydata/xarray/issues/3054,520048334,MDEyOklzc3VlQ29tbWVudDUyMDA0ODMzNA==,6213168,2019-08-09T20:10:29Z,2019-08-09T20:29:23Z,MEMBER,"Mh. Actually, it looks like ``ndarray.flat`` is the fastest way to iterate over a numpy array. It is still considerably slower than a CPython iterator, though:
```python
import numpy
N = 1000000
a = numpy.arange(N)
def exhaust(it):
    for _ in it:
        pass
%timeit exhaust(a)
24.8 ms ± 723 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit exhaust(a.flat)
20.4 ms ± 701 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit exhaust(a.tolist())
27.2 ms ± 1.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit exhaust(range(N))
10.5 ms ± 234 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,462049420
https://github.com/pydata/xarray/pull/3054#issuecomment-520016328,https://api.github.com/repos/pydata/xarray/issues/3054,520016328,MDEyOklzc3VlQ29tbWVudDUyMDAxNjMyOA==,6213168,2019-08-09T18:19:05Z,2019-08-09T18:19:05Z,MEMBER,"@yohai Iterating point by point in pure python over numpy data is horribly slow. ``numpy.ndarray.flat`` is there mostly to be used within cython/numba code. In a DataArray it's much worse than in a plain ndarray, because every time you invoke the slice operator to fetch a single element it's being applied to all coordinates too.
If you just need to iterate over the values of a DataArray, then ``DataArray.values.ravel().tolist()`` is the fastest option that I know of (``ndarray.tolist()`` is much faster than ``list(ndarray)``!).
If you need the coords as well, then I suspect you may be doing it wrong - could you show a simple example of your use case?
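
For illustration, here is a minimal sketch of the two conversions mentioned above. It uses a plain numpy array as a stand-in for ``DataArray.values``; the shape and contents are made up, and it only shows what each call returns, not how fast it is.

```python
import numpy as np

# Stand-in for DataArray.values; the shape and contents are illustrative.
values = np.arange(6).reshape(2, 3)

# ravel() flattens in C order; tolist() converts everything to native
# Python ints in a single call.
as_python = values.ravel().tolist()

# list() iterates the array and keeps each element as a numpy scalar.
as_numpy_scalars = list(values.ravel())

for v in as_python:
    pass  # v is a plain Python int here
```

The loop body then works on plain Python objects, so the remaining cost is only the loop itself.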
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,462049420
https://github.com/pydata/xarray/pull/3054#issuecomment-519932051,https://api.github.com/repos/pydata/xarray/issues/3054,519932051,MDEyOklzc3VlQ29tbWVudDUxOTkzMjA1MQ==,6164157,2019-08-09T14:04:57Z,2019-08-09T14:04:57Z,CONTRIBUTOR,"@crusaderky @corora Thanks for your comments, glad to see that there's a more efficient way to do it. The question is whether you think it's useful enough to justify adding it as a built-in function. I end up using my solution quite often.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,462049420
https://github.com/pydata/xarray/pull/3054#issuecomment-519875542,https://api.github.com/repos/pydata/xarray/issues/3054,519875542,MDEyOklzc3VlQ29tbWVudDUxOTg3NTU0Mg==,6213168,2019-08-09T10:53:59Z,2019-08-09T10:58:00Z,MEMBER,"Indeed this is extremely inefficient.
I'm afraid it's a -1 from me.
You can get the same with a much faster one-liner: ``a.stack(__flat=a.dims).reset_index('__flat')`` (although admittedly it's more RAM-intensive).
On related notes,
- stack() could use an optional set_index=True parameter, so that you can avoid going through a MultiIndex if you don't need one
- stack() should accept non-string hashables; this would allow avoiding potential collisions (e.g. ``flat = object()``)
- there is an issue with the stack -> reset_index round-trip where it converts unicode variables to object #907
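
A rough sketch of the one-liner above on a small array, assuming a reasonably recent xarray. The dimension names, coordinate values and the ``points`` variable are made up for illustration, and the exact coordinates left behind by ``reset_index`` may differ between xarray versions.

```python
import numpy as np
import xarray as xr

# Illustrative 2x3 array; dimension names and coordinate values are made up.
a = xr.DataArray(
    np.arange(6).reshape(2, 3),
    dims=('x', 'y'),
    coords={'x': [10, 20], 'y': [1, 2, 3]},
)

# Collapse all dimensions into a single '__flat' dimension, then drop the
# MultiIndex so that x and y stay as ordinary coordinates along '__flat'.
flat = a.stack(__flat=a.dims).reset_index('__flat')

# Pair each value with its original coordinates without per-element slicing.
points = list(
    zip(
        flat['x'].values.tolist(),
        flat['y'].values.tolist(),
        flat.values.tolist(),
    )
)
```

Each entry of ``points`` is an ``(x, y, value)`` tuple of plain Python objects, at the cost of materialising the flattened coordinates in memory.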
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,462049420
https://github.com/pydata/xarray/pull/3054#issuecomment-508407631,https://api.github.com/repos/pydata/xarray/issues/3054,508407631,MDEyOklzc3VlQ29tbWVudDUwODQwNzYzMQ==,2552981,2019-07-04T09:15:14Z,2019-07-04T09:15:14Z,CONTRIBUTOR,"@yohai It's a lot more efficient to simply iterate over the underlying array, i.e. `da.values.flat`, if you can afford to hold everything in memory.
If you are instead using streaming computation based on dask, then you would have to do something similar on a per-chunk basis.
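
A minimal sketch of both cases, using a plain numpy array as a stand-in for `da.values`. The helper `iter_per_chunk` is hypothetical and only mimics a per-chunk loop with `np.array_split`; it is not dask API.

```python
import numpy as np

# Stand-in for da.values; the contents are illustrative.
values = np.arange(12).reshape(3, 4)

# In-memory case: .flat walks every element in C order without copying.
total = 0
for v in values.flat:
    total += int(v)


def iter_per_chunk(array, n_chunks):
    # Hypothetical helper: split along axis 0 and iterate one block at a
    # time, mimicking what a per-chunk loop over a chunked array would do.
    for block in np.array_split(array, n_chunks, axis=0):
        yield from block.flat


chunked_total = sum(int(v) for v in iter_per_chunk(values, 2))
```

Both loops visit the same elements in the same order; the chunked version only needs one block in memory at a time if the blocks are loaded lazily.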
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,462049420