html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1224#issuecomment-274380660,https://api.github.com/repos/pydata/xarray/issues/1224,274380660,MDEyOklzc3VlQ29tbWVudDI3NDM4MDY2MA==,1217238,2017-01-23T02:04:23Z,2017-01-23T02:04:23Z,MEMBER,"Was concat slow at graph construction or compute time?
On Sun, Jan 22, 2017 at 6:02 PM crusaderky wrote:
> (arrays * weights).sum('stacked') was my first attempt. It performed
> considerably worse than sum(a * w for a, w in zip(arrays, weights)) -
> mostly because xarray.concat() is not terribly performant (I did not look
> deeper into it).
>
> I did not try dask.array.sum() - worth some playing with.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,202423683
https://github.com/pydata/xarray/issues/1224#issuecomment-274375810,https://api.github.com/repos/pydata/xarray/issues/1224,274375810,MDEyOklzc3VlQ29tbWVudDI3NDM3NTgxMA==,1217238,2017-01-23T01:09:49Z,2017-01-23T01:09:49Z,MEMBER,"Interesting -- thanks for sharing! I am interested in performance improvements but also a little reluctant to add specialized optimizations directly to xarray.
You write that this is equivalent to `sum(a * w for a, w in zip(arrays, weights))`. How does this compare to stacking and doing the sum in xarray, e.g., `(arrays * weights).sum('stacked')`, where `arrays` and `weights` are now DataArray objects with a `'stacked'` dimension? Or maybe `arrays.dot(weights)`?
Using vectorized operations feels a bit more idiomatic (though also maybe more verbose). It also may be more performant. Note that the builtin `sum` is not optimized well by dask because it's basically equivalent to a loop:
```
def sum(xs):
    result = 0
    for x in xs:
        result += x
    return result
```
In contrast, `dask.array.sum` builds up a tree so it can do the sum in parallel.
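For concreteness, here is a minimal sketch of the formulations being compared, written with plain NumPy as a stand-in so it runs without xarray or dask. The array shapes, the weight values, and the variable names are assumptions for illustration only, and it says nothing about relative performance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: three same-shaped arrays and one scalar weight per array.
arrays = [rng.random((4, 5)) for _ in range(3)]
weights = [0.2, 0.3, 0.5]

# Loop form: the builtin sum adds the terms one at a time, starting from 0.
looped = sum(a * w for a, w in zip(arrays, weights))

# Vectorized form: stack along a new leading axis, broadcast the weights,
# then reduce over that axis in a single call.
stacked = np.stack(arrays)                       # shape (3, 4, 5)
w = np.asarray(weights)[:, None, None]           # shape (3, 1, 1)
reduced = (stacked * w).sum(axis=0)              # shape (4, 5)

# Contraction form: the NumPy analogue of a dot product over the stacked axis.
dotted = np.tensordot(np.asarray(weights), stacked, axes=1)
```

All three give the same values up to floating-point rounding. The difference is only in how the work is expressed: the loop form chains one addition after another, while the stacked forms hand the whole reduction to a single call.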
There has also been discussion in https://github.com/pydata/xarray/issues/422 about adding a dedicated method for a weighted mean.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,202423683