issues: 305702311
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
305702311 | MDU6SXNzdWUzMDU3MDIzMTE= | 1993 | DataArray.rolling().mean() is way slower than it should be | 1217238 | closed | 0 | 5 | 2018-03-15T20:10:22Z | 2018-03-18T08:56:27Z | 2018-03-18T08:56:27Z | MEMBER | Code Sample, a copy-pastable example if possible

From @RayPalmerTech in https://github.com/kwgoodman/bottleneck/issues/186:

```python
import numpy as np
import pandas as pd
import time
import bottleneck as bn
import xarray
import matplotlib.pyplot as plt

N = 30000*200    # Number of datapoints
Fs = 30000       # sample rate
T = 1/Fs         # sample period
duration = N/Fs  # duration in s
t = np.arange(0, duration, T)  # time vector
DATA = np.random.randn(N,) + 5*np.sin(2*np.pi*0.01*t)  # Example noisy sine data and window size
w = 3*30000

def using_bottleneck_mean(data, width):
    return bn.move_mean(a=data, window=width, min_count=1)

def using_pandas_rolling_mean(data, width):
    return np.asarray(pd.DataFrame(data).rolling(window=width, center=True, min_periods=1).mean()).ravel()

def using_xarray_mean(data, width):
    return xarray.DataArray(data, dims='x').rolling(x=width, min_periods=1, center=True).mean()

start = time.time()
A = using_bottleneck_mean(DATA, w)
print('Bottleneck: ', time.time()-start, 's')

start = time.time()
B = using_pandas_rolling_mean(DATA, w)
print('Pandas: ', time.time()-start, 's')

start = time.time()
C = using_xarray_mean(DATA, w)
print('Xarray: ', time.time()-start, 's')
```

This results in:
Somehow xarray is way slower than pandas and bottleneck, even though it's using bottleneck under the hood!

Problem description

Profiling shows that the majority of time is spent in

Now we obtain:
The solution is to set up the windows lazily (in

Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1993/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
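The lazy window setup proposed in the issue body can be illustrated with a small, self-contained sketch. This is not xarray's implementation: `LazyRolling`, its attributes, and the running-sum fast path are hypothetical names chosen for illustration, and plain Python lists stand in for the bottleneck-backed arrays. The idea shown is only that the constructor stores parameters, the per-window slices are built inside `__iter__`, and `mean()` never needs them. The window is trailing, which is assumed here to mirror the behaviour of `bn.move_mean` with `min_count=1`.

```python
from itertools import accumulate


class LazyRolling:
    """Hypothetical trailing rolling window over a sequence of floats."""

    def __init__(self, values, window, min_periods=1):
        # Store parameters only; no per-window objects are created here.
        self.values = list(values)
        self.window = window
        self.min_periods = min_periods
        self.windows_built = 0  # how many window slices were materialized

    def __iter__(self):
        # Windows are set up lazily, one at a time, only when iterated.
        for stop in range(1, len(self.values) + 1):
            start = max(0, stop - self.window)
            self.windows_built += 1
            yield self.values[start:stop]

    def mean(self):
        # Fast path based on running sums; it never calls __iter__.
        csum = [0.0] + list(accumulate(self.values))
        out = []
        for stop in range(1, len(self.values) + 1):
            start = max(0, stop - self.window)
            count = stop - start
            if count < self.min_periods:
                out.append(float("nan"))
            else:
                out.append((csum[stop] - csum[start]) / count)
        return out


r = LazyRolling([1.0, 2.0, 3.0, 4.0], window=2)
means = r.mean()
print(means)            # [1.0, 1.5, 2.5, 3.5]
print(r.windows_built)  # 0, because mean() built no windows
```

Iterating over `r` afterwards builds the four window slices on demand, so only callers that actually loop over the windows pay for setting them up.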