issues: 305702311

id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at	closed_at	author_association	active_lock_reason	draft	pull_request	body	reactions	performed_via_github_app	state_reason	repo	type
305702311	MDU6SXNzdWUzMDU3MDIzMTE=	1993	DataArray.rolling().mean() is way slower than it should be	1217238	closed	0			5	2018-03-15T20:10:22Z	2018-03-18T08:56:27Z	2018-03-18T08:56:27Z	MEMBER				Code Sample, a copy-pastable example if possible

From @RayPalmerTech in https://github.com/kwgoodman/bottleneck/issues/186:

```python
import numpy as np
import pandas as pd
import time
import bottleneck as bn
import xarray
import matplotlib.pyplot as plt

N = 30000200 # Number of datapoints
Fs = 30000 # sample rate
T=1/Fs # sample period
duration = N/Fs # duration in s
t = np.arange(0,duration,T) # time vector
DATA = np.random.randn(N,)+5*np.sin(2*np.pi*0.01*t) # Example noisy sine data and window size
w = 330000

def using_bottleneck_mean(data,width):
    return bn.move_mean(a=data,window=width,min_count = 1)

def using_pandas_rolling_mean(data,width):
    return np.asarray(pd.DataFrame(data).rolling(window=width,center=True,min_periods=1).mean()).ravel()

def using_xarray_mean(data,width):
    return xarray.DataArray(data,dims='x').rolling(x=width,min_periods=1, center=True).mean()

start=time.time()
A = using_bottleneck_mean(DATA,w)
print('Bottleneck: ', time.time()-start, 's')

start=time.time()
B = using_pandas_rolling_mean(DATA,w)
print('Pandas: ',time.time()-start,'s')

start=time.time()
C = using_xarray_mean(DATA,w)
print('Xarray: ',time.time()-start,'s')
```

This results in: `Bottleneck: 0.0867006778717041 s Pandas: 0.563546895980835 s Xarray: 25.133142709732056 s`

Somehow xarray is way slower than pandas and bottleneck, even though it's using bottleneck under the hood!

Problem description

Profiling shows that the majority of time is spent in `xarray.core.rolling.DataArrayRolling._setup_windows`.
Monkey-patching that method with a dummy rectifies the issue: `xarray.core.rolling.DataArrayRolling._setup_windows = lambda *args: None`

Now we obtain: `Bottleneck: 0.06775331497192383 s Pandas: 0.48262882232666016 s Xarray: 0.1723031997680664 s`

The solution is to set up the windows lazily (in `__iter__`) instead of in the constructor.

Output of `xr.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.96+
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.10.2
pandas: 0.22.0
numpy: 1.14.2
scipy: 0.19.1
netCDF4: None
h5netcdf: None
h5py: 2.7.1
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: None
distributed: None
matplotlib: 2.1.2
cartopy: None
seaborn: 0.7.1
setuptools: 36.2.7
pip: 9.0.1
conda: None
pytest: None
IPython: 5.5.0
sphinx: None	{ "url": "https://api.github.com/repos/pydata/xarray/issues/1993/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		completed	13221727	issue
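
The fix proposed in the issue body (build the windows in `__iter__` rather than in the constructor) can be illustrated with a toy class. This is only a sketch of the idea, not xarray's actual code: `LazyRolling`, its attributes, and the cumulative-sum `mean` are hypothetical names and choices made for illustration, and the window here is trailing with a minimum of one observation.

```python
import numpy as np


class LazyRolling:
    """Toy rolling object (hypothetical; NOT xarray's DataArrayRolling)."""

    def __init__(self, data, window):
        self.data = np.asarray(data, dtype=float)
        self.window = int(window)
        # The per-element window list is not built here, so constructing
        # the object stays cheap even for very long arrays.
        self._windows = None

    def _setup_windows(self):
        # Expensive step: one slice object per element of the array.
        n = self.data.shape[0]
        stops = np.arange(1, n + 1)
        starts = np.maximum(stops - self.window, 0)
        self._windows = [slice(a, b) for a, b in zip(starts, stops)]

    def __iter__(self):
        # Windows are created on first iteration only.
        if self._windows is None:
            self._setup_windows()
        for w in self._windows:
            yield self.data[w]

    def mean(self):
        # Vectorized path based on cumulative sums; it never needs
        # the per-element window list.
        n = self.data.shape[0]
        csum = np.concatenate(([0.0], np.cumsum(self.data)))
        stops = np.arange(1, n + 1)
        starts = np.maximum(stops - self.window, 0)
        return (csum[stops] - csum[starts]) / (stops - starts)
```

With this layout, calling `mean()` on a freshly constructed object leaves `_windows` unset, and only code that actually iterates over the windows pays for building them.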
