html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4498#issuecomment-706688398,https://api.github.com/repos/pydata/xarray/issues/4498,706688398,MDEyOklzc3VlQ29tbWVudDcwNjY4ODM5OA==,145117,2020-10-11T11:11:47Z,2020-10-11T11:19:56Z,CONTRIBUTOR,"Thanks for clarifying that this is a real issue and not just due to my code, and for the suggestion to solve this elsewhere. For now I use the fast pandas version with this code:

```python
df_h = ds.to_dataframe().resample(""1H"").mean()  # what we want (quickly), but in pandas form
vals = [xr.DataArray(data=df_h[c], dims=['time'], coords={'time':df_h.index}, attrs=ds[c].attrs) for c in df_h.columns]
ds_h = xr.Dataset(dict(zip(df_h.columns,vals)), attrs=ds.attrs)
```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,718436141
https://github.com/pydata/xarray/issues/4498#issuecomment-706688498,https://api.github.com/repos/pydata/xarray/issues/4498,706688498,MDEyOklzc3VlQ29tbWVudDcwNjY4ODQ5OA==,145117,2020-10-11T11:12:47Z,2020-10-11T11:12:47Z,CONTRIBUTOR,"The linked issues refer to `groupby`, not `resample`, so this could stay open or be closed as a duplicate; I leave it to you to decide.

Thank you for the assistance.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,718436141
https://github.com/pydata/xarray/issues/4498#issuecomment-706548763,https://api.github.com/repos/pydata/xarray/issues/4498,706548763,MDEyOklzc3VlQ29tbWVudDcwNjU0ODc2Mw==,145117,2020-10-10T13:23:24Z,2020-10-10T13:23:24Z,CONTRIBUTOR,"The lag on every 4th or 5th run is not in the dataset creation; it is in the `resample`:

````
#+BEGIN_SRC jupyter-python :kernel ds :session bugreport
for i in np.arange(25):
    start = time.time()
    ds_r = ds.resample({'time':""1H""})
    print('xr', str(time.time() - start))
#+END_SRC

#+RESULTS:
#+begin_example
xr 0.04479050636291504
xr 0.047682762145996094
xr 0.8904871940612793
xr 0.05605506896972656
xr 0.0452876091003418
xr 0.0467374324798584
xr 0.8709239959716797
xr 0.05595755577087402
xr 0.046492576599121094
xr 0.04648017883300781
xr 0.045223236083984375
xr 0.8187246322631836
xr 0.05060911178588867
xr 0.04763054847717285
xr 0.8156075477600098
xr 0.055490970611572266
xr 0.047312259674072266
xr 0.04651069641113281
xr 0.8001837730407715
xr 0.05546212196350098
xr 0.04549074172973633
xr 0.04680013656616211
xr 0.04383039474487305
xr 0.7662224769592285
xr 0.04914355278015137
#+end_example
````","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,718436141
https://github.com/pydata/xarray/issues/4498#issuecomment-706548513,https://api.github.com/repos/pydata/xarray/issues/4498,706548513,MDEyOklzc3VlQ29tbWVudDcwNjU0ODUxMw==,145117,2020-10-10T13:21:19Z,2020-10-10T13:21:19Z,CONTRIBUTOR,"""performance"" is a good tag. My actual use case is a dataset with 500,000 timestamps and 15 variables (10-minute weather station data for a decade). In this case, pandas takes 0.03 seconds and xarray takes 200 seconds: roughly 4 orders of magnitude. Should I change the title to reflect the larger difference in performance?
Here is that MWE:

```python
import numpy as np
import xarray as xr
import pandas as pd
import time

size = 500000
times = pd.date_range('2000-01-01', periods=size, freq=""10Min"")
ds = xr.Dataset({
    'foo': xr.DataArray(
        data = np.random.random(size),
        dims = ['time'],
        coords = {'time': times}
    )})
for v in 'abcdefghijelm':
    ds[v] = (('time'), np.random.random(size))

start = time.time()
ds_r = ds.resample({'time':""1H""}).mean()
print('xr', str(time.time() - start))

start = time.time()
ds_r = ds.to_dataframe().resample(""1H"").mean()
print('pd', str(time.time() - start))
```

Result:

```
xr 202.2967929840088
pd 0.03381085395812988
```

The strange thing is that if I drop the `.mean()` calls, most of the time I see what you see.

```
: xr 0.03333306312561035
: pd 0.020237445831298828
```

But every 4th or 5th time that I run this, I get this:

```
: xr 0.8518760204315186
: pd 0.02686452865600586
```

This is repeatable. I have run this code hundreds of times now, and every 4th or 5th run the xarray call takes 10x or more as long. Nothing else is running on my computer.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,718436141