html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4498#issuecomment-706688398,https://api.github.com/repos/pydata/xarray/issues/4498,706688398,MDEyOklzc3VlQ29tbWVudDcwNjY4ODM5OA==,145117,2020-10-11T11:11:47Z,2020-10-11T11:19:56Z,CONTRIBUTOR,"Thanks for clarifying that this is a real issue and not just due to my coding, and for the suggestion to solve this elsewhere. For now I use the fast pandas version with this code:
```python
df_h = ds.to_dataframe().resample(""1H"").mean() # what we want (quickly), but in Pandas form
vals = [xr.DataArray(data=df_h[c], dims=['time'], coords={'time':df_h.index}, attrs=ds[c].attrs) for c in df_h.columns]
ds_h = xr.Dataset(dict(zip(df_h.columns,vals)), attrs=ds.attrs)
```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,718436141
https://github.com/pydata/xarray/issues/4498#issuecomment-706688498,https://api.github.com/repos/pydata/xarray/issues/4498,706688498,MDEyOklzc3VlQ29tbWVudDcwNjY4ODQ5OA==,145117,2020-10-11T11:12:47Z,2020-10-11T11:12:47Z,CONTRIBUTOR,The linked issues refer to `groupby` not `resample` so this could stay open or be closed as a duplicate - I leave it to you to decide. Thank you for the assistance.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,718436141
https://github.com/pydata/xarray/issues/4498#issuecomment-706548763,https://api.github.com/repos/pydata/xarray/issues/4498,706548763,MDEyOklzc3VlQ29tbWVudDcwNjU0ODc2Mw==,145117,2020-10-10T13:23:24Z,2020-10-10T13:23:24Z,CONTRIBUTOR,"The lag on every 4th or 5th run is not in the dataset creation; it is in the `resample` call:
````
#+BEGIN_SRC jupyter-python :kernel ds :session bugreport
for i in np.arange(25):
    start = time.time()
    ds_r = ds.resample({'time':""1H""})
    print('xr', str(time.time() - start))
#+END_SRC
#+RESULTS:
#+begin_example
xr 0.04479050636291504
xr 0.047682762145996094
xr 0.8904871940612793
xr 0.05605506896972656
xr 0.0452876091003418
xr 0.0467374324798584
xr 0.8709239959716797
xr 0.05595755577087402
xr 0.046492576599121094
xr 0.04648017883300781
xr 0.045223236083984375
xr 0.8187246322631836
xr 0.05060911178588867
xr 0.04763054847717285
xr 0.8156075477600098
xr 0.055490970611572266
xr 0.047312259674072266
xr 0.04651069641113281
xr 0.8001837730407715
xr 0.05546212196350098
xr 0.04549074172973633
xr 0.04680013656616211
xr 0.04383039474487305
xr 0.7662224769592285
xr 0.04914355278015137
#+end_example
````","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,718436141
https://github.com/pydata/xarray/issues/4498#issuecomment-706548513,https://api.github.com/repos/pydata/xarray/issues/4498,706548513,MDEyOklzc3VlQ29tbWVudDcwNjU0ODUxMw==,145117,2020-10-10T13:21:19Z,2020-10-10T13:21:19Z,CONTRIBUTOR,"""performance"" is a good tag. My actual use case is a dataset with 500,000 timestamps and 15 variables (10-minute weather station data covering a decade).
In this case, pandas takes 0.03 seconds and xarray takes 200 seconds, a difference of nearly 4 orders of magnitude. Should I change the title to reflect the larger difference in performance? Here is that MWE:
```python
import numpy as np
import xarray as xr
import pandas as pd
import time
size = 500000
times = pd.date_range('2000-01-01', periods=size, freq=""10Min"")
ds = xr.Dataset({
    'foo': xr.DataArray(
        data = np.random.random(size),
        dims = ['time'],
        coords = {'time': times}
    )})
for v in 'abcdefghijelm':
    ds[v] = (('time'), np.random.random(size))
start = time.time()
ds_r = ds.resample({'time':""1H""}).mean()
print('xr', str(time.time() - start))
start = time.time()
ds_r = ds.to_dataframe().resample(""1H"").mean()
print('pd', str(time.time() - start))
```
Result:
```
xr 202.2967929840088
pd 0.03381085395812988
```
The strange thing here is that if I drop the `.mean()` calls, most of the time I see what you see:
```
: xr 0.03333306312561035
: pd 0.020237445831298828
```
But every 4th or 5th time that I run this, I get this:
```
: xr 0.8518760204315186
: pd 0.02686452865600586
```
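A minimal sketch for quantifying the occasional slow run, using only the standard library. `time_calls` and `slow_runs` are hypothetical helper names introduced here for illustration, not part of xarray or pandas, and the factor of 5 is an arbitrary threshold:

```python
import statistics
import time


def time_calls(fn, repeats=25):
    # Call fn() `repeats` times and return each wall-clock duration in seconds.
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def slow_runs(durations, factor=5.0):
    # Indices of runs that took more than `factor` times the median duration.
    median = statistics.median(durations)
    return [i for i, d in enumerate(durations) if d > factor * median]


# Hypothetical usage with the dataset `ds` from the MWE above:
# durations = time_calls(lambda: ds.resample({'time': '1H'}))
# print(slow_runs(durations))
```

Comparing against the median rather than the mean keeps the few slow runs from shifting the baseline they are measured against.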
This is repeatable. I have run this code hundreds of times now, and every 4th or 5th run takes over 10x as long. Nothing else is running on my computer.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,718436141