id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
484098286,MDU6SXNzdWU0ODQwOTgyODY=,3242,"An `asfreq` method without `resample`, and clarify or improve resample().asfreq() behavior for down-sampling",463809,open,0,,,2,2019-08-22T16:33:32Z,2022-04-18T16:01:07Z,,CONTRIBUTOR,,,,"#### MCVE Code Sample

```python
# Your code here
>>> import numpy as np
>>> import xarray as xr
>>> import pandas as pd
>>> data = np.random.random(300)
# Make a time grid that doesn't start exactly on the hour.
>>> time = pd.date_range('2019-01-01', periods=300, freq='T') + pd.Timedelta('3T')
>>> time
DatetimeIndex(['2019-01-01 00:03:00', '2019-01-01 00:04:00',
               '2019-01-01 00:05:00', '2019-01-01 00:06:00',
               '2019-01-01 00:07:00', '2019-01-01 00:08:00',
               '2019-01-01 00:09:00', '2019-01-01 00:10:00',
               '2019-01-01 00:11:00', '2019-01-01 00:12:00',
               ...
               '2019-01-01 04:53:00', '2019-01-01 04:54:00',
               '2019-01-01 04:55:00', '2019-01-01 04:56:00',
               '2019-01-01 04:57:00', '2019-01-01 04:58:00',
               '2019-01-01 04:59:00', '2019-01-01 05:00:00',
               '2019-01-01 05:01:00', '2019-01-01 05:02:00'],
              dtype='datetime64[ns]', length=300, freq='T')
>>> da = xr.DataArray(data, dims=['time'], coords={'time': time})
>>> resampled = da.resample(time='H').asfreq()
>>> resampled
array([0.478601, 0.488425, 0.496322, 0.479256, 0.523395, 0.201718])
Coordinates:
  * time     (time) datetime64[ns] 2019-01-01 ... 2019-01-01T05:00:00
# The value is actually the mean over the time window, e.g. the third value is:
>>> da.loc['2019-01-01T02:00:00':'2019-01-01T02:59:00'].mean()
array(0.496322)
```

#### Expected Output

Docs say this:

```
Return values of original object at the new up-sampling frequency; essentially a re-index with new times set to NaN.
```

I suppose this doc is not technically wrong: on careful reading, it does not define any behavior for down-sampling. But it's easy to (1) assume the same behavior (reindexing) for down-sampling as for up-sampling and/or (2) expect behavior similar to `df.asfreq()` in pandas.

#### Problem Description

I would argue for an `asfreq` method without resampling that matches the pandas behavior, which, AFAIK, is to reindex starting at the first timestamp, at the specified interval.

```
>>> df = pd.DataFrame(da, index=time)
>>> df.asfreq('H')
                            0
2019-01-01 00:03:00  0.065304
2019-01-01 01:03:00  0.325814
2019-01-01 02:03:00  0.841201
2019-01-01 03:03:00  0.610266
2019-01-01 04:03:00  0.613906
```

This can already be achieved easily with `reindex`, so it's not a blocker.

```
>>> da.reindex(time=pd.date_range(da.time[0].values, da.time[-1].values, freq='H'))
array([0.065304, 0.325814, 0.841201, 0.610266, 0.613906])
Coordinates:
  * time     (time) datetime64[ns] 2019-01-01T00:03:00 ... 2019-01-01T04:03:00
```

The reason I argue for `asfreq` functionality outside of resampling is that `asfreq(freq)` in pandas is purely a reindex, whereas e.g. `resample(freq).first()` would give you a different time index.

#### Output of ``xr.show_versions()``

I'm still on Python 2.7, where `show_versions` actually throws an exception because some HDF5 library doesn't have a magic property. I don't think this detail is relevant here, though.
```
>>> xr.__version__
u'0.11.3'
```
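
As a rough sketch of the behavior proposed under Problem Description, the reindex approach could be wrapped in a small helper. The name `asfreq_like_pandas` and its `dim` parameter are hypothetical (nothing by that name exists in xarray), and the intervals are passed as `pd.Timedelta` objects only to avoid depending on a particular spelling of the frequency aliases.

```python
import numpy as np
import pandas as pd
import xarray as xr


def asfreq_like_pandas(da, freq, dim='time'):
    # Hypothetical helper, not part of xarray: build a regular grid that
    # starts at the first timestamp and reindex onto it. No window is
    # aggregated; timestamps missing from the data would become NaN.
    index = da.indexes[dim]
    new_index = pd.date_range(index[0], index[-1], freq=freq)
    return da.reindex({dim: new_index})


# One-minute grid starting at 00:03, with values 0..299 so that each
# selected sample is easy to recognise.
time = pd.date_range('2019-01-01 00:03', periods=300,
                     freq=pd.Timedelta(minutes=1))
da = xr.DataArray(np.arange(300.0), dims=['time'], coords={'time': time})

hourly = asfreq_like_pandas(da, pd.Timedelta(hours=1))
```

With these inputs, `hourly` keeps the original samples at 00:03, 01:03, 02:03, 03:03 and 04:03 (values 0, 60, 120, 180 and 240) instead of hourly means.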
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3242/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue