## pydata/xarray #1046 — center=True for xarray.DataArray.rolling()

*Opened 2016-10-13 · state: open · 8 comments · author association: CONTRIBUTOR*

The logic behind setting `center=True` confuses me. Say the window size is 3. The default behavior (`center=False`) sets the window to go from i-2 to i, so I would have expected `center=True` to set the window from i-1 to i+1. But that's not what I see. For example, this is what `data` looks like:

```
>>> data = xr.DataArray(np.arange(27).reshape(3, 3, 3),
...                     coords=[('x', ['a', 'b', 'c']), ('y', [-2, 0, 2]), ('z', [0, 1, 2])])
>>> data
<xarray.DataArray (x: 3, y: 3, z: 3)>
array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],
       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],
       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])
Coordinates:
  * x        (x) |S1 'a' 'b' 'c'
  * y        (y) int64 -2 0 2
  * z        (z) int64 0 1 2
```

Now, if I set the y-window size to 3, `center=False`, and `min_periods=1`, I get

```
>>> r = data.rolling(y=3, center=False, min_periods=1)
>>> r.mean()
array([[[  0. ,   1. ,   2. ],
        [  1.5,   2.5,   3.5],
        [  3. ,   4. ,   5. ]],
       [[  9. ,  10. ,  11. ],
        [ 10.5,  11.5,  12.5],
        [ 12. ,  13. ,  14. ]],
       [[ 18. ,  19. ,  20. ],
        [ 19.5,  20.5,  21.5],
        [ 21. ,  22. ,  23. ]]])
Coordinates:
  * x        (x) |S1 'a' 'b' 'c'
  * y        (y) int64 -2 0 2
  * z        (z) int64 0 1 2
```

which essentially gives me a "trailing window" of size 3, meaning the window goes from i-2 to i. This is not explained in the docs but can be understood empirically. On the other hand, setting `center=True` gives

```
>>> r = data.rolling(y=3, center=True, min_periods=1)
>>> r.mean()
array([[[  1.5,   2.5,   3.5],
        [  3. ,   4. ,   5. ],
        [  nan,   nan,   nan]],
       [[ 10.5,  11.5,  12.5],
        [ 12. ,  13. ,  14. ],
        [  nan,   nan,   nan]],
       [[ 19.5,  20.5,  21.5],
        [ 21. ,  22. ,  23. ],
        [  nan,   nan,   nan]]])
Coordinates:
  * x        (x) |S1 'a' 'b' 'c'
  * y        (y) int64 -2 0 2
  * z        (z) int64 0 1 2
```

In other words, it just pushes every cell up the y-dim by 1, using `nan` to represent things coming off the edge of the universe. If you look at `_center_result()` in `xarray/core/rolling.py`, that's exactly what it does with `.shift()`. I would have expected `center=True` to change the window to go from i-1 to i+1, in which case, with `min_periods=1`, `r.mean()` would not contain any `nan` values. Could someone explain the logical flow to me?

Much obliged,
Chun

---

## pydata/xarray #1371 — Weighted quantile

*Opened 2017-04-12 · state: open · 8 comments · author association: CONTRIBUTOR*

For our work we frequently need to compute weighted quantiles. This is especially important when we need to weigh data from recent years more heavily in making predictions. I've put together a function (called `weighted_quantile`) largely based on the source code of `np.percentile`. It allows one to input weights along a single dimension, as a dict `w_dict`. Below are some manual tests.

When all weights are 1, it's identical to using `np.nanpercentile`:

```
>>> ar0
array([[3, 4, 8, 1],
       [5, 3, 7, 9],
       [4, 9, 6, 2]])
Coordinates:
  * x        (x) |S1 'a' 'b' 'c'
  * y        (y) int64 0 1 2 3
>>> ar0.quantile(q=[0.25, 0.5, 0.75], dim='y')
array([[ 2.5 ,  4.5 ,  3.5 ],
       [ 3.5 ,  6.  ,  5.  ],
       [ 5.  ,  7.5 ,  6.75]])
Coordinates:
  * x         (x) |S1 'a' 'b' 'c'
  * quantile  (quantile) float64 0.25 0.5 0.75
>>> weighted_quantile(da=ar0, q=[0.25, 0.5, 0.75], dim='y', w_dict={'y': [1, 1, 1, 1]})
array([[ 2.5 ,  4.5 ,  3.5 ],
       [ 3.5 ,  6.  ,  5.  ],
       [ 5.  ,  7.5 ,  6.75]])
Coordinates:
  * x         (x) |S1 'a' 'b' 'c'
  * quantile  (quantile) float64 0.25 0.5 0.75
```

Now different weights:

```
>>> weighted_quantile(da=ar0, q=[0.25, 0.5, 0.75], dim='y', w_dict={'y': [1, 2, 3, 4.0]})
array([[ 3.25    ,  5.666667,  4.333333],
       [ 4.      ,  7.      ,  5.333333],
       [ 6.      ,  8.      ,  6.75    ]])
Coordinates:
  * x         (x) |S1 'a' 'b' 'c'
  * quantile  (quantile) float64 0.25 0.5 0.75
```

It also handles `nan` values like `np.nanpercentile`:

```
>>> ar
array([[[ nan,   3.],
        [ nan,   5.]],
       [[  8.,   1.],
        [ nan,   0.]]])
Coordinates:
  * x        (x) |S1 'a' 'b'
  * y        (y) int64 0 1
  * z        (z) int64 8 9
>>> da_stacked = ar.stack(mi=['x', 'y'])
>>> out = weighted_quantile(da=ar, q=[0.25, 0.5, 0.75], dim=['x', 'y'], w_dict={'x': [1, 1]})
>>> out
array([[ 8.  ,  0.75],
       [ 8.  ,  2.  ],
       [ 8.  ,  3.5 ]])
Coordinates:
  * z         (z) int64 8 9
  * quantile  (quantile) float64 0.25 0.5 0.75
>>> da_stacked.quantile(q=[0.25, 0.5, 0.75], dim='mi')
array([[ 8.  ,  0.75],
       [ 8.  ,  2.  ],
       [ 8.  ,  3.5 ]])
Coordinates:
  * z         (z) int64 8 9
  * quantile  (quantile) float64 0.25 0.5 0.75
```

Lastly, different interpolation schemes are consistent:

```
>>> out = weighted_quantile(da=ar, q=[0.25, 0.5, 0.75], dim=['x', 'y'], w_dict={'x': [1, 1]}, interpolation='nearest')
>>> out
array([[ 8.,  1.],
       [ 8.,  3.],
       [ 8.,  3.]])
Coordinates:
  * z         (z) int64 8 9
  * quantile  (quantile) float64 0.25 0.5 0.75
>>> da_stacked.quantile(q=[0.25, 0.5, 0.75], dim='mi', interpolation='nearest')
array([[ 8.,  1.],
       [ 8.,  3.],
       [ 8.,  3.]])
Coordinates:
  * z         (z) int64 8 9
  * quantile  (quantile) float64 0.25 0.5 0.75
```

We wonder if it's OK to make this part of xarray. If so, the most logical place to implement it would seem to be in `Variable.quantile()`. Another option is to make it a utility function, to be called as `xr.weighted_quantile()`.
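The two window conventions discussed in #1046 can be illustrated with a plain-NumPy sketch. This is a hypothetical helper (`rolling_mean_1d` is not xarray's implementation): `center=False` uses the trailing window i-2..i, while `center=True` here uses the centered window i-1..i+1 that the reporter expected, so with `min_periods=1` no `nan` appears at the edges.

```python
import numpy as np

def rolling_mean_1d(a, window=3, center=False, min_periods=1):
    """Hypothetical 1-D rolling mean (illustration only, not xarray's code).

    center=False: window covers [i - window + 1, i]          (trailing)
    center=True:  window covers [i - window//2, i + window//2] (centered)
    """
    n = len(a)
    out = np.full(n, np.nan)
    for i in range(n):
        if center:
            lo, hi = i - window // 2, i + window // 2 + 1
        else:
            lo, hi = i - window + 1, i + 1
        vals = a[max(lo, 0):min(hi, n)]  # clip the window at the edges
        if len(vals) >= min_periods:
            out[i] = vals.mean()
    return out

a = np.array([-2.0, 0.0, 2.0])
print(rolling_mean_1d(a, center=False))  # trailing window: [-2., -1., 0.]
print(rolling_mean_1d(a, center=True))   # centered window: [-1., 0., 1.]
```

The trailing variant reproduces the `center=False` column means from the issue (e.g. -2, -1, 0 along y for the first z-slice of the first x), while the centered variant shows what an i-1..i+1 window would yield instead of the shifted-with-`nan` result.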
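The core idea behind a weighted quantile as in #1371 can be sketched in 1-D: sort the values, interpolate linearly on the weighted CDF. This is a minimal sketch with a hypothetical name (`weighted_quantile_1d`, not the issue's `weighted_quantile`), and the plotting-position convention chosen here is one of several reasonable ones; it is picked so that equal weights reduce to `np.percentile`'s default linear interpolation, but it need not match the issue's implementation for unequal weights.

```python
import numpy as np

def weighted_quantile_1d(values, quantiles, weights):
    """Hypothetical 1-D weighted quantile via the weighted CDF (sketch).

    With all weights equal this reduces to np.percentile's default
    'linear' interpolation scheme.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cw = np.cumsum(w)
    # Plotting positions: 0 at the first sorted value, 1 at the last,
    # matching np.percentile when all weights are equal.
    p = (cw - w) / (cw[-1] - w)
    return np.interp(quantiles, p, v)

x = [3, 4, 8, 1]  # first row of ar0 from the issue
print(weighted_quantile_1d(x, [0.25, 0.5, 0.75], np.ones(4)))
print(np.percentile(x, [25, 50, 75]))  # same values: 2.5, 3.5, 5.0
```

A full implementation along an arbitrary dimension, as the issue proposes for `Variable.quantile()`, would additionally broadcast the weights, handle `nan` values like `np.nanpercentile`, and support the other interpolation schemes.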