issues: 304021813

id: 304021813
node_id: MDU6SXNzdWUzMDQwMjE4MTM=
number: 1978
title: Efficient rolling 'trick'
user: 5635139
state: closed
locked: 0
assignee: (empty)
milestone: (empty)
comments: 4
created_at: 2018-03-10T00:29:33Z
updated_at: 2018-03-10T01:23:06Z
closed_at: 2018-03-10T01:23:06Z
author_association: MEMBER

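Based on http://www.rigtorp.se/2011/01/01/rolling-statistics-numpy.html, we wrote a function that 'tricks' numpy into presenting an array that looks rolling. It uses only a strided view of the original data, so it avoids the memory cost of copying every window (roughly n × window elements for a dimension of length n).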

Would people be interested in this going into xarray?

It seems to work really well on a few use cases, but I imagine it's enough trickery that we might not want to support it in xarray. And, to be clear, it's strictly worse wherever we already have dedicated rolling algorithms. But where we don't, you get a rolling apply without the Python loops.
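To make the "rolling apply without the Python loops" point concrete, here is a minimal sketch in plain NumPy. It is not the function from this issue: it assumes NumPy 1.20 or later and uses `sliding_window_view`, which builds the same kind of windowed view. The array values and the choice of a rolling median are illustrative assumptions only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

x = np.array([1.0, 5.0, 2.0, 8.0, 3.0, 7.0])  # illustrative data
window = 3

# Loop version: one Python-level call to np.median per window.
looped = np.array([np.median(x[i:i + window])
                   for i in range(len(x) - window + 1)])

# View version: a (4, 3) read-only view of x, reduced along the window axis
# in a single vectorized call.
windows = sliding_window_view(x, window)
vectorized = np.median(windows, axis=-1)

print(vectorized)  # [2. 5. 3. 7.]
```

Both versions give the same result; the second one replaces the Python loop with one reduction over the trailing axis, which is the pattern the issue is proposing to expose through xarray.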

```python

import numpy as np
from xarray import apply_ufunc


def rolling_window_numpy(a, window):
    """ Make an array appear to be rolling, but using only a view
    http://www.rigtorp.se/2011/01/01/rolling-statistics-numpy.html
    """
    # The last axis shrinks to the number of full windows, and a new
    # trailing axis of length `window` is appended.
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    # Reusing the last stride for the new axis makes the windows overlap
    # in memory instead of being copied.
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)


def rolling_window(da, span, dim=None, new_dim='dim_0'):
    """ Adds a rolling dimension to a DataArray using only a view """
    original_dims = da.dims
    # Move `dim` to the last axis, where rolling_window_numpy expects it.
    da = da.transpose(*tuple(d for d in da.dims if d != dim) + (dim,))

    result = apply_ufunc(
        rolling_window_numpy,
        da,
        output_core_dims=((new_dim,),),
        kwargs=dict(window=span))

    return result.transpose(*(original_dims + (new_dim,)))

# tests

import numpy as np
import pandas as pd
import pytest
import xarray as xr


@pytest.fixture
def da(dims):
    return xr.DataArray(
        np.random.rand(5, 10, 15), dims=(list('abc'))).transpose(*dims)


@pytest.fixture(params=[
    list('abc'),
    list('bac'),
    list('cab'),
])
def dims(request):
    return request.param

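# NOTE: `sample_data` and `iterate_imputation` are not defined in this snippet.
def test_iterate_imputation_fills_missing(sample_data):
    sample_data.iloc[2, 2] = np.nan  # `pd.np` was removed in pandas 2.0
    result = iterate_imputation(sample_data)
    assert result.shape == sample_data.shape
    assert result.notnull().values.all()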

def test_rolling_window(da, dims):

    result = rolling_window(da, 3, dim='c', new_dim='x')

    assert result.transpose(*list('abcx')).shape == (5, 10, 13, 3)

    # should be a view, so doesn't have any larger strides
    assert np.max(result.values.strides) == 10 * 15 * 8

def test_rolling_window_values():

    da = xr.DataArray(np.arange(12).reshape(2, 6), dims=('item', 'date'))

    rolling = rolling_window(da, 3, dim='date', new_dim='rolling_date')

    expected = sum([11, 10, 9])
    result = rolling.sum('rolling_date').isel(item=1, date=-1)
    assert result == expected

```
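The "only a view" claim can be checked directly in NumPy. The sketch below reuses the shape and stride arithmetic from `rolling_window_numpy` on a small array. The `int64` dtype, the array contents, and the `writeable=False` flag are assumptions added here for illustration; they are not part of the function as posted.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(12, dtype=np.int64).reshape(2, 6)  # illustrative data
window = 3

# Same arithmetic as rolling_window_numpy above.
shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
strides = a.strides + (a.strides[-1],)

# writeable=False is an added precaution: overlapping windows alias the same
# memory, so writing through the view would change several windows at once.
view = as_strided(a, shape=shape, strides=strides, writeable=False)

print(view.shape)                 # (2, 4, 3)
print(view.strides)               # (48, 8, 8)
print(np.shares_memory(a, view))  # True
print(view[1, -1])                # [ 9 10 11]
```

Note that `view.nbytes` reports 192 bytes while `a.nbytes` is 96: the view looks twice as large as the data behind it, but no additional buffer is allocated as long as nothing forces a copy.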

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1978/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
