issue_comments: 281185199

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/1279#issuecomment-281185199 https://api.github.com/repos/pydata/xarray/issues/1279 281185199 MDEyOklzc3VlQ29tbWVudDI4MTE4NTE5OQ== 1217238 2017-02-20T21:28:37Z 2017-02-20T21:28:37Z MEMBER

> Note that I was able to apply the rolling window by converting my variable to a pandas series with to_series(). I then could use pandas' own rolling window methods. I guess that when converting to a pandas series the dask array is read in memory?

Yes, this is correct -- we automatically compute dask arrays when converting to pandas, because pandas does not have any notion of lazy arrays.

Note that we currently have two versions of rolling window operations:

  1. Implemented with bottleneck. These are fast, but only work in memory. Something like ghost cells would be necessary to extend them to dask.
  2. Implemented with a nested loop written in Python. These are much slower, both because of the algorithm (O(dim_size * window_size) time instead of O(dim_size)) and because the inner loop runs in Python instead of C, but there's no fundamental reason why they shouldn't be able to work for dask arrays basically as is.
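
To make the complexity difference and the ghost-cell idea concrete, here is a minimal pure-Python sketch. It is not the code in xarray or bottleneck: the function names are hypothetical, a rolling mean stands in for any rolling reduction, and "ghost cells" is taken to mean borrowing the last `window - 1` values of the preceding chunk so that a per-chunk kernel sees a full window at the chunk boundary.

```python
import math


def rolling_mean_nested(values, window):
    """Nested-loop rolling mean: O(len(values) * window) time."""
    out = [math.nan] * len(values)
    for i in range(window - 1, len(values)):
        total = 0.0
        for j in range(i - window + 1, i + 1):  # re-sum the whole window
            total += values[j]
        out[i] = total / window
    return out


def rolling_mean_running(values, window):
    """Single-pass rolling mean: O(len(values)) time using a running sum."""
    out = [math.nan] * len(values)
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            out[i] = total / window
    return out


def rolling_mean_chunked(values, window, chunk_size):
    """Run the single-pass kernel chunk by chunk (hypothetical helper).

    Each chunk is extended on the left with up to window - 1 ghost cells
    taken from the preceding data; the results for those cells are dropped.
    """
    out = []
    for start in range(0, len(values), chunk_size):
        ghost_start = max(0, start - (window - 1))
        block = values[ghost_start:start + chunk_size]
        result = rolling_mean_running(block, window)
        out.extend(result[start - ghost_start:])  # trim the ghost cells
    return out


data = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
nested = rolling_mean_nested(data, 3)
running = rolling_mean_running(data, 3)
chunked = rolling_mean_chunked(data, 3, chunk_size=2)
# The first window - 1 entries are NaN in all three; the rest agree.
assert nested[2:] == running[2:] == chunked[2:]
```

The chunked variant only needs data from the neighbouring chunk, which is why an in-memory kernel could in principle be applied block by block to a chunked array once the overlap is supplied.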
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  208903781