issues: 58310637
This data as json
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
58310637 | MDU6SXNzdWU1ODMxMDYzNw== | 328 | Support out-of-core computation using dask | 1217238 | closed | 0 | 987654 | 7 | 2015-02-20T05:02:22Z | 2015-04-17T21:03:12Z | 2015-04-17T21:03:12Z | MEMBER | Dask is a library for out of core computation somewhat similar to biggus in conception, but with slightly grander aspirations. For examples of how Dask could be applied to weather data, see this blog post by @mrocklin: http://matthewrocklin.com/blog/work/2015/02/13/Towards-OOC-Slicing-and-Stacking/ It would be interesting to explore using dask internally in xray, so that we can implement lazy/out-of-core aggregations, concat and groupby to complement the existing lazy indexing. This functionality would be quite useful for xray, and even more so than merely supporting datasets-on-disk (#199). A related issue is #79: we can easily imagine using Dask with groupby/apply to power out-of-core and multi-threaded computation. Todos for xray:
- [x] refactor Todos for dask (to be clear, none of these are blockers for a proof of concept):
- [x] support for NaN skipping aggregations
- [x] ~~support for interleaved concatenation (necessary for transformations by group, which are quite common)~~ (turns out to be a one-liner with concatenate and take, see below)
- [x] ~~support for something like |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/328/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |