issues: 218315793

id: 218315793
node_id: MDU6SXNzdWUyMTgzMTU3OTM=
number: 1344
title: Dask Persist
user: 306380
state: closed
locked: 0
comments: 5
created_at: 2017-03-30T20:19:17Z
updated_at: 2017-04-04T16:14:17Z
closed_at: 2017-04-04T16:14:17Z
author_association: MEMBER

It would be convenient to load constituent dask.arrays into memory as dask.arrays rather than as numpy arrays. This would help with distributed computations where we want to load a large amount of data into distributed memory once and then iterate on the full xarray dataset repeatedly without reloading from disk every time.

We can probably solve this from either side:

  1. XArray could add a .persist method that replaces each of its dask.arrays with a persisted version of that array

```python
import dask

dset.x, dset.y, dset.z = dask.persist(dset.x, dset.y, dset.z)
```

  2. We could look into the Dask duck type solution again: https://github.com/dask/dask/pull/1068
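
To make the first option concrete, here is a rough sketch of what such a helper might look like if written outside xarray. The name `persist_dataset` is hypothetical and not an existing API; the sketch assumes the dataset's variables are already constructed and only relies on `dask.persist`, as in the snippet above. It passes all dask-backed variables to a single `dask.persist` call so that shared intermediate tasks are computed once.

```python
import dask
import xarray as xr


def persist_dataset(ds: xr.Dataset) -> xr.Dataset:
    """Hypothetical helper: return a copy of ds with dask-backed data persisted."""
    # A variable is dask-backed when it reports chunks; numpy-backed ones return None.
    lazy_names = [name for name in ds.data_vars if ds[name].chunks is not None]
    if not lazy_names:
        return ds

    # Persist everything in one call so common parts of the graphs are shared.
    persisted = dask.persist(*(ds[name].data for name in lazy_names))

    # Shallow copy, then swap in the persisted arrays; the input is left untouched.
    new = ds.copy()
    for name, arr in zip(lazy_names, persisted):
        new[name] = ds[name].copy(deep=False, data=arr)
    return new
```

The result still holds dask.arrays, so later operations stay lazy, but their chunks are already held in memory (or in distributed memory when a distributed scheduler is active) rather than being reloaded on each computation.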

cc @shoyer @jcrist @rabernat @pwolfram

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/1344/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
