issues: 1755610168
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1755610168 | I_kwDOAMm_X85opHw4 | 7918 | xarray lazy indexing/loading is not sufficiently documented | 14314623 | open | 0 |  |  | 1 | 2023-06-13T20:35:38Z | 2023-06-13T21:29:53Z |  | CONTRIBUTOR |  |  |  | What is your issue? The default behavior of opening datasets lazily instead of loading them into memory urgently needs more documentation, or more extensive linking of the existing docs. I have seen tons of examples where the 'laziness' of the loading is not apparent to users. The workflow commonly looks something like this:
1. Open some 'larger-than-memory' dataset, e.g. from a cloud bucket with […]

To start with, the docstring of […] Up until a chat I had with @TomNicholas today, I honestly did not understand why this feature even existed. His explanation (below) was, however, very good, and if something similar is not in the docs yet, it should probably be added.
I think overall this is a giant pitfall, particularly for xarray beginners, and thus deserves some thought. While I am sure the choices made so far might have some large functional upsides, I wonder three things: […]
Happy to work on this, since it is very relevant for many members of projects I work with. I first wanted to check if there are existing docs that I missed. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/7918/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
13221727 | issue |
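The open-lazily-then-compute workflow the issue describes can be sketched as follows. This is a minimal illustration, not code from the issue: it uses a small in-memory dataset chunked with dask to mimic the lazy state that `open_dataset`/`open_zarr` return for on-disk or cloud data, and all variable names are illustrative.

```python
import numpy as np
import xarray as xr

# Build a small in-memory dataset (a stand-in for a large cloud-hosted one).
ds = xr.Dataset({"t": ("x", np.arange(10.0))})

# .chunk() wraps the variables in dask arrays; like a freshly opened
# dataset, no values are read into memory at this point.
lazy = ds.chunk({"x": 5})

# Operations on the lazy dataset only build a task graph.
result = (lazy["t"] * 2).mean()

# Work happens only when .compute()/.load() (or printing values,
# plotting, etc.) forces evaluation.
print(result.compute().item())  # 9.0
```

This is exactly the pitfall the issue points at: nothing in the first two steps signals to a beginner that the data has not been loaded yet, so the cost of the final `.compute()` comes as a surprise.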