issues: 2136709010
This data as json
| id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type | 
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2136709010 | I_kwDOAMm_X85_W5eS | 8753 | Lazy Loading with `DataArray` vs. `Variable` | 2448579 | closed | 0 | 0 | 2024-02-15T14:42:24Z | 2024-04-04T16:46:54Z | 2024-04-04T16:46:54Z | MEMBER | Discussed in https://github.com/pydata/xarray/discussions/8751
<sup>Originally posted by **ilan-gold** February 15, 2024</sup>
My goal is to get a dataset from [custom io-zarr backend lazy-loaded](https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html#how-to-support-lazy-loading).  But when I declare a `DataArray` based on the `Variable` which uses `LazilyIndexedArray`, everything is read in.  Is this expected?  I specifically don't want to have to use dask if possible.  I have seen https://github.com/aurghs/xarray-backend-tutorial/blob/main/2.Backend_with_Lazy_Loading.ipynb but it's a little bit different.
While I have a custom backend array inheriting from `ZarrArrayWrapper`, this example using `ZarrArrayWrapper` directly still highlights the same unexpected behavior of everything being read in. 
```python
import zarr
import xarray as xr
from tempfile import mkdtemp
import numpy as np
from pathlib import Path
from collections import defaultdict
class AccessTrackingStore(zarr.DirectoryStore):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._access_count = {}
        self._accessed = defaultdict(set)
    def __getitem__(self, key):
        for tracked in self._access_count:
            if tracked in key:
                self._access_count[tracked] += 1
                self._accessed[tracked].add(key)
        return super().__getitem__(key)
    def get_access_count(self, key):
        return self._access_count[key]
    def set_key_trackers(self, keys_to_track):
        if isinstance(keys_to_track, str):
            keys_to_track = [keys_to_track]
        for k in keys_to_track:
            self._access_count[k] = 0
    def get_subkeys_accessed(self, key):
        return self._accessed[key]
orig_path = Path(mkdtemp())
z = zarr.group(orig_path / "foo.zarr")
z['array'] = np.random.randn(1000, 1000)
store = AccessTrackingStore(orig_path / "foo.zarr")
store.set_key_trackers(['array'])
z = zarr.group(store)
arr = xr.backends.zarr.ZarrArrayWrapper(z['array'])
lazy_arr = xr.core.indexing.LazilyIndexedArray(arr)
# just `.zarray`
var = xr.Variable(('x', 'y'), lazy_arr)
print('Variable read in ', store.get_subkeys_accessed('array'))
# now everything is read in
da = xr.DataArray(var)
print('DataArray read in ', store.get_subkeys_accessed('array'))
```  | 
                
                    {
    "url": "https://api.github.com/repos/pydata/xarray/issues/8753/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
} | 
                
                    completed | 13221727 | issue |