issues: 851391441

id: 851391441
node_id: MDU6SXNzdWU4NTEzOTE0NDE=
number: 5115
title: `to_zarr()` dramatically alters dask graph
user: 6582745
state: closed
locked: 0
comments: 4
created_at: 2021-04-06T12:50:04Z
updated_at: 2022-04-19T09:09:58Z
closed_at: 2022-04-19T03:46:55Z
author_association: NONE

What happened: The dask graph before a to_zarr() call differs wildly from the dask graph after a to_zarr() call.

What you expected to happen: I would expect to_zarr() to add layers/dependencies to the graph as normal.

Minimal Complete Verifiable Example:

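```python
import xarray
import dask.array as da
from pprint import pprint

if __name__ == "__main__":
    # Two-chunk dask array with a single "add" layer on top of "ones"
    arr = da.ones((2,), chunks=(1,)) + 1

    xds = xarray.Dataset({"arr": (("x",), arr)})

    # Graph of the array before writing
    pprint(xds.arr.data.__dask_graph__().layers)
    pprint(xds.arr.data.__dask_graph__().dependencies)

    # compute=False defers the write; note that `xds` is rebound to the return value
    xds = xds.to_zarr("out.zarr", mode="w", compute=False)

    # Reproduced as posted: `data__dask_graph__` has no dot before `__dask_graph__`
    pprint(xds.arr.data__dask_graph__().layers)
    pprint(xds.arr.data.__dask_graph__().dependencies)
```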

Anything else we need to know?: On my system the above will print the following before the to_zarr() call:

```
# layers
{'add-1118924c7d3d06d9d07bcca6afde2c7e': Blockwise<(('ones-76dd1e004518465cc97010eea7a88ebc', ('.0',)), (1, None)) -> add-1118924c7d3d06d9d07bcca6afde2c7e>,
 'ones-76dd1e004518465cc97010eea7a88ebc': Blockwise<(('blockwise-create-ones-76dd1e004518465cc97010eea7a88ebc', (0,)),) -> ones-76dd1e004518465cc97010eea7a88ebc>}

# deps
{'add-1118924c7d3d06d9d07bcca6afde2c7e': {'ones-76dd1e004518465cc97010eea7a88ebc'},
 'ones-76dd1e004518465cc97010eea7a88ebc': set()}
```

and

```
# layers
Delayed('getattr-bf22b6050bac2d8ef0a78589b04365f3')

# deps
{139853652717696: set(),
 139853652760176: set(),
 '_finalize_store-faeab92e-4e8d-4155-a915-cbfe8addae8e': {'store-648c67ef-96d5-11eb-ae7e-fc77746741ed'},
 'getattr-84732ba2ce83b0568edc0dad83f2d611': {'getattr-c0220fc5eded903243bac6f4a8067a7b'},
 'getattr-c0220fc5eded903243bac6f4a8067a7b': {'_finalize_store-faeab92e-4e8d-4155-a915-cbfe8addae8e'}}
```

after. This seems a little strange, as the layers describing `arr` have disappeared. This is problematic when attempting to do any post-processing/optimization/annotation on the graph.

Environment:

Output of `xr.show_versions()`:

INSTALLED VERSIONS
------------------
commit: None
python: 3.8.8 (default, Feb 20 2021, 21:09:14) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-7648-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: None
libnetcdf: None

xarray: 0.17.0
pandas: 1.2.3
numpy: 1.19.5
scipy: 1.6.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: 2.6.1
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2021.03.0+49.gf4132551
distributed: 2021.03.0+29.g3b8b97e3
matplotlib: 3.3.4
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 54.1.2
pip: 21.0.1
conda: None
pytest: 6.2.2
IPython: 7.21.0
sphinx: None

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5115/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
