html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6084#issuecomment-1011450955,https://api.github.com/repos/pydata/xarray/issues/6084,1011450955,IC_kwDOAMm_X848SYRL,1217238,2022-01-12T21:05:59Z,2022-01-12T21:05:59Z,MEMBER,"> E.g., I _think_ skipping [this line](https://github.com/pydata/xarray/blob/6a29380008dcd790f9adfbc290affcb767c913b2/xarray/backends/api.py#L1439) would save some of the users in my original post a lot of time.
I don't think that line adds any measurable overhead. It's just telling dask to delay computation of a single function.
For sure this would be worth elaborating on in the Xarray docs! I wrote a little bit about this in the docs for Xarray-Beam: see ""One recommended pattern"" in https://xarray-beam.readthedocs.io/en/latest/read-write.html#writing-data-to-zarr","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1083621690
https://github.com/pydata/xarray/issues/6084#issuecomment-1011430150,https://api.github.com/repos/pydata/xarray/issues/6084,1011430150,IC_kwDOAMm_X848STMG,42455466,2022-01-12T20:35:44Z,2022-01-12T20:35:44Z,NONE,"Thanks @shoyer. I understand the need for the schema, but is there a need to actually generate the dask graph when all the user wants to do is initialise an empty zarr store? E.g., I *think* skipping [this line](https://github.com/pydata/xarray/blob/6a29380008dcd790f9adfbc290affcb767c913b2/xarray/backends/api.py#L1439) would save some of the users in my original post a lot of time.
Regardless, your suggestion to just create a low-overhead version of the array being initialised is probably better/cleaner than adding a specific option or method. Would it be worth adding the `xarray.zeros_like(ds)` recommendation to the [docs](https://xarray.pydata.org/en/stable/user-guide/io.html#appending-to-existing-zarr-stores)?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1083621690
https://github.com/pydata/xarray/issues/6084#issuecomment-1000628813,https://api.github.com/repos/pydata/xarray/issues/6084,1000628813,IC_kwDOAMm_X847pGJN,2448579,2021-12-24T03:17:44Z,2021-12-24T03:17:44Z,MEMBER,What metadata is being determined by computing the whole array?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1083621690
https://github.com/pydata/xarray/issues/6084#issuecomment-998357641,https://api.github.com/repos/pydata/xarray/issues/6084,998357641,IC_kwDOAMm_X847gbqJ,1217238,2021-12-21T00:00:49Z,2021-12-21T00:00:49Z,MEMBER,"The challenge is that Xarray needs _some_ way to represent the ""schema"" for the desired entire dataset. I'm very open to alternatives, but so far, the most convenient way to do this has been to load Dask arrays into an xarray.Dataset.
It's worth noting that any dask arrays with the desired chunking scheme will do -- you don't need to use the same dask arrays that you want to compute. When I do this sort of thing, I will often use `xarray.zeros_like()` to create low-overhead versions of dask arrays, e.g., in this example from Xarray-Beam:
https://github.com/google/xarray-beam/blob/0.2.0/examples/era5_climatology.py#L61-L68","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1083621690