html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4209#issuecomment-656403217,https://api.github.com/repos/pydata/xarray/issues/4209,656403217,MDEyOklzc3VlQ29tbWVudDY1NjQwMzIxNw==,2448579,2020-07-09T23:43:17Z,2020-07-09T23:43:17Z,MEMBER,"Here's an alternative `map_blocks` solution:
``` python
import xarray as xr


def write_block(ds, t0):
    if len(ds.time) > 0:
        fname = (ds.time[0] - t0).values.astype(""timedelta64[h]"").astype(int)
        ds.to_netcdf(f""temp/file-{fname:06d}.nc"")
    # dummy return
    return ds.time


ds = xr.tutorial.open_dataset(""air_temperature"", chunks={""time"": 100})
ds.map_blocks(write_block, kwargs=dict(t0=ds.time[0])).compute(scheduler=""processes"")
```
There are two workarounds here, though:
1. The user function always has to return something.
2. We can't provide `template=ds.time` because it has no chunk information and `ds.time.chunk({""time"": 100})` silently does nothing because it is an IndexVariable. So the user function still needs the `len(ds.time) > 0` workaround.
I think a cleaner API might be `dask.compute([write_block(block) for block in ds.to_delayed()])`, where `ds.to_delayed()` yields a sequence of tasks, each of which evaluates to a Dataset wrapping one block of the underlying arrays.
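For illustration, here is a rough sketch of how that proposed pattern could be emulated today with plain `dask.delayed`, slicing the dataset by hand along its 100-element time chunks (`ds.to_delayed()` itself does not exist yet; the helper below mirrors `write_block` from the example above):
``` python
import dask
import xarray as xr

ds = xr.tutorial.open_dataset(""air_temperature"", chunks={""time"": 100})


def write_block(block, t0):
    # same logic as above, but called on an eagerly sliced block
    fname = (block.time[0] - t0).values.astype(""timedelta64[h]"").astype(int)
    block.to_netcdf(f""temp/file-{fname:06d}.nc"")


# one delayed task per 100-element slice along time, mimicking the block structure
tasks = [
    dask.delayed(write_block)(ds.isel(time=slice(i, i + 100)), ds.time[0])
    for i in range(0, ds.sizes[""time""], 100)
]
dask.compute(*tasks, scheduler=""processes"")
```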
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653442225
https://github.com/pydata/xarray/issues/4209#issuecomment-656399210,https://api.github.com/repos/pydata/xarray/issues/4209,656399210,MDEyOklzc3VlQ29tbWVudDY1NjM5OTIxMA==,1217238,2020-07-09T23:28:30Z,2020-07-09T23:28:30Z,MEMBER,"> > The way `compute=False` currently works may be a little confusing. It doesn't actually delay creating files, it just delays writing the array data.
>
> Interesting... I always assumed that all operations (including file creation) were delayed. So, this is a feature and not a bug then?
Well, there is certainly a case for file creation also being lazy -- it definitely is more intuitive! This was more of an oversight than an intentional omission. Metadata generally needs to be written from a single process anyway, so we never got around to doing it with Dask.
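To make the current behavior concrete, here is a minimal sketch (the file name and chunk size are just illustrative): with `compute=False`, `to_netcdf` creates the file and writes its metadata immediately, and only the array data is deferred until the returned delayed object is computed.
``` python
import os
import xarray as xr

ds = xr.tutorial.open_dataset(""air_temperature"", chunks={""time"": 100})

# the file (and its metadata) is created right away...
delayed_write = ds.to_netcdf(""air.nc"", compute=False)
print(os.path.exists(""air.nc""))  # True

# ...while the array data is only written when the delayed object is computed
delayed_write.compute()
```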
That said, there are also some legitimate use cases where it is nice to be able to eagerly write only the metadata, without any array data. This is what we were proposing to do with `compute=False` in `to_zarr`:
https://github.com/pydata/xarray/pull/4035","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653442225
https://github.com/pydata/xarray/issues/4209#issuecomment-656395121,https://api.github.com/repos/pydata/xarray/issues/4209,656395121,MDEyOklzc3VlQ29tbWVudDY1NjM5NTEyMQ==,13301940,2020-07-09T23:14:15Z,2020-07-09T23:14:15Z,MEMBER,"> The way `compute=False` currently works may be a little confusing. It doesn't actually delay creating files, it just delays writing the array data.
Interesting... I always assumed that all operations (including file creation) were delayed. So, this is a feature and not a bug then?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653442225
https://github.com/pydata/xarray/issues/4209#issuecomment-656388313,https://api.github.com/repos/pydata/xarray/issues/4209,656388313,MDEyOklzc3VlQ29tbWVudDY1NjM4ODMxMw==,1217238,2020-07-09T22:50:16Z,2020-07-09T22:50:16Z,MEMBER,"The way `compute=False` currently works may be a little confusing. It doesn't actually delay creating files, it just delays writing the array data.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,653442225