Comments on pydata/xarray PR #1811 (issue 286542795), posted by user 1872600 (author association: NONE).

---

**2018-04-17T12:12:15Z** (edited 2018-04-17T12:15:19Z)
https://github.com/pydata/xarray/pull/1811#issuecomment-381969631

@jhamman, I'm trying out `compute=False` with this code:

```python
# Write National Water Model data to Zarr
from dask.distributed import Client
import pandas as pd
import xarray as xr
import s3fs
import zarr

if __name__ == '__main__':
    client = Client()

    root = '/projects/water/nwm/data/forcing_short_range/'  # local files
    # root = 'http://tds.renci.org:8080/thredds/dodsC/nwm/forcing_short_range/'  # OPeNDAP

    bucket_endpoint = 'https://s3.us-west-1.amazonaws.com/'
    # bucket_endpoint = 'https://iu.jetstream-cloud.org:8080'

    f_zarr = 'rsignell/nwm/test_week'

    dates = pd.date_range(start='2018-04-01T00:00', end='2018-04-07T23:00', freq='H')
    urls = ['{}{}/nwm.t{}z.short_range.forcing.f001.conus.nc'.format(
        root, a.strftime('%Y%m%d'), a.strftime('%H')) for a in dates]

    ds = xr.open_mfdataset(urls, concat_dim='time', lock=True)
    ds = ds.drop(['ProjectionCoordinateSystem'])

    fs = s3fs.S3FileSystem(anon=False, client_kwargs=dict(endpoint_url=bucket_endpoint))
    d = s3fs.S3Map(f_zarr, s3=fs)

    compressor = zarr.Blosc(cname='zstd', clevel=3, shuffle=2)
    encoding = {vname: {'compressor': compressor} for vname in ds.data_vars}

    delayed_store = ds.to_zarr(store=d, mode='w', encoding=encoding, compute=False)
    persist_store = delayed_store.persist(retries=100)
```

and after 20 seconds or so, the process dies with this error:

```python-traceback
/home/rsignell/my-conda-envs/zarr/lib/python3.6/site-packages/distributed/worker.py:742: UserWarning: Large object of size 1.23 MB detected in task graph: (
```

---

**2018-04-18T15:11:02Z** (edited 2018-04-18T15:14:12Z)
https://github.com/pydata/xarray/pull/1811#issuecomment-382421609

@jhamman, I tried the same code with a single-threaded scheduler:

```python
...
delayed_store = ds.to_zarr(store=d, mode='w', encoding=encoding, compute=False)
persist_store = delayed_store.persist(retries=100, get=dask.local.get_sync)
```

and it ran to completion with no errors (taking about 2 hours to write 100 GB to Zarr). What should I try next?

---

**2018-04-18T17:30:25Z** (edited 2018-04-18T17:32:21Z)
https://github.com/pydata/xarray/pull/1811#issuecomment-382466626

@jhamman, I was just using `client = Client()`. Should I be using `LocalCluster` instead? (There is no Kubernetes on this JupyterHub.) Also, is there a better place to have this sort of discussion, or is it okay here?
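
---

On the `Client()`-vs-`LocalCluster` question in the last comment: as far as I know, `Client()` with no arguments already starts an implicit `LocalCluster`, so the explicit form mainly buys control over worker and thread counts. A minimal sketch; the worker counts here are illustrative, not taken from the thread:

```python
# Client() with no arguments spins up an implicit LocalCluster, so the two
# forms below are essentially equivalent; constructing the cluster
# explicitly just makes worker/thread counts visible and tunable.
# n_workers and threads_per_worker are illustrative, not a recommendation.
from dask.distributed import Client, LocalCluster

cluster = LocalCluster(n_workers=4, threads_per_worker=1)
client = Client(cluster)
```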
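
For readers following the single-threaded workaround above, here is a minimal, self-contained sketch of the `compute=False` pattern using a synthetic dataset and a local directory store instead of S3 (the variable name, array shapes, and chunk sizes are invented for illustration). Note that `get=dask.local.get_sync` was how dask of that era (~0.17) selected the single-threaded scheduler; newer dask releases spell it `scheduler='synchronous'`:

```python
# Minimal sketch of the compute=False pattern from the thread, writing to a
# local directory so it runs without S3 credentials. The dataset is
# synthetic and purely illustrative.
import numpy as np
import xarray as xr

# A small dask-backed dataset standing in for the NWM forcing data.
ds = xr.Dataset(
    {'t2m': (('time', 'y', 'x'), np.random.rand(24, 64, 64))},
).chunk({'time': 6})

# compute=False returns a dask Delayed object instead of writing eagerly,
# so the actual store step can be scheduled (or retried) later.
delayed_store = ds.to_zarr('test_week.zarr', mode='w', compute=False)

# Force single-threaded execution, mirroring the workaround that succeeded
# in the thread; scheduler='synchronous' is the modern equivalent of the
# old get=dask.local.get_sync keyword.
delayed_store.compute(scheduler='synchronous')
```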