html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1836#issuecomment-361466652,https://api.github.com/repos/pydata/xarray/issues/1836,361466652,MDEyOklzc3VlQ29tbWVudDM2MTQ2NjY1Mg==,2443309,2018-01-30T03:35:07Z,2018-01-30T03:35:07Z,MEMBER,"I tried the above example with the multiprocessing and distributed schedulers. With the multiprocessing scheduler, I can reproduce the error described above. With the distributed scheduler, no error is encountered.
```Python
In [4]: import xarray as xr
...: import numpy as np
...: import dask.multiprocessing
...:
...: from dask.distributed import Client
...:
...: client = Client()
...: print(client)
...:
...: # Generate dummy data and build xarray dataset
...: mat = np.random.rand(10, 90, 90)
...: ds = xr.Dataset(data_vars={'foo': (('time', 'x', 'y'), mat)})
...:
...: # Write dataset to netcdf without compression
...: ds.to_netcdf('dummy_data_3d.nc')
...: # Write with zlib compression
...: ds.to_netcdf('dummy_data_3d_with_compression.nc',
...:              encoding={'foo': {'zlib': True}})
...: # Write data as int16 with scale factor applied
...: ds.to_netcdf('dummy_data_3d_with_scale_factor.nc',
...:              encoding={'foo': {'dtype': 'int16',
...:                                'scale_factor': 0.01,
...:                                '_FillValue': -9999}})
...:
...: # Load data from netCDF files
...: ds_vanilla = xr.open_dataset('dummy_data_3d.nc', chunks={'time': 1})
...: ds_scaled = xr.open_dataset('dummy_data_3d_with_scale_factor.nc', chunks={'time': 1})
...: ds_compressed = xr.open_dataset('dummy_data_3d_with_compression.nc', chunks={'time': 1})
...:
...: # Do computation; with the Client above active, this runs on the distributed scheduler
...: foo = ds_vanilla.foo.mean(dim=['x', 'y']).compute()
...: foo = ds_scaled.foo.mean(dim=['x', 'y']).compute()
...: foo = ds_compressed.foo.mean(dim=['x', 'y']).compute()
```
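
For anyone repeating this comparison, here is a minimal sketch of naming the scheduler explicitly on each `compute` call rather than relying on whichever default is active. It assumes a recent dask release in which `compute` accepts a `scheduler` keyword, and it uses a plain dask array instead of the netCDF files above, so it only illustrates scheduler selection and does not reproduce the error. The helper name `mean_over_xy` is hypothetical.

```python
import dask.array as da
import numpy as np


def mean_over_xy(mat, scheduler):
    # Hypothetical helper: average a (time, x, y) array over x and y,
    # one chunk per time step, on the named dask scheduler.
    arr = da.from_array(mat, chunks=(1,) + mat.shape[1:])
    return arr.mean(axis=(1, 2)).compute(scheduler=scheduler)


if __name__ == '__main__':
    mat = np.random.rand(10, 90, 90)
    # 'synchronous' and 'threads' run in the current process; adding
    # 'processes' to this tuple would select the multiprocessing scheduler.
    for name in ('synchronous', 'threads'):
        result = mean_over_xy(mat, name)
        print(name, result.shape)
```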
----
I personally don't have any use cases where the multiprocessing scheduler would be preferable to the distributed scheduler, but I have been working on improving I/O performance and stability with xarray and dask lately. If anyone would like to work on this, I'd gladly help get it cleaned up, or help reach a more definitive answer on whether this can or should work.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,289342234