html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6803#issuecomment-1280764797,https://api.github.com/repos/pydata/xarray/issues/6803,1280764797,IC_kwDOAMm_X85MVut9,6213168,2022-10-17T12:15:36Z,2022-10-17T12:20:02Z,MEMBER,"```python
new_data_future = xr.apply_ufunc(
    _copy_test,
    data,
    a_x,
    ...
)
```
*instead* of using kwargs.

I've opened https://github.com/dask/distributed/issues/7140 to simplify this. With it implemented, my snippet
```python
test = np.full((20,), 30)
a = da.from_array(test)
dsk = client.scatter(dict(a.dask), broadcast=True)
a = da.Array(dsk, name=a.name, chunks=a.chunks, dtype=a.dtype, meta=a._meta, shape=a.shape)
a_x = xarray.DataArray(a, dims=[""new_z""])
```
would become
```python
test = np.full((20,), 30)
a_x = xarray.DataArray(test, dims=[""new_z""]).chunk()
a_x = client.scatter(a_x)
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1307523148
https://github.com/pydata/xarray/issues/6803#issuecomment-1280746923,https://api.github.com/repos/pydata/xarray/issues/6803,1280746923,IC_kwDOAMm_X85MVqWr,6213168,2022-10-17T12:01:17Z,2022-10-17T12:01:17Z,MEMBER,"Having said the above, your design is... contrived. There isn't, as of today, a straightforward way to scatter a local dask collection (`persist()` will push the whole thing through the scheduler and likely send it out of memory).
Workaround:
```python
test = np.full((20,), 30)
a = da.from_array(test)
dsk = client.scatter(dict(a.dask), broadcast=True)
a = da.Array(dsk, name=a.name, chunks=a.chunks, dtype=a.dtype, meta=a._meta, shape=a.shape)
a_x = xarray.DataArray(a, dims=[""new_z""])
```
Once you have a_x, you just pass it to the args (not kwargs) of apply_ufunc.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1307523148
https://github.com/pydata/xarray/issues/6803#issuecomment-1280729879,https://api.github.com/repos/pydata/xarray/issues/6803,1280729879,IC_kwDOAMm_X85MVmMX,6213168,2022-10-17T11:45:31Z,2022-10-17T11:45:31Z,MEMBER,"> This is still an issue. I noticed that the documentation of `map_blocks` states: **kwargs** ([mapping](https://docs.python.org/3/glossary.html#term-mapping)) – Passed verbatim to func after unpacking. xarray objects, if any, will not be subset to blocks. _Passing dask collections in kwargs is not allowed_.
>
> Is this the case for `apply_ufunc` as well?

test_future is not a dask collection. It's a distributed.Future, which points to an arbitrary, opaque data blob that xarray has no means to know about.

FWIW, I could reproduce the issue, where the future in the kwargs is not resolved to the data it points to as one would expect. Minimal reproducer:
```python
import distributed
import xarray

client = distributed.Client(processes=False)
x = xarray.DataArray([1, 2]).chunk()
test_future = client.scatter(""Hello World"")


def f(d, test):
    print(test)
    return d


y = xarray.apply_ufunc(
    f,
    x,
    dask='parallelized',
    output_dtypes=""float64"",
    kwargs={'test':test_future},
)
y.compute()
```
Expected print output: `Hello World`
Actual print output: ` `","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1307523148