html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6803#issuecomment-1280764797,https://api.github.com/repos/pydata/xarray/issues/6803,1280764797,IC_kwDOAMm_X85MVut9,6213168,2022-10-17T12:15:36Z,2022-10-17T12:20:02Z,MEMBER,"```python
new_data_future = xr.apply_ufunc(
    _copy_test,
    data,
    a_x,
    ...
)
```
*instead* of using kwargs.
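For illustration only, here is a minimal sketch of the positional-argument form that runs on the default local dask scheduler, without a distributed client or scattering. The body of `_copy_test`, the dimension name `x`, and the `input_core_dims` setting are assumptions made up for this sketch, not taken from the original report.
```python
import numpy as np
import xarray as xr


def _copy_test(block, extra):
    # Hypothetical body: both positional inputs arrive as plain NumPy blocks,
    # because apply_ufunc resolves dask-backed arguments before calling us.
    assert isinstance(block, np.ndarray)
    assert isinstance(extra, np.ndarray)
    return block + extra.sum()


# Two dask-backed inputs, each held in a single chunk.
data = xr.DataArray(np.arange(4.0), dims=['x']).chunk()
a_x = xr.DataArray(np.full((20,), 30), dims=['new_z']).chunk()

result = xr.apply_ufunc(
    _copy_test,
    data,
    a_x,  # passed positionally, not through kwargs
    input_core_dims=[[], ['new_z']],  # assumption: new_z is consumed by the function
    dask='parallelized',
    output_dtypes=[float],
)
values = result.compute().values
```
Here `a_x` is treated like any other input array, so the function sees its data rather than a reference to it.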
I've opened https://github.com/dask/distributed/issues/7140 to simplify this. With it implemented, my snippet
```python
test = np.full((20,), 30)
a = da.from_array(test)
dsk = client.scatter(dict(a.dask), broadcast=True)
a = da.Array(dsk, name=a.name, chunks=a.chunks, dtype=a.dtype, meta=a._meta, shape=a.shape)
a_x = xarray.DataArray(a, dims=[""new_z""])
```
would become
```python
test = np.full((20,), 30)
a_x = xarray.DataArray(test, dims=[""new_z""]).chunk()
a_x = client.scatter(a_x)
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1307523148
https://github.com/pydata/xarray/issues/6803#issuecomment-1280746923,https://api.github.com/repos/pydata/xarray/issues/6803,1280746923,IC_kwDOAMm_X85MVqWr,6213168,2022-10-17T12:01:17Z,2022-10-17T12:01:17Z,MEMBER,"Having said the above, your design is... contrived.
There isn't, as of today, a straightforward way to scatter a local dask collection (`persist()` will push the whole thing through the scheduler and likely send it out of memory).
Workaround:
```python
test = np.full((20,), 30)
a = da.from_array(test)
dsk = client.scatter(dict(a.dask), broadcast=True)
a = da.Array(dsk, name=a.name, chunks=a.chunks, dtype=a.dtype, meta=a._meta, shape=a.shape)
a_x = xarray.DataArray(a, dims=[""new_z""])
```
Once you have `a_x`, pass it to `apply_ufunc` as a positional argument (not through `kwargs`).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1307523148
https://github.com/pydata/xarray/issues/6803#issuecomment-1280729879,https://api.github.com/repos/pydata/xarray/issues/6803,1280729879,IC_kwDOAMm_X85MVmMX,6213168,2022-10-17T11:45:31Z,2022-10-17T11:45:31Z,MEMBER,"> This is still an issue. I noticed that the documentation of `map_blocks` states: **kwargs** ([mapping](https://docs.python.org/3/glossary.html#term-mapping)) – Passed verbatim to func after unpacking. xarray objects, if any, will not be subset to blocks. _Passing dask collections in kwargs is not allowed_.
>
> Is this the case for `apply_ufunc` as well?
`test_future` is not a dask collection. It's a `distributed.Future`, which points to an arbitrary, opaque data blob that xarray has no way of inspecting.
FWIW, I could reproduce the issue: the future in the `kwargs` is not resolved to the data it points to, contrary to what one would expect.
Minimal reproducer:
```python
import distributed
import xarray
client = distributed.Client(processes=False)
x = xarray.DataArray([1, 2]).chunk()
test_future = client.scatter(""Hello World"")
def f(d, test):
    print(test)
    return d
y = xarray.apply_ufunc(
    f,
    x,
    dask='parallelized',
    output_dtypes=""float64"",
    kwargs={'test': test_future},
)
y.compute()
```
Expected print output: `Hello World`
Actual print output: `
`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1307523148