Comments by GitHub user 33886395 on https://github.com/pydata/xarray/issues/6803 (issue id 1307523148), shown oldest first.

**2022-09-28** — https://github.com/pydata/xarray/issues/6803#issuecomment-1260319916

This is still an issue. I noticed that the documentation of `map_blocks` states:

> **kwargs** ([mapping](https://docs.python.org/3/glossary.html#term-mapping)) – Passed verbatim to func after unpacking. xarray objects, if any, will not be subset to blocks. _Passing dask collections in kwargs is not allowed._

Is this the case for `apply_ufunc` as well? If yes, then it is not documented. Is there another recommended way to pass data to workers without clogging the scheduler for this application?

**2022-10-02** — https://github.com/pydata/xarray/issues/6803#issuecomment-1264523142

I think I may have narrowed the problem down to a limitation of dask when used with dask_gateway. When a Future is passed to a worker, the worker will try to unpickle that Future and, as part of that, to unpickle the Client object that was used to create it. Unfortunately, in a dask_gateway context the client sits behind a `gateway` connection that the worker does not understand, since workers normally do not have to deal with a gateway at all.

In my case I do not get any error message, just the task failing and retrying over and over, but by fiddling around I managed to reproduce the same error as in this post: https://stackoverflow.com/questions/70775315/scattering-data-to-dask-cluster-workers-unknown-address-scheme-gateway

**2022-10-17, 11:59** — https://github.com/pydata/xarray/issues/6803#issuecomment-1280743293

I can add that this problem is compounded in a dask_gateway system, where the task simply fails. With `apply_ufunc` I never received an error, but in a similar context I obtained something very close to https://github.com/dask/dask-gateway/issues/404. My interpretation is that the Future is resolved at the worker (or, in the case of `apply_ufunc`, in a thread of that worker) and embeds a reference to the Client object. The Client, however, uses a gateway connection that the worker does not understand, since it is generally the scheduler that deals with those connections.

**2022-10-17, 12:11** — https://github.com/pydata/xarray/issues/6803#issuecomment-1280759221

I'm not sure I understand the code above. In my case I have an array of approximately 300k elements that each and every function call needs access to. I can pass it as a kwarg in its numpy form, but once I scale the calculation up across a large dataset (many large chunks), that array gets replicated for every task, pushing the scheduler out of memory. That is why I tried to send the array to the cluster beforehand using scatter, but I cannot resolve the resulting Future at the workers.
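Below is a minimal reconstruction of the pattern being attempted in the comment above: scatter the auxiliary array once, pass the resulting Future through `kwargs`, and try to resolve it inside the applied function. This is an illustrative sketch, not code from the issue — the names `calibrate` and `lookup`, the array shapes, and the local `Client` are all assumed. Whether the `result()` call succeeds on the worker is precisely what this thread is about.

```python
import numpy as np
import xarray as xr
from dask.distributed import Client

client = Client()  # with dask_gateway this would instead come from the gateway cluster

# Hypothetical stand-in for the large auxiliary array every call needs.
lookup = np.random.rand(300_000)
# Ship it to the workers once, instead of copying it into every task.
lookup_future = client.scatter(lookup, broadcast=True)

def calibrate(block, lookup_future=None):
    # Attempt to resolve the Future on the worker. Behind dask_gateway the
    # unpickled Future carries a Client whose gateway address the worker
    # cannot use, and the task fails (see the Stack Overflow link above).
    lookup = lookup_future.result()
    return block + lookup.mean()

arr = xr.DataArray(np.zeros((4, 6)), dims=("x", "y")).chunk({"x": 2})
out = xr.apply_ufunc(
    calibrate,
    arr,
    kwargs={"lookup_future": lookup_future},
    dask="parallelized",
    output_dtypes=[arr.dtype],
)
```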
**2022-10-17, 12:33** — https://github.com/pydata/xarray/issues/6803#issuecomment-1280786780

I will try that. I still find it weird that I need to wrap a numpy object into a dask/xarray object to be able to send it to workers when dask's `scatter` was made for exactly that purpose. Thanks for opening that issue. I do feel there is a need to revisit scatter's functionality and role, particularly around dynamic clusters.

Having a better look at your initial comment, that may still work if you call the `Future.result()` method inside the applied function, as in the sketch above. In theory that should retrieve the data associated with the Future, in that case "Hello World". However, in a dask_gateway setup it will fail.
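For comparison, here is a sketch of the wrapping workaround mentioned in the last comment, again with assumed names and shapes rather than code from the issue: put the auxiliary array into a single-chunk dask array and pass it to `apply_ufunc` as an extra positional argument with its own core dimension. This sidesteps `kwargs` entirely, consistent with the `map_blocks` docstring quoted earlier (dask collections may not go in `kwargs`, suggesting they be passed as arguments instead).

```python
import dask.array
import numpy as np
import xarray as xr

lookup = np.random.rand(300_000)
# A single chunk, so the whole array travels as one task dependency
# instead of being re-serialized into every task as a raw numpy kwarg.
lookup_dask = dask.array.from_array(lookup, chunks=-1)

def calibrate(block, lookup):
    # Inside the applied function, ``lookup`` arrives as a plain numpy
    # array; no Future (and hence no Client) is involved.
    return block + lookup.mean()

arr = xr.DataArray(np.zeros((4, 6)), dims=("x", "y")).chunk({"x": 2})
out = xr.apply_ufunc(
    calibrate,
    arr,
    lookup_dask,
    input_core_dims=[[], ["z"]],  # "z" marks the lookup axis as a core dim
    dask="parallelized",
    output_dtypes=[arr.dtype],
)
result = out.compute()
```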