issue_comments
8 rows where issue = 1307523148 sorted by updated_at descending
issue 1
- Passing a distributed.Future to the kwargs of apply_ufunc should resolve the future · 8 ✖
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1280786780 | https://github.com/pydata/xarray/issues/6803#issuecomment-1280786780 | https://api.github.com/repos/pydata/xarray/issues/6803 | IC_kwDOAMm_X85MV0Fc | alessioarena 33886395 | 2022-10-17T12:33:18Z | 2022-10-17T12:33:18Z | NONE | I will try that. I still find it weird that I need to wrap a numpy object into a task/xarray object to be able to send it to workers when there is dask.scatter made for exactly that purpose. Thanks for opening that issue. I do feel there is the need to revisit scatter functionality and role particularly around dynamic clusters. Having a better look at your initial comment, that may still work if you call |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Passing a distributed.Future to the kwargs of apply_ufunc should resolve the future 1307523148 | |
1280764797 | https://github.com/pydata/xarray/issues/6803#issuecomment-1280764797 | https://api.github.com/repos/pydata/xarray/issues/6803 | IC_kwDOAMm_X85MVut9 | crusaderky 6213168 | 2022-10-17T12:15:36Z | 2022-10-17T12:20:02Z | MEMBER |
I've opened https://github.com/dask/distributed/issues/7140 to simplify this. With it implemented, my snippet
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Passing a distributed.Future to the kwargs of apply_ufunc should resolve the future 1307523148 | |
1280759221 | https://github.com/pydata/xarray/issues/6803#issuecomment-1280759221 | https://api.github.com/repos/pydata/xarray/issues/6803 | IC_kwDOAMm_X85MVtW1 | alessioarena 33886395 | 2022-10-17T12:11:05Z | 2022-10-17T12:11:05Z | NONE | I'm not sure I understand the code above. In my case I have an array of approximately 300k elements that every function call needs access to. I can pass it as a kwarg in its numpy form, but once I scale up the calculation across a large dataset (many large chunks), the array gets replicated for every task, pushing the scheduler out of memory. That is why I tried to send the dataset to the cluster beforehand using scatter, but I cannot resolve the Future at the workers |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Passing a distributed.Future to the kwargs of apply_ufunc should resolve the future 1307523148 | |
1280746923 | https://github.com/pydata/xarray/issues/6803#issuecomment-1280746923 | https://api.github.com/repos/pydata/xarray/issues/6803 | IC_kwDOAMm_X85MVqWr | crusaderky 6213168 | 2022-10-17T12:01:17Z | 2022-10-17T12:01:17Z | MEMBER | Having said the above, your design is... contrived. There isn't, as of today, a straightforward way to scatter a local dask collection ( Workaround:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Passing a distributed.Future to the kwargs of apply_ufunc should resolve the future 1307523148 | |
1280743293 | https://github.com/pydata/xarray/issues/6803#issuecomment-1280743293 | https://api.github.com/repos/pydata/xarray/issues/6803 | IC_kwDOAMm_X85MVpd9 | alessioarena 33886395 | 2022-10-17T11:59:19Z | 2022-10-17T11:59:19Z | NONE | I can add that this problem is worse in a dask_gateway system, where the task just fails. With My interpretation is that the Future is resolved at the worker (or, in the case of apply_ufunc, in a thread of this worker) and embeds a reference to the Client object. The latter, however, uses a gateway connection that the worker does not understand, since it is generally the scheduler that deals with those |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Passing a distributed.Future to the kwargs of apply_ufunc should resolve the future 1307523148 | |
1280729879 | https://github.com/pydata/xarray/issues/6803#issuecomment-1280729879 | https://api.github.com/repos/pydata/xarray/issues/6803 | IC_kwDOAMm_X85MVmMX | crusaderky 6213168 | 2022-10-17T11:45:31Z | 2022-10-17T11:45:31Z | MEMBER |
test_future is not a dask collection. It's a distributed.Future, which points to an arbitrary, opaque data blob that xarray has no means to know about.

FWIW, I could reproduce the issue, where the future in the kwargs is not resolved to the data it points to as one would expect. Minimal reproducer:

```python
import distributed
import xarray

client = distributed.Client(processes=False)
x = xarray.DataArray([1, 2]).chunk()
test_future = client.scatter("Hello World")

def f(d, test):
    print(test)
    return d

y = xarray.apply_ufunc(
    f,
    x,
    dask='parallelized',
    output_dtypes="float64",
    kwargs={'test':test_future},
)
y.compute()
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Passing a distributed.Future to the kwargs of apply_ufunc should resolve the future 1307523148 | |
1264523142 | https://github.com/pydata/xarray/issues/6803#issuecomment-1264523142 | https://api.github.com/repos/pydata/xarray/issues/6803 | IC_kwDOAMm_X85LXxeG | alessioarena 33886395 | 2022-10-02T01:29:35Z | 2022-10-02T01:29:35Z | NONE | I think I may have narrowed down the problem to a limitation in dask using dask_gateway. If passing a Future to a worker, the worker will try to unpickle that Future, and as part of that unpickle the Client object passed when creating such Future. Unfortunately, in a dask_gateway context the client is behind a |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Passing a distributed.Future to the kwargs of apply_ufunc should resolve the future 1307523148 | |
1260319916 | https://github.com/pydata/xarray/issues/6803#issuecomment-1260319916 | https://api.github.com/repos/pydata/xarray/issues/6803 | IC_kwDOAMm_X85LHvSs | alessioarena 33886395 | 2022-09-28T02:53:25Z | 2022-09-28T02:53:25Z | NONE | This is still an issue.
I noticed that the documentation of Is this the case for |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Passing a distributed.Future to the kwargs of apply_ufunc should resolve the future 1307523148 |
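The thread describes a large shared array that every function call needs, and the cost of embedding it in `kwargs`. The workaround snippet posted in the thread is not preserved in this export, so the following is only a sketch of one possible alternative, not the approach proposed there: pass the shared array as an extra dask-backed positional argument with its own core dimension, kept in a single chunk. The names `add_lookup_total`, `data`, `shared` and the dimension `k` are hypothetical, and the arrays are tiny stand-ins for the real data. It runs on the default local scheduler and does not involve `distributed.Future` or dask_gateway at all.

```python
import numpy as np
import xarray as xr


def add_lookup_total(block, lookup):
    # block: one chunk of the main data
    # lookup: the full shared array (its core dimension "k" is not split)
    return block + lookup.sum()


# Main data, split into two chunks along "x".
data = xr.DataArray(np.arange(4.0), dims="x").chunk({"x": 2})

# Shared array, wrapped as a dask-backed DataArray with a single chunk.
shared = xr.DataArray(np.ones(10), dims="k").chunk({"k": -1})

result = xr.apply_ufunc(
    add_lookup_total,
    data,
    shared,
    input_core_dims=[[], ["k"]],  # "k" is consumed inside the function
    dask="parallelized",
    output_dtypes=[float],
)
values = result.compute().values
```

Because `shared` is an argument rather than a keyword value, it enters the task graph as its own chunk that the per-block tasks reference. Whether this relieves scheduler memory pressure on a given cluster setup is an assumption that would need to be checked there.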
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT, [issue] INTEGER REFERENCES [issues]([id]) ); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);