issue_comments


3 rows where author_association = "MEMBER", issue = 1307523148 and user = 6213168 sorted by updated_at descending


Filters: user crusaderky · issue "Passing a distributed.Future to the kwargs of apply_ufunc should resolve the future" · author_association MEMBER

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1280764797 https://github.com/pydata/xarray/issues/6803#issuecomment-1280764797 https://api.github.com/repos/pydata/xarray/issues/6803 IC_kwDOAMm_X85MVut9 crusaderky 6213168 2022-10-17T12:15:36Z 2022-10-17T12:20:02Z MEMBER

    new_data_future = xr.apply_ufunc(
        _copy_test,
        data,
        a_x,
        ...
    )

instead of using kwargs.

I've opened https://github.com/dask/distributed/issues/7140 to simplify this. With it implemented, my snippet

    test = np.full((20,), 30)
    a = da.from_array(test)
    dsk = client.scatter(dict(a.dask), broadcast=True)
    a = da.Array(dsk, name=a.name, chunks=a.chunks, dtype=a.dtype, meta=a._meta, shape=a.shape)
    a_x = xarray.DataArray(a, dims=["new_z"])

would become

    test = np.full((20,), 30)
    a_x = xarray.DataArray(test, dims=["new_z"]).chunk()
    a_x = client.scatter(a_x)

1280746923 https://github.com/pydata/xarray/issues/6803#issuecomment-1280746923 https://api.github.com/repos/pydata/xarray/issues/6803 IC_kwDOAMm_X85MVqWr crusaderky 6213168 2022-10-17T12:01:17Z 2022-10-17T12:01:17Z MEMBER

Having said the above, your design is... contrived.

There isn't, as of today, a straightforward way to scatter a local dask collection (persist() will push the whole thing through the scheduler and likely send it out of memory).

Workaround:

    test = np.full((20,), 30)
    a = da.from_array(test)
    dsk = client.scatter(dict(a.dask), broadcast=True)
    a = da.Array(dsk, name=a.name, chunks=a.chunks, dtype=a.dtype, meta=a._meta, shape=a.shape)
    a_x = xarray.DataArray(a, dims=["new_z"])

Once you have a_x, you just pass it to the args (not kwargs) of apply_ufunc.
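
To illustrate what "args, not kwargs" means for the call shape, here is a minimal sketch that uses plain NumPy-backed arrays so it runs without a cluster. The body of `_copy_test` and the contents of `data` are hypothetical stand-ins, since the original function is not shown in this thread.

```python
import numpy as np
import xarray as xr


def _copy_test(block, extra):
    # Hypothetical body: the real _copy_test is not shown in the thread.
    # Both inputs arrive as plain arrays because they were passed positionally.
    return block + extra


# Stand-ins for the user's objects (assumed shapes and values).
data = xr.DataArray(np.zeros(20), dims=["new_z"])
a_x = xr.DataArray(np.full((20,), 30), dims=["new_z"])

# a_x is passed as a positional argument, right after data, instead of
# through kwargs={...}; apply_ufunc unwraps and aligns positional inputs.
result = xr.apply_ufunc(_copy_test, data, a_x)
print(result.dims, float(result.sum()))
```

With chunked inputs and `dask="parallelized"` the call has the same shape; only the positional inputs are handled as dask collections.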

1280729879 https://github.com/pydata/xarray/issues/6803#issuecomment-1280729879 https://api.github.com/repos/pydata/xarray/issues/6803 IC_kwDOAMm_X85MVmMX crusaderky 6213168 2022-10-17T11:45:31Z 2022-10-17T11:45:31Z MEMBER

> This is still an issue. I noticed that the documentation of `map_blocks` states: kwargs (mapping) – Passed verbatim to func after unpacking. xarray objects, if any, will not be subset to blocks. Passing dask collections in kwargs is not allowed.
>
> Is this the case for `apply_ufunc` as well?

test_future is not a dask collection. It's a distributed.Future, which points to an arbitrary, opaque data blob that xarray has no means to know about.

FWIW, I could reproduce the issue, where the future in the kwargs is not resolved to the data it points to as one would expect. Minimal reproducer:

```python
import distributed
import xarray

client = distributed.Client(processes=False)
x = xarray.DataArray([1, 2]).chunk()
test_future = client.scatter("Hello World")

def f(d, test):
    print(test)
    return d

y = xarray.apply_ufunc(
    f,
    x,
    dask='parallelized',
    output_dtypes="float64",
    kwargs={'test': test_future},
)
y.compute()
```

Expected print output: `Hello World`

Actual print output: `<Future: finished, type: str, key: str-b012273bcde56eadf364cd3ce9b4ca26>`
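
Until futures in kwargs are resolved automatically, one possible stopgap for small payloads is to resolve them on the client before building the kwargs. The sketch below is an assumption-laden illustration, not part of xarray or distributed: `resolve_futures` is a hypothetical helper, and `concurrent.futures.Future` stands in for `distributed.Future` (both expose `.result()`) so that the snippet runs without a cluster.

```python
from concurrent.futures import Future


def resolve_futures(kwargs, future_types=(Future,)):
    """Return a copy of kwargs with each future replaced by its result.

    Hypothetical helper. With a real cluster one would pass
    future_types=(distributed.Future,) instead of the stdlib default.
    """
    resolved = {}
    for name, value in kwargs.items():
        if isinstance(value, future_types):
            # Blocks until the future is finished and fetches its value.
            resolved[name] = value.result()
        else:
            resolved[name] = value
    return resolved


# Stand-in for test_future = client.scatter("Hello World").
test_future = Future()
test_future.set_result("Hello World")

kwargs = resolve_futures({"test": test_future, "scale": 2})
print(kwargs)  # {'test': 'Hello World', 'scale': 2}
```

Note that calling `.result()` on a `distributed.Future` gathers the data back to the client, so this only makes sense for small objects such as the string in the reproducer.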



CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);