issue_comments

5 rows where issue = 420139027 sorted by updated_at descending

user 4

  • shoyer 2
  • ttung 1
  • mrocklin 1
  • dcherian 1

author_association 2

  • MEMBER 4
  • CONTRIBUTOR 1

issue 1

  • can the callables of apply_ufunc + dask get a typed/labeled array · 5 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
542257062 https://github.com/pydata/xarray/issues/2807#issuecomment-542257062 https://api.github.com/repos/pydata/xarray/issues/2807 MDEyOklzc3VlQ29tbWVudDU0MjI1NzA2Mg== dcherian 2448579 2019-10-15T15:01:08Z 2019-10-15T15:01:08Z MEMBER

@ttung xarray 0.14.0 has a new map_blocks function that does what you want. It currently has some limitations on what the user function can do.
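As a minimal sketch of what this looks like in practice (assuming xarray >= 0.14.0 with dask installed; the function name `demean_along_x` and the sample data are made up for illustration), `xr.map_blocks` hands each chunk to the user function as a labeled `DataArray`, so named dimensions can be used inside it:

```python
import numpy as np
import xarray as xr


def demean_along_x(block: xr.DataArray) -> xr.DataArray:
    # `block` arrives as a labeled DataArray, so dimensions can be named.
    return block - block.mean("x")


# Illustrative data, chunked along "time" only so each block holds all of "x".
da = xr.DataArray(
    np.arange(12.0).reshape(4, 3),
    dims=("time", "x"),
    coords={"time": np.arange(4), "x": [10, 20, 30]},
    name="signal",
).chunk({"time": 2})

lazy = xr.map_blocks(demean_along_x, da)  # builds a dask graph, no compute yet
result = lazy.compute()
```

Because no `template` is passed here, xarray has to infer the structure of the output itself, which is one of the places where the limitations mentioned above can show up.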

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can the callables of apply_ufunc + dask get a typed/labeled array 420139027
472145306 https://github.com/pydata/xarray/issues/2807#issuecomment-472145306 https://api.github.com/repos/pydata/xarray/issues/2807 MDEyOklzc3VlQ29tbWVudDQ3MjE0NTMwNg== shoyer 1217238 2019-03-12T19:21:26Z 2019-03-12T19:21:26Z MEMBER

> I understand that there might be some challenges with returning xarray objects, but it seems like taking xarray objects should be very straightforward. Anything problematic about that?

This would probably be fine as an opt-in option. I'm a little worried that this would be a confusing model for users -- we don't have any other functions that work like this.

> Maybe something similar would work here? Xarray would construct a dummy Xarray chunk, apply the user defined function onto that chunk, and then extrapolate metadata out from there somehow.

Yes, this is another possibility, though with xarray there is quite a bit of metadata to extrapolate! We do something similar already in groupby.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can the callables of apply_ufunc + dask get a typed/labeled array 420139027
472141327 https://github.com/pydata/xarray/issues/2807#issuecomment-472141327 https://api.github.com/repos/pydata/xarray/issues/2807 MDEyOklzc3VlQ29tbWVudDQ3MjE0MTMyNw== mrocklin 306380 2019-03-12T19:09:58Z 2019-03-12T19:09:58Z MEMBER

> The challenge is that with dask's lazy evaluation, we don't know the structure of the returned objects until after evaluating the wrapped functions. So we can't rebuild xarray objects unless we require users to redundantly specify all the coordinates and attributes of the return values.

Typically in Dask we run the user defined function on an empty version of the data and hope that it provides an appropriately shaped output. If it fails during this process, we ask the user to provide sufficient information for us to populate metadata. Maybe something similar would work here? Xarray would construct a dummy Xarray chunk, apply the user defined function onto that chunk, and then extrapolate metadata out from there somehow.

I'm likely glossing over several important details, but hopefully the general gist of what I'm trying to convey above is somewhat sensible, even if not doable.
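The "run it on an empty input" idea can be sketched with plain NumPy. This is only an illustration of the approach described above, not dask's actual implementation; the helper name `infer_output_meta` and its return format are hypothetical.

```python
import numpy as np


def infer_output_meta(func, dtype, ndim):
    """Call `func` on a zero-sized array and report what came back.

    Returns None when the function cannot handle empty input, in which
    case the caller would have to supply the metadata explicitly.
    """
    dummy = np.empty((0,) * ndim, dtype=dtype)
    try:
        out = np.asarray(func(dummy))
    except Exception:
        return None
    return {"dtype": out.dtype, "ndim": out.ndim}


# A reduction over the last axis: inference succeeds.
meta = infer_output_meta(lambda a: a.astype("float32").sum(axis=-1), "int64", 2)

# Indexing the first element fails on empty input: inference gives up.
failed = infer_output_meta(lambda a: a[0], "int64", 1)
```

For xarray objects the same trick would also have to recover dimension names, coordinates, and attributes, which is where the extra difficulty comes from.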

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can the callables of apply_ufunc + dask get a typed/labeled array 420139027
472139607 https://github.com/pydata/xarray/issues/2807#issuecomment-472139607 https://api.github.com/repos/pydata/xarray/issues/2807 MDEyOklzc3VlQ29tbWVudDQ3MjEzOTYwNw== ttung 280924 2019-03-12T19:04:59Z 2019-03-12T19:04:59Z CONTRIBUTOR

I understand that there might be some challenges with returning xarray objects, but it seems like taking xarray objects should be very straightforward. Anything problematic about that?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can the callables of apply_ufunc + dask get a typed/labeled array 420139027
472129555 https://github.com/pydata/xarray/issues/2807#issuecomment-472129555 https://api.github.com/repos/pydata/xarray/issues/2807 MDEyOklzc3VlQ29tbWVudDQ3MjEyOTU1NQ== shoyer 1217238 2019-03-12T18:37:03Z 2019-03-12T18:37:03Z MEMBER

In the first version of apply_ufunc, I experimented with applying functions that take and return xarray objects.
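For context, a small sketch of the behavior the issue is asking about: the callable passed to `apply_ufunc` receives bare arrays, not labeled objects. The recorder function `record_type` and the sample data are invented for illustration; this uses a NumPy-backed array, so no dask is needed.

```python
import numpy as np
import xarray as xr

seen = []


def record_type(arr):
    # Remember what kind of object apply_ufunc passed in.
    seen.append(type(arr))
    return arr * 2


da = xr.DataArray(np.arange(3), dims="x", coords={"x": [1, 2, 3]})
out = xr.apply_ufunc(record_type, da)
```

The labels are stripped before the call and reattached afterwards, so `out` is a `DataArray` again even though `record_type` never saw one.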

The challenge is that with dask's lazy evaluation, we don't know the structure of the returned objects until after evaluating the wrapped functions. So we can't rebuild xarray objects unless we require users to redundantly specify all the coordinates and attributes of the return values.

The alternative would be to make a parallel but eagerly evaluated version of apply_ufunc, e.g., by calling compute on each chunk and then reassembling the result (e.g., https://github.com/pydata/xarray/pull/2616). The downside is that this will load your data into memory, but maybe that's acceptable for your purposes.
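A rough sketch of that eager approach, under stated assumptions: the helper `eager_apply_by_chunk` is hypothetical and is not the implementation in the linked pull request. It slices along one dimension, loads each piece, applies a function that takes and returns xarray objects, and concatenates the results. It runs serially for simplicity; a parallel version would dispatch the per-piece calls to workers.

```python
import numpy as np
import xarray as xr


def eager_apply_by_chunk(func, obj, dim, size):
    """Apply `func` to consecutive slices of `obj` along `dim`, eagerly."""
    pieces = []
    for start in range(0, obj.sizes[dim], size):
        # .compute() loads this slice into memory (a no-op cost for NumPy data).
        piece = obj.isel({dim: slice(start, start + size)}).compute()
        pieces.append(func(piece))
    # The outputs are real xarray objects, so their structure is known here.
    return xr.concat(pieces, dim=dim)


da = xr.DataArray(
    np.ones((5, 2)),
    dims=("time", "x"),
    coords={"time": np.arange(5), "x": [0, 1]},
)

# The user function can use labels, here the "time" coordinate of each piece.
result = eager_apply_by_chunk(lambda block: block * block["time"], da, "time", 2)
```

Since every piece is evaluated before reassembly, nothing about the output has to be declared up front, at the cost of holding the computed results in memory.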

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  can the callables of apply_ufunc + dask get a typed/labeled array 420139027

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 14.896ms · About: xarray-datasette