issue_comments

4 rows where author_association = "MEMBER" and issue = 528701910 sorted by updated_at descending

id: 567077240 · user: dcherian (2448579) · created_at: 2019-12-18T15:21:19Z · updated_at: 2019-12-18T15:21:19Z · author_association: MEMBER
html_url: https://github.com/pydata/xarray/issues/3574#issuecomment-567077240 · issue_url: https://api.github.com/repos/pydata/xarray/issues/3574
node_id: MDEyOklzc3VlQ29tbWVudDU2NzA3NzI0MA==

Right, the xarray solution is to set meta = np.ndarray if vectorize is True else None when the user doesn't explicitly provide meta (see the sketch after this row). Or am I missing something?

reactions: {
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta (528701910)
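
A minimal sketch of the default described in the comment above (an editor's illustration under the comment's assumptions, not code from xarray; the helper name and arguments are hypothetical):

import numpy as np

def default_meta(user_meta=None, vectorize=False):
    # Respect an explicitly supplied meta; otherwise declare the output type
    # as np.ndarray when vectorize=True, so dask does not have to call the
    # np.vectorize-wrapped function on size-0 arrays to infer it.
    if user_meta is not None:
        return user_meta
    return np.ndarray if vectorize else None
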
id: 566640524 · user: dcherian (2448579) · created_at: 2019-12-17T16:29:35Z · updated_at: 2019-12-17T16:29:35Z · author_association: MEMBER
html_url: https://github.com/pydata/xarray/issues/3574#issuecomment-566640524 · issue_url: https://api.github.com/repos/pydata/xarray/issues/3574
node_id: MDEyOklzc3VlQ29tbWVudDU2NjY0MDUyNA==

meta should be passed to blockwise through _apply_blockwise, with default None (I think) and np.ndarray if vectorize is True. You'll have to pass the vectorize kwarg down to this level, I think (see the sketch after this row).

https://github.com/pydata/xarray/blob/6ad59b93f814b48053b1a9eea61d7c43517105cb/xarray/core/computation.py#L579-L593

reactions: {
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta (528701910)
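
A hedged sketch of what the comment above describes: threading a vectorize-dependent meta down to dask.array.blockwise (illustrative only; the function below is not xarray's actual _apply_blockwise, and its signature is invented for the example):

import numpy as np
import dask.array as da

def apply_blockwise_sketch(func, arr, vectorize=False, meta=None):
    # Default discussed above: None normally, np.ndarray when the applied
    # function is np.vectorize-wrapped (which cannot handle size-0 probes).
    if meta is None and vectorize:
        meta = np.ndarray
    if vectorize:
        func = np.vectorize(func)
    # dask.array.blockwise accepts a ``meta`` keyword; supplying it lets dask
    # skip the size-0 trial call it would otherwise make via compute_meta.
    return da.blockwise(func, "i", arr, "i", dtype=arr.dtype, meta=meta)

For example, apply_blockwise_sketch(lambda x: x + 1, da.arange(6, chunks=3), vectorize=True) builds the graph without ever calling the vectorized function on an empty array.
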
id: 565194778 · user: dcherian (2448579) · created_at: 2019-12-12T21:28:39Z · updated_at: 2019-12-12T21:28:39Z · author_association: MEMBER
html_url: https://github.com/pydata/xarray/issues/3574#issuecomment-565194778 · issue_url: https://api.github.com/repos/pydata/xarray/issues/3574
node_id: MDEyOklzc3VlQ29tbWVudDU2NTE5NDc3OA==

@shoyer's option 1 should be a relatively simple xarray PR if one of you is up for it.

reactions: {
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta (528701910)
id: 565107345 · user: shoyer (1217238) · created_at: 2019-12-12T17:33:43Z · updated_at: 2019-12-12T17:33:43Z · author_association: MEMBER
html_url: https://github.com/pydata/xarray/issues/3574#issuecomment-565107345 · issue_url: https://api.github.com/repos/pydata/xarray/issues/3574
node_id: MDEyOklzc3VlQ29tbWVudDU2NTEwNzM0NQ==

The problem is that Dask, as of version 2.0, calls functions applied to dask arrays with size-zero inputs in order to figure out the output array type: e.g., is the output a dense numpy.ndarray or a sparse array?

Unfortunately, numpy.vectorize doesn't know how large a size-0 array to make, because it doesn't have anything like the output_sizes argument.

For xarray, we have a couple of options:

1. We can safely assume that if the applied function is an np.vectorize, then it should pass meta=np.ndarray into the relevant dask functions (e.g., dask.array.blockwise). This should avoid the need to evaluate with size 0 arrays.
2. We could add an output_sizes argument to np.vectorize, either upstream in NumPy or in a wrapper in xarray.

(1) is probably easiest here. (See the small reproduction after this row.)

reactions: {
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta (528701910)
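
The size-0 behaviour shoyer describes can be reproduced with NumPy alone, which is why dask's compute_meta probe (it calls the applied function on size-0 inputs) fails for np.vectorize-wrapped functions unless meta is supplied explicitly. A small self-contained illustration (not taken from the issue):

import numpy as np

f = np.vectorize(lambda x: x + 1)
print(f(np.array([1.0, 2.0])))  # works: [2. 3.]

try:
    # np.vectorize infers its output dtype by calling the function on the
    # first element; a size-0 input has no first element, so it raises
    # unless ``otypes`` is given.
    f(np.empty((0,), dtype=float))
except ValueError as err:
    print(err)  # cannot call `vectorize` on size 0 inputs unless `otypes` is set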

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
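
For reference, the filtered view above corresponds to a query along these lines (a sketch; the SQL Datasette actually generates may differ in detail):

SELECT *
FROM [issue_comments]
WHERE [author_association] = 'MEMBER'
  AND [issue] = 528701910
ORDER BY [updated_at] DESC;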