home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

11 rows where issue = 457716471 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 3

  • max-sixty 5
  • shoyer 3
  • OriolAbril 3

author_association 2

  • MEMBER 8
  • CONTRIBUTOR 3

issue 1

  • apply_ufunc should preemptively broadcast · 11 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
503768609 https://github.com/pydata/xarray/issues/3032#issuecomment-503768609 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzc2ODYwOQ== OriolAbril 23738400 2019-06-19T22:23:21Z 2019-06-19T22:23:21Z CONTRIBUTOR

@max-sixty Not at all, whatever is best. I actually opened the issue without being 100% it was one.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471
503747413 https://github.com/pydata/xarray/issues/3032#issuecomment-503747413 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzc0NzQxMw== max-sixty 5635139 2019-06-19T21:09:18Z 2019-06-19T21:09:18Z MEMBER

@shoyer thanks for the clarity

@OriolAbril would you mind if we changed this issue to "apply_ufunc should preemptively broadcast"

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471
503746105 https://github.com/pydata/xarray/issues/3032#issuecomment-503746105 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzc0NjEwNQ== shoyer 1217238 2019-06-19T21:04:56Z 2019-06-19T21:04:56Z MEMBER

With NumPy arrays at least, there is no cost for broadcasting, because it can always be done with views. But even for other array types, inserting size 1 dimensions in the correct location should be basically free, and would be more helpful than what we currently do

On Wed, Jun 19, 2019 at 9:25 PM Oriol Abril notifications@github.com wrote:

I'm trying to think whether there would be any performance cost there - i.e. are there any arrays where preemptive broadcasting would be both expensive and unnecessary?

Even if there were a performance cost (compared to the actual behaviour), it could be easily avoided by using all dims as input_core_dims couldn't it? IIUC, all dims should be broadcasted unless they are in input core dims, so it broadcasting could still be avoided without problem.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/pydata/xarray/issues/3032?email_source=notifications&email_token=AAJJFVUDJUKKVSXB5DQONF3P3J2YXA5CNFSM4HZEHCZKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYCXX4Y#issuecomment-503675891, or mute the thread https://github.com/notifications/unsubscribe-auth/AAJJFVTEQBAYTYROYMQNGC3P3J2YXANCNFSM4HZEHCZA .

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471
503675891 https://github.com/pydata/xarray/issues/3032#issuecomment-503675891 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzY3NTg5MQ== OriolAbril 23738400 2019-06-19T18:25:14Z 2019-06-19T18:25:14Z CONTRIBUTOR

I'm trying to think whether there would be any performance cost there - i.e. are there any arrays where preemptive broadcasting would be both expensive and unnecessary?

Even if there were a performance cost (compared to the actual behaviour), it could be easily avoided by using all dims as input_core_dims couldn't it? IIUC, all dims should be broadcasted unless they are in input core dims, so it broadcasting could still be avoided without problem.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471
503652824 https://github.com/pydata/xarray/issues/3032#issuecomment-503652824 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzY1MjgyNA== max-sixty 5635139 2019-06-19T17:21:34Z 2019-06-19T17:21:34Z MEMBER

I'm trying to think whether there would be any performance cost there - i.e. are there any arrays where preemptive broadcasting would be both expensive and unnecessary?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471
503650578 https://github.com/pydata/xarray/issues/3032#issuecomment-503650578 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzY1MDU3OA== shoyer 1217238 2019-06-19T17:15:32Z 2019-06-19T17:15:32Z MEMBER

Yes, exactly.

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471
503625092 https://github.com/pydata/xarray/issues/3032#issuecomment-503625092 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzYyNTA5Mg== max-sixty 5635139 2019-06-19T16:04:58Z 2019-06-19T16:04:58Z MEMBER

We should probably make apply_ufunc() explicitly broadcast arrays first.

To confirm, so that we have something like this?

```python xr.apply_ufunc(func, a, c)

Out

(7, 3, 5, 6)

(7, 3, 5, 6)

```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471
503618955 https://github.com/pydata/xarray/issues/3032#issuecomment-503618955 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzYxODk1NQ== shoyer 1217238 2019-06-19T15:50:03Z 2019-06-19T15:50:03Z MEMBER

For what it's worth, I agree that this behavior is a little surprising. We should probably make apply_ufunc() explicitly broadcast arrays first.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471
503337637 https://github.com/pydata/xarray/issues/3032#issuecomment-503337637 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzMzNzYzNw== max-sixty 5635139 2019-06-18T22:37:42Z 2019-06-18T22:42:14Z MEMBER

Because func receives unlabelled arrays, apply_ufunc aligns the axes order. So if they share a dimension:

```python In [8]: import xarray as xr ...: import numpy as np ...: ...: a = xr.DataArray(data=np.random.normal(size=(7, 3)), dims=["dim1", "dim2"]) ...: c = xr.DataArray(data=np.random.normal(size=(7, 6)), dims=["dim1", "dim4"]) # <- change here ...: ...: def func(x,y): ...: print(x.shape) ...: print(y.shape) ...: return x ...:

In [9]: xr.apply_ufunc(func, a, c) (7, 3, 1) (7, 1, 6) ```

...otherwise func wouldn't know how to align.

Another option would be for your original example to put lengths of 1 in all axes, rather than only 'forward filling', e.g.

``` xr.apply_ufunc(func, a, c)

Out

(7, 3, 1, 1)

(1, 1, 5, 6) # <- change here

```

I think it operates without that step because functions 'in the wild' generally will handle that themselves, but that's a guess and needs someone who knows this better to weight in

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471
503335417 https://github.com/pydata/xarray/issues/3032#issuecomment-503335417 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzMzNTQxNw== OriolAbril 23738400 2019-06-18T22:28:20Z 2019-06-18T22:28:20Z CONTRIBUTOR

Then shouldn't a in the first example keep its original shape?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471
503334445 https://github.com/pydata/xarray/issues/3032#issuecomment-503334445 https://api.github.com/repos/pydata/xarray/issues/3032 MDEyOklzc3VlQ29tbWVudDUwMzMzNDQ0NQ== max-sixty 5635139 2019-06-18T22:24:16Z 2019-06-18T22:24:16Z MEMBER

Thanks for the issue & code sample @OriolAbril

IIUC, func needs to do this broadcasting itself; from the apply_ufunc docstring:

func : callable Function to call like ``func(*args, **kwargs)`` on unlabeled arrays (``.data``) that returns an array or tuple of arrays. If multiple arguments with non-matching dimensions are supplied, this function is expected to vectorize (broadcast) over axes of positional arguments in the style of NumPy universal functions [1]_ (if this is not the case, set ``vectorize=True``). If this function returns multiple outputs, you must set ``output_core_dims`` as well.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc should preemptively broadcast 457716471

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 13.768ms · About: xarray-datasette