home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

17 rows where author_association = "MEMBER", issue = 618081836 and user = 10194086 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 1

  • mathause · 17 ✖

issue 1

  • ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc` · 17 ✖

author_association 1

  • MEMBER · 17 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
675890055 https://github.com/pydata/xarray/pull/4060#issuecomment-675890055 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDY3NTg5MDA1NQ== mathause 10194086 2020-08-19T06:57:29Z 2020-08-19T06:57:29Z MEMBER

ok then - let's do this. Thanks a lot @kmuehlbauer

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
675356719 https://github.com/pydata/xarray/pull/4060#issuecomment-675356719 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDY3NTM1NjcxOQ== mathause 10194086 2020-08-18T09:02:58Z 2020-08-18T09:02:58Z MEMBER

Nice! Unless @dcherian has any additional comments I'll merge in a few days

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
669283654 https://github.com/pydata/xarray/pull/4060#issuecomment-669283654 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDY2OTI4MzY1NA== mathause 10194086 2020-08-05T16:06:59Z 2020-08-05T16:06:59Z MEMBER

gentle ping @shoyer @dcherian I think this is not too far away

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
660874936 https://github.com/pydata/xarray/pull/4060#issuecomment-660874936 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDY2MDg3NDkzNg== mathause 10194086 2020-07-20T08:11:37Z 2020-07-20T08:11:37Z MEMBER

I had another look and seems fine to me. So let's ping @shoyer & @dcherian for a review.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
651802870 https://github.com/pydata/xarray/pull/4060#issuecomment-651802870 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDY1MTgwMjg3MA== mathause 10194086 2020-06-30T13:49:05Z 2020-06-30T13:49:05Z MEMBER

Could you use signature.all_output_core_dims ?

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
651796020 https://github.com/pydata/xarray/pull/4060#issuecomment-651796020 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDY1MTc5NjAyMA== mathause 10194086 2020-06-30T13:37:19Z 2020-06-30T13:37:19Z MEMBER

You could go for something like this:

```python output_sizes_renamed = dict() for key, value in output_sizes.items():

if key not in signature.output_core_dims:
    raise ValueError("dimension name in 'output_sizes' must correspond to output_core_dims")
output_sizes_renamed[signature.dims_map[key]] = value

```

Instead of python output_sizes = { signature.dims_map[key]: value for key, value in output_sizes.items() }

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
651704371 https://github.com/pydata/xarray/pull/4060#issuecomment-651704371 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDY1MTcwNDM3MQ== mathause 10194086 2020-06-30T10:21:53Z 2020-06-30T10:21:53Z MEMBER

Ok, yes I am happy with erroring ourselves.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
651664318 https://github.com/pydata/xarray/pull/4060#issuecomment-651664318 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDY1MTY2NDMxOA== mathause 10194086 2020-06-30T09:06:40Z 2020-06-30T09:06:40Z MEMBER

Yes, I think this got lost in a resolved comment.

output_core_dims=[{"abc": 2}] looks very elegant. But it is passed to _UFuncSignature(input_core_dims, output_core_dims) so you would probably need to take it apart again in the code (or adapt _UFuncSignature to handle this case). Not sure how involved this is. Could potentially be left for a future PR?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
635273244 https://github.com/pydata/xarray/pull/4060#issuecomment-635273244 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDYzNTI3MzI0NA== mathause 10194086 2020-05-28T11:06:15Z 2020-05-28T11:22:18Z MEMBER

Could be "Avoid adjusting gufunc’s meta dtype twice" in https://docs.dask.org/en/latest/changelog.html#id84

Not sure if it is better to xfail or bump the version of dask (which could be done ipython ci/min_deps_check.py ci/requirements/py36-min-all-deps.yml)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
634587531 https://github.com/pydata/xarray/pull/4060#issuecomment-634587531 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDYzNDU4NzUzMQ== mathause 10194086 2020-05-27T11:03:12Z 2020-05-27T11:03:12Z MEMBER

No need to worry about the second one:

https://github.com/pydata/xarray/blob/e5cc19cd8f8a69e0743f230f5bf51b7778a0ff96/xarray/core/computation.py#L704

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
634572185 https://github.com/pydata/xarray/pull/4060#issuecomment-634572185 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDYzNDU3MjE4NQ== mathause 10194086 2020-05-27T10:28:26Z 2020-05-27T10:28:26Z MEMBER

output_dtypes: yes I think the docs should be updated

What I was trying to say - this is a change that can break existing code and it is not easy to do a proper deprecation cycle (i.e. only throw a warning). (I think allow_rechunk=True is not a good default.)


Ok an easier idea (could be brittle, though):

```python try: res = da.apply_gufunc(...)

catch error that was not raised in earlier versions of xr.apply_ufunc

except ValueError as e: if "with different chunksize present" in str(e): warnings.warn(str(e) + " This will throw an error in the future.") res = da.apply_gufunc(..., allow_rechunk=True) else: raise

``` However, I am only throwing around ideas - this is certainly not up to me. Let's see what the others think. Maybe they have an opinion or a better idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
631729847 https://github.com/pydata/xarray/pull/4060#issuecomment-631729847 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDYzMTcyOTg0Nw== mathause 10194086 2020-05-20T21:13:03Z 2020-05-20T21:13:03Z MEMBER

I don't see meta listed in the docs which is why it thought it's not needed. But if it is handled in dask.array.apply_gufunc it can of course stay.

I only realised the exact distinction between "allowed" and "parallelized" today - i.e. that "parallelized" is kind of the dask equivalent of np.vectorize. I can suggest something for the docstring (e.g. prefer "allowed" if ``func`` natively handles dask arrays or so)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
631700340 https://github.com/pydata/xarray/pull/4060#issuecomment-631700340 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDYzMTcwMDM0MA== mathause 10194086 2020-05-20T20:12:25Z 2020-05-20T20:12:25Z MEMBER

Maybe this is too defensive/surprising, and could be relaxed.

You would remove the daks="forbidden" branch and not the dask="parallelized"?

For the functions that don't handle dask arrays gracefully, dask="parallelized" would be the better option?


Very cool - good progress.

  • I guess you'll have to properly deprecate meta, something along the lines of: meta is no longer necessary and has no effect. it will be removed in a future version
  • I think it would be good to pass allow_rechunk.

I'll only be able to look at it properly next week.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
631353813 https://github.com/pydata/xarray/pull/4060#issuecomment-631353813 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDYzMTM1MzgxMw== mathause 10194086 2020-05-20T09:20:54Z 2020-05-20T09:20:54Z MEMBER

Should you keep the dask="allowed" branch? That might solve some of the issues (instead of using allow_rechunk=True) . For example dask has it's own einsum implementation (used in xr.dot) so it may not be necessary to pipe that through dask.array.apply_gufunc (from the docstring: this function is like np.vectorize, but for the blocks of dask arrays). However, I am only speculating.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
630846826 https://github.com/pydata/xarray/pull/4060#issuecomment-630846826 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDYzMDg0NjgyNg== mathause 10194086 2020-05-19T14:13:38Z 2020-05-19T14:13:38Z MEMBER

Thanks! That would be quite cool!

dask.array.apply_gufunc does some fancy stuff with np.vectorize (to determine the output_dtypes) so I would not vectorize the function ourselves. For the other I don't have a qualified opinion.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
629270796 https://github.com/pydata/xarray/pull/4060#issuecomment-629270796 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDYyOTI3MDc5Ng== mathause 10194086 2020-05-15T14:36:56Z 2020-05-15T14:36:56Z MEMBER

Ah yes I see (https://github.com/pydata/xarray/issues/1815#issuecomment-440089606). dask.array.apply_gufunc should also be able to handle one output only.

A complete refactor of apply_ufunc would be quite some challenge.

Indeed - I think it could simplify _apply_blockwise (and might make the meta keword obsolete) but it would be good if someone with more experience of dask could weigh in.

@dcherian @shoyer

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836
629223049 https://github.com/pydata/xarray/pull/4060#issuecomment-629223049 https://api.github.com/repos/pydata/xarray/issues/4060 MDEyOklzc3VlQ29tbWVudDYyOTIyMzA0OQ== mathause 10194086 2020-05-15T13:02:03Z 2020-05-15T13:02:03Z MEMBER
  • It might be good to add a test with a reduction and one with vectorize=True.
  • Would it be possible to replace the call to dask.array.blockwise (for one output variable) with dask.array.apply_gufunc? Do you know why blockwise is used further below and not dask.array.apply_gufunc? I assume it's due to historical reasons but I am not sure.
  • dask.array.apply_gufunc does all sorts of stuff - e.g. infer meta. This could potentially solve #4015 (pull #4022) and simplify the call signature of apply_ufunc?

https://github.com/dask/dask/blob/3573b2ddca81aeb41a7def6dd4194020f853ab18/dask/array/gufunc.py#L175

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  ENH: use `dask.array.apply_gufunc` in `xr.apply_ufunc`  618081836

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 74.389ms · About: xarray-datasette