home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

13 rows where author_association = "MEMBER" and issue = 205455788 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 4

  • max-sixty 6
  • shoyer 4
  • dcherian 2
  • benbovy 1

issue 1

  • Consistent naming for xarray's methods that apply functions · 13 ✖

author_association 1

  • MEMBER · 13 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1111431038 https://github.com/pydata/xarray/issues/1251#issuecomment-1111431038 https://api.github.com/repos/pydata/xarray/issues/1251 IC_kwDOAMm_X85CPxd- max-sixty 5635139 2022-04-27T20:06:24Z 2022-04-27T20:06:24Z MEMBER

FYI the apply / map change went in a couple of years ago.

We still don't have an apply_raw. I think it probably makes sense to consolidate this with https://github.com/pydata/xarray/issues/1618, so I'll close this, even though it has some good discussion.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
550046990 https://github.com/pydata/xarray/issues/1251#issuecomment-550046990 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDU1MDA0Njk5MA== dcherian 2448579 2019-11-05T22:14:49Z 2019-11-05T22:14:49Z MEMBER

@max-sixty I like your proposal!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
550037494 https://github.com/pydata/xarray/issues/1251#issuecomment-550037494 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDU1MDAzNzQ5NA== shoyer 1217238 2019-11-05T21:50:01Z 2019-11-05T21:50:01Z MEMBER

I more strongly think that apply is a confusing and non-standard term for "run this function on each item in this container", even if it's pandas' name.

This is a fair point. map is definitely the standard name in the context of "map reduce" type operations.

What are your thoughts re:

  • Adding map as the documented approach for run-on-each on Dataset & GroupBy
  • Adding apply_raw (or similar) as a new function that runs functions on the 'raw' arrays
  • Keeping apply around for backward-compat, similar to the drop case

I would support this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
550035303 https://github.com/pydata/xarray/issues/1251#issuecomment-550035303 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDU1MDAzNTMwMw== max-sixty 5635139 2019-11-05T21:44:44Z 2019-11-05T21:44:44Z MEMBER

OK. Does that inform your view on map vs apply?

I more strongly think that apply is a confusing and non-standard term for "run this function on each item in this container", even if it's pandas' name.

I'm keener to offer map an as option that necessarily reusing apply.

What are your thoughts re: - Adding map as the documented approach for run-on-each on Dataset & GroupBy - Adding apply_raw (or similar) as a new function that runs functions on the 'raw' arrays - Keeping apply around for backward-compat, similar to the drop case

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
549627167 https://github.com/pydata/xarray/issues/1251#issuecomment-549627167 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDU0OTYyNzE2Nw== shoyer 1217238 2019-11-05T01:48:30Z 2019-11-05T01:48:30Z MEMBER

+@dcherian

@max-sixty thanks for pushing this along!

I think I'm coming to appreciate backwards compatibility as an important consideration more and more these days. It's just really painful to reuse methods for something entirely different.

This makes me lean towards adding separate apply_raw() methods. The name is definitely less memorable than apply, but on the other hand it is also definitely easier to guess the difference between apply/apply_raw then apply/map.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
547592571 https://github.com/pydata/xarray/issues/1251#issuecomment-547592571 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDU0NzU5MjU3MQ== max-sixty 5635139 2019-10-29T19:32:14Z 2019-10-30T00:22:56Z MEMBER

I put the change for Dataset.apply -> Dataset.map in. Should we do the same for GroupBy?

I think those are probably the two easiest decisions to make (and hopefully will kick off moving this issue forwards)

Edit: the reason I hesitated for GroupBy is that it's not exactly the same: the object returned isn't the same (i.e. Dataset.map returns a Dataset, while GroupBy.map would return a Dataset)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
456195815 https://github.com/pydata/xarray/issues/1251#issuecomment-456195815 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDQ1NjE5NTgxNQ== shoyer 1217238 2019-01-21T20:52:07Z 2019-01-21T20:52:07Z MEMBER

I don't think we should consider ourselves beholden to pandas's bad names, but we should definitely try to preserve backwards compatibility and interpretability for users.

Going back to Python itself: - apply(func, args, kwargs) (from Python 2.x) is equivalent to func(*args, **kwargs) - map() maps a function over each element of an iterable - functools.reduce() applies a binary function repeatedly to convert an iterable into a single element

For xarray, we need: 1. a method for wrapping functions that work on unlabeled arrays 2. a method for mapping functions over each element of a Dataset or grouped object. 3. (possibly) a method for wrapping aggregation functions that act on unlabeled arrays

Currently, we call both (1) and (2) apply(), which is pretty confusing, and use reduce() for (3) even though it could potentially be a special case of (1) with a bit of extra magic and is quite unlike functools.reduce. In contrast, pandas calls both (1) and (2) apply() (using raw=True/raw=False to distinguish), and calls (3) aggregate or agg.

So long term, it could make sense to rename the current Dataset.apply()/GroupBy.apply() (case 2) to .map, and also rename .reduce() to the more generic .aggregate().

That said, I'm trying to imagine what the transition process for switching to new behavior for Dataset.apply looks like. We already will re-add dimensions to the output from calling functions in apply(), but at some point we have to a do a hard cut-off from passing DataArray objects to the function in apply to passing in a raw array.

I suppose we could do this by adding a raw keyword-only argument to .apply(): - If raw=False (current default), we would raise a warning about changing behavior and would pass-on DataArray objects to the applied function. Users would be encouraged to use .map() instead. - If raw=True (future default behavior), we would pass in raw numpy/dask arrays to the future function. - The dim argument might only be supported with raw=True.

We would end up with an extra extraneous raw argument, which we could remove/deprecate at our leisure.

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
455353612 https://github.com/pydata/xarray/issues/1251#issuecomment-455353612 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDQ1NTM1MzYxMg== max-sixty 5635139 2019-01-17T22:21:06Z 2019-01-17T22:21:06Z MEMBER

I would like to rename rolling.reduce to rolling.apply to be consistent with pandas & groupby

+0.5 if map fails

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
455350875 https://github.com/pydata/xarray/issues/1251#issuecomment-455350875 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDQ1NTM1MDg3NQ== dcherian 2448579 2019-01-17T22:10:44Z 2019-01-17T22:10:44Z MEMBER

One proposal: rename apply to map

-1 for pandas incompatibility.

I would like to rename rolling.reduce to rolling.apply to be consistent with pandas & groupby

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
455349535 https://github.com/pydata/xarray/issues/1251#issuecomment-455349535 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDQ1NTM0OTUzNQ== max-sixty 5635139 2019-01-17T22:06:05Z 2019-01-17T22:06:05Z MEMBER

One proposal: rename apply to map

Would we accept this? I'd be up for doing the PR to deprecate apply and introduce map. It makes a Dataset more consistent with a standard mapping interface. But it would be inconsistent with pandas and a rename of fairly widely used method

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
277942376 https://github.com/pydata/xarray/issues/1251#issuecomment-277942376 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDI3Nzk0MjM3Ng== benbovy 4160723 2017-02-07T09:16:34Z 2017-02-07T09:16:34Z MEMBER

I would be +1 for apply and apply_raw.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
277888535 https://github.com/pydata/xarray/issues/1251#issuecomment-277888535 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDI3Nzg4ODUzNQ== shoyer 1217238 2017-02-07T03:08:41Z 2017-02-07T03:08:41Z MEMBER

Another option is to keep apply as-is for Dataset and GroupBy objects, but add a separate apply_raw method for applying functions that act on "raw" arrays. This would be a little more similar to pandas' apply with raw=True.

We could even do the raw=True keyword argument like pandas, but this is a little awkward because there are some additional arguments on apply_raw that don't make sense on apply (e.g., arguments that specify that some dimensions should be dropped or added).

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788
277576392 https://github.com/pydata/xarray/issues/1251#issuecomment-277576392 https://api.github.com/repos/pydata/xarray/issues/1251 MDEyOklzc3VlQ29tbWVudDI3NzU3NjM5Mg== max-sixty 5635139 2017-02-06T03:01:06Z 2017-02-06T03:01:06Z MEMBER

Sounds good. It breaks the consistency with pandas' apply, but map is much more logical

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Consistent naming for xarray's methods that apply functions 205455788

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 15.813ms · About: xarray-datasette