home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

14 rows where author_association = "MEMBER" and issue = 594594646 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: reactions, created_at (date), updated_at (date)

user 5

  • shoyer 5
  • kmuehlbauer 3
  • keewis 3
  • TomNicholas 2
  • dcherian 1

issue 1

  • Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods · 14 ✖

author_association 1

  • MEMBER · 14 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
651317938 https://github.com/pydata/xarray/pull/3936#issuecomment-651317938 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDY1MTMxNzkzOA== dcherian 2448579 2020-06-29T19:36:28Z 2020-06-29T19:36:28Z MEMBER

Thanks @johnomotani . This is a significant contribution.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
650297994 https://github.com/pydata/xarray/pull/3936#issuecomment-650297994 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDY1MDI5Nzk5NA== keewis 14808389 2020-06-26T17:19:34Z 2020-06-26T17:19:34Z MEMBER

well, we'd need to somehow be able to use sel to index every variable in a Dataset differently. While that would probably be possible it would make sel way too complicated, so I agree that an error would be good here.

I think no further changes to the pint tests should be required (except refactoring, but I can do that in a new PR after this one has been merged).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
650221609 https://github.com/pydata/xarray/pull/3936#issuecomment-650221609 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDY1MDIyMTYwOQ== keewis 14808389 2020-06-26T14:51:51Z 2020-06-26T14:51:51Z MEMBER

I noticed a few issues while debugging: - calling on a array with missing values raises ValueError: All-NaN slice encountered (try ds = xr.tutorial.open_dataset("rasm"); ds.argmin(dim="y")) - ds.argmin(dim=...) returns a single index (the same result as ds.argmin()) but ds.Tair.argmin(dim=...) returns something different from ds.Tair.argmin(). Is that intentional?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
649204113 https://github.com/pydata/xarray/pull/3936#issuecomment-649204113 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDY0OTIwNDExMw== shoyer 1217238 2020-06-25T04:10:50Z 2020-06-25T04:10:50Z MEMBER

I'm going to wait a little while before merging in case anyone else has comment, but otherwise will merge in a day or two (definitely before the next release)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
617381475 https://github.com/pydata/xarray/pull/3936#issuecomment-617381475 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDYxNzM4MTQ3NQ== keewis 14808389 2020-04-21T19:57:27Z 2020-04-21T20:48:41Z MEMBER

you could mark the function("argmin") / function("argmax") with xfail (or skip?). We can decide on what to do with them when we close #3917

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
617002459 https://github.com/pydata/xarray/pull/3936#issuecomment-617002459 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDYxNzAwMjQ1OQ== shoyer 1217238 2020-04-21T07:23:11Z 2020-04-21T07:23:27Z MEMBER

I notice that some of your new deprecation warnings get issued when running your tests. Please silence these methods, e.g., using pytest.mark.filterwarnings.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
616998622 https://github.com/pydata/xarray/pull/3936#issuecomment-616998622 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDYxNjk5ODYyMg== shoyer 1217238 2020-04-21T07:14:10Z 2020-04-21T07:14:10Z MEMBER

I eventually found that the cause of the errors I was getting was that the argmin and argmax methods did not have an out argument. In order for the methods to be wrapped by numpy (and I guess dask is the same), the call signature of np.argmin() has to be supported, which means axis and out arguments are needed.

I guess this would happen if somebody calls np.argmin(xarray_object)? Does this happen from inside xarray somewhere, or in external code?

If it's the former, then we should just fix xarray not to do this. If it's only in external code, I would still consider breaking this behavior.

The override mechanism that NumPy uses in argmin/argmax is really old/fragile, and it's not something we ever intended to support. There's no way we could write an argmin/argmax method with an incompatible signature that would work with these functions.

The better way to do overrides in NumPy is probably with __array_function__ now, but I'm not sure we want the trouble...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
610516784 https://github.com/pydata/xarray/pull/3936#issuecomment-610516784 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDYxMDUxNjc4NA== shoyer 1217238 2020-04-07T17:23:24Z 2020-04-07T17:23:24Z MEMBER

Also, feel free to rewrite to avoid using inject_all_ops_and_reduce_methods() for argmin/argmax at all. Method injection is pretty hacky, and generally not worthwhile, e.g., see this note about removing it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
609607223 https://github.com/pydata/xarray/pull/3936#issuecomment-609607223 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDYwOTYwNzIyMw== kmuehlbauer 5821660 2020-04-06T07:13:00Z 2020-04-06T07:13:00Z MEMBER

@johnomotani FYI: For #3871 (merged) there is #3922 (yet unmerged) to fix dask-handling.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
609605888 https://github.com/pydata/xarray/pull/3936#issuecomment-609605888 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDYwOTYwNTg4OA== kmuehlbauer 5821660 2020-04-06T07:09:45Z 2020-04-06T07:09:45Z MEMBER

reduce over all given dimensions?

By this do you mean find the minimum as if the array were first (partially or totally) flattened along the given dims somehow? I'm not sure we provide that kind of behaviour anywhere in the current API.

@TomNicholas Probably I was bit confused by the current docstring. I think I understand now and there should be no problem at all.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
609598095 https://github.com/pydata/xarray/pull/3936#issuecomment-609598095 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDYwOTU5ODA5NQ== TomNicholas 35968931 2020-04-06T06:49:18Z 2020-04-06T06:49:18Z MEMBER

how should we discern (at least for argmin/argmax), if the user wants to...

That's a good question @kmuehlbauer, and the distinction probably needs to be clearer in the docs in general.

reduce over all given dimensions?

By this do you mean find the minimum as if the array were first (partially or totally) flattened along the given dims somehow? I'm not sure we provide that kind of behaviour anywhere in the current API.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
609591692 https://github.com/pydata/xarray/pull/3936#issuecomment-609591692 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDYwOTU5MTY5Mg== kmuehlbauer 5821660 2020-04-06T06:31:12Z 2020-04-06T06:31:22Z MEMBER

+1 for making argmin/argmax (and idxmin/idxmax) work, when given multiple dimensions.

According to the current docstring it should already work that way for argmin/'argmax`:

Reduce this DataArray’s data by applying argmin along some dimension(s).

Returns: New DataArray/Dataset object with argmin applied to it's data and the indicated dimension(s) removed

But this behaviour is broken currently (works only for one given dim).

My main concern for changing the API as suggested above is, how should we discern (at least for argmin/argmax), if the user want's to:

  1. reduce over all given dimensions? (current behaviour according to docstring)
  2. reduce along all given dimensions? (suggested behaviour by @johnomotani [please correct if I'm wrong])
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
609488771 https://github.com/pydata/xarray/pull/3936#issuecomment-609488771 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDYwOTQ4ODc3MQ== TomNicholas 35968931 2020-04-05T21:39:19Z 2020-04-05T21:39:19Z MEMBER

Another option would be to overload argmin

+1 for overloading argmin (and later idxmin). IMO we should never have one function for a 1D operation and one for an N-D operation if we can avoid it, everything should be N-dimensional.

I also really like how neat this resultant property is python da.isel(da.argmin(list_of_dim)) == da.min(list_of_dim) we could even use a hypothesis test to check it...

I think returning a dict of indices would be much more useful, but it does change existing behaviour (more useful because you can then do da.isel(da.argmin())).

Although it's breaking and would require a deprecation cycle, I think this is what we should aim for.

there's a not-very-helpful exception

Yes let's take the time to make that clearer for users - this will be a commonly-used function.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646
609468040 https://github.com/pydata/xarray/pull/3936#issuecomment-609468040 https://api.github.com/repos/pydata/xarray/issues/3936 MDEyOklzc3VlQ29tbWVudDYwOTQ2ODA0MA== shoyer 1217238 2020-04-05T19:19:41Z 2020-04-05T19:19:41Z MEMBER

This looks really comprehensive, thank you!

Before doing a really careful review here, I'd like to try to work out the full API design we want. I'll write out some of my thoughts here, but your thoughts would also be very welcome!

Here's my summary of the current situation: 1. We have two pairs of methods, argmin/argmax and the new idxmin/idxmax for calculating integer positions or coordinate labels of the minimum/maximum along one dimension. 2. We would like to have a way to do a similar calculation in multiple dimensions, which should return a dictionary of DataArray objects suitable for feeding into isel/sel.

This PR implements the multidimensional equivalent of argmin/argmax. It's fine to save the multidimensional equivalent of idxmin/idxmax for later, but we'll want to copy whatever design we pick here.

My first concern is about the name: it isn't obvious to me whether indices_min refers to integer positions or coordinate labels. I think a name based on argmin/argmax could help, e.g., perhaps argmin_nd or argmin_dict.

Another option would be to overload argmin/argmax to also support return a dictionary, depending on the type of the arguments. If you pass a list of dimensions over which to calculate the min or max locations, you would get a dictionary back: - array.argmin('x') would return a single DataArray - array.argmin(['x', 'y']) would return a dict of DataArray objects with keys {'x', 'y'} - array.argmin(['x']) would also return a dict {'x': ...}

I think I like this last option best but I would be curious what others think! @pydata/xarray any thoughts on this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multiple dimensions in DataArray.argmin() and DataArray.argmax() methods 594594646

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 637.061ms · About: xarray-datasette