github: issue_comments: 11 rows where issue = 224878728 sorted by updated

11 rows where issue = 224878728 sorted by updated_at descending

Search:

✖

descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	issue
656361865	https://github.com/pydata/xarray/issues/1388#issuecomment-656361865	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDY1NjM2MTg2NQ==	dcherian 2448579	2020-07-09T21:32:52Z	2020-07-09T21:32:52Z	MEMBER	I think this is fixed now thanks to @johnomotani	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728
571137508	https://github.com/pydata/xarray/issues/1388#issuecomment-571137508	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDU3MTEzNzUwOA==	stale[bot] 26384082	2020-01-06T13:26:42Z	2020-01-06T13:26:42Z	NONE	In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity If this issue remains relevant, please comment here or remove the `stale` label; otherwise it will be marked as closed automatically	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728
363072151	https://github.com/pydata/xarray/issues/1388#issuecomment-363072151	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDM2MzA3MjE1MQ==	gajomi 244887	2018-02-05T12:35:10Z	2018-02-05T12:35:10Z	CONTRIBUTOR	@fujiisoup and @shoyer Really enlightening comments above. I think I am starting to get the dao of xarray a bit better :) I was thinking whether such aggregation methods (including argmin) should propagate the coordinate. Agreed it would be nice to have a consistent and well reasoned rule for coordinate propagation in aggregation methods. I think a key point here, which gets brought up in your example is that it might make sense to have different subrules depending on the semantics of the operation. Functions like `argmax` are explicitly concerned with underlying "indices" (dimensions or otherwise) and so may call for different behavior from the `mean`, which explicitly is explicitly invariant under permutations of the underlying indices. The `max`/`min`/`median` functions are interesting case to think about, in that they are also invariant with under change of underlying indices, but can have potentially more than one index that they are associated with and do not "destroy" information about the value at those indices. My concern with adding an additional dimension is that it is always a little surprising and error-prone when we invent new dimension names not supplied by the user (for example, this can lead to conflicting names) Yeah, I felt a little dirty appending '_argmax'. I think my favorite option is (2) with da.argmin_indices() returning a Dataset, which will allow da[da.argmin_indices()] OK. I think I understand now why @fujiisoup proposed output a Dataset rather than an array. That's a natural syntax for getting the values from the indices. Either way, I would like a separate dedicated method for returning multiple indexing arrays. +1 to dedicated adding more methods if needed, since I think even if it isn;t needed the associated docs will need to make sure users are aware of the analogous `idx*` methods if they get added.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728
362951129	https://github.com/pydata/xarray/issues/1388#issuecomment-362951129	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDM2Mjk1MTEyOQ==	shoyer 1217238	2018-02-04T23:54:10Z	2018-02-04T23:54:10Z	MEMBER	I think it would be fine to add a special case to preserve coordinates corresponding to min/max values with xarray's `min()` and `max()` methods, but I don't feel strongly about this. The exact coordinates could be surprising if there are multiple min/max values. I agree that it does not make sense to preserve coordinates along aggregated dimensions for argmin/argmax, but we can preserve other coordinates. So I like @fujiisoup's example behavior above. I suppose we now have two candidate APIs for returning multiple indices from a method like argmin/argmax: 1. Add an additional dimension, e.g., `argmaxdim` for keeping track of multiple indices. 2. Return a dict or Dataset with multiple indices. I think my favorite option is (2) with `da.argmin_indices()` returning a Dataset, which will allow `da[da.argmin_indices()]` to work after we finish switching `__iter__` to only iterate over data variables (https://github.com/pydata/xarray/issues/884#issuecomment-338445322). One downside of this approach is that it is not obvious how we woudl define `argmin_indices()` to work on a Dataset, but that's probably OK given that you cannot use a Dataset (yet) for indexing. My concern with adding an additional dimension is that it is always a little surprising and error-prone when we invent new dimension names not supplied by the user (for example, this can lead to conflicting names). Also, consolidating indices will not work as well with `idxmin()`, which might put indices of different dtypes in the same array. Either way, I would like a separate dedicated method for returning multiple indexing arrays. It's convenient (and what users expect) for argmax to return a single array if taking the max only over one dimension. However, if we switch to add an `argmaxdim` or return a dict/Dataset for multiple dimensions, then we will end up with an annoying inconsistency between the 1D and N-D versions. It would be better to say `argmax(dim)` is only for one dimension (and raise an error if this is not true) and have the separate `argmax_indices(dims)` that is consistently defined for any number of dimensions.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728
362902669	https://github.com/pydata/xarray/issues/1388#issuecomment-362902669	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDM2MjkwMjY2OQ==	fujiisoup 6815844	2018-02-04T12:20:33Z	2018-02-04T12:52:29Z	MEMBER	@gajomi Sorry for my late response and thank you for the proposal. But aside from my previous proposal, I was thinking whether such aggregation methods (including `argmin`) should propagate the coordinate. For example, as you pointed out, in theory, we may be able to track `x`-coordinate at the argmin index after `da.argmin(dim='x')`. But it is not reasonable for `da.mean(dim='x')`. It may be reasonable for `da.max(dim='x')` but not for `da.median(dim='x')`. Such specific rules may be confusing and bring additional complexity. I think the rule we do not track coordinates after aggregations would be much simpler and easier to understand. If we adopt the above rule, I think the `argmin` would give just an array of indices, ```python In [1]: import xarray as xr ...: da = xr.DataArray([[0, 3, 2], [2, 1, 4]], dims=['x', 'y'], ...: coords={'x': [1, 2], 'y': ['a', 'b', 'c']}) ...: In [4]: da.argmin(dim='x') Out[4]: <xarray.DataArray (y: 3)> array([0, 1, 0]) Coordinates: * y (y) <U1 'a' 'b' 'c' In [3]: da.isel(x=da.argmin(dim='x')) Out[3]: <xarray.DataArray (y: 3)> array([0, 1, 2]) Coordinates: x (y) int64 1 2 1 * y (y) <U1 'a' 'b' 'c' ``` I think your logic would be useful even we do not track the coordinate. I would appreciate any feedback.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728
362717912	https://github.com/pydata/xarray/issues/1388#issuecomment-362717912	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDM2MjcxNzkxMg==	gajomi 244887	2018-02-02T21:50:05Z	2018-02-02T21:50:05Z	CONTRIBUTOR	I just came across the various argmax/idxmax (and related min) related issues recently in a project I have been working on. In addition to agreeing that docs should be updated when appropriate here are my two or three cents: As someone new to xarray I like the idea of having both argmax/argmin and argmax_indices/argmin_indices, with the former returning the coordinate indices and the latter the underlying numpy indices analogous to numpy.argmax/numpy.argmin methods. This makes migrating from numpy ndarrays data and collection of associated index arrays obvious (a common path into the xarray world I think). I can also get that idxmax/idxmin might make a better name given that one can have multi-indexed coordinates. If both argmax and idxmax methods are retained probably good to have docs cross reference. In any case, to respond to @fujiisoup's above proposal, I like the idea of retaining the dimension names in the output, and adding a dimension to hold argmax dims, but think it might make more sense to output a DataArray. By way of example, if I had something like: ```python size = (2,2,2,2) dims = list("wxyz") data = np.random.rand(size) coords = {dim:["{0}_{1}".format(dim,s) for s in range(s)] for dim,s in zip(dims,size)} da = xr.DataArray(data, dims=dims, coords=coords) da <xarray.DataArray (w: 2, x: 2, y: 2, z: 2)> array([[[[ 0.149945, 0.230338], [ 0.626969, 0.299918]], `[[ 0.351764, 0.286436], [ 0.130604, 0.982152]]], [[[ 0.262667, 0.950426], [ 0.76655 , 0.681631]], [[ 0.635468, 0.735071], [ 0.901116, 0.601303]]]])` Coordinates: w (w) <U3 'w_0' 'w_1' * x (x) <U3 'x_0' 'x_1' * y (y) <U3 'y_0' 'y_1' * z (z) <U3 'z_0' 'z_1' ``` I would like to get something like the following ```python argmax(da) <xarray.DataArray '_argmax' (argmaxdim: 4)> array(['w_0', 'x_1', 'y_1', 'z_1'], dtype='<U3') Coordinates: * argmaxdim (argmaxdim) <U1 'w' 'x' 'y' 'z' argmax(da, dim=list("wy")) <xarray.DataArray '_argmax' (x: 2, z: 2, argmaxdim: 2)> array([[['w_1', 'y_1'], ['w_1', 'y_0']], `[['w_1', 'y_1'], ['w_0', 'y_1']]], dtype=object)` Coordinates: * x (x) object 'x_0' 'x_1' * z (z) object 'z_0' 'z_1' * argmaxdim (argmaxdim) <U1 'w' 'y' `where the order of the dims in the unreduced and argmax cases are in the right order as above. For reference, just in case that these examples aren't enough to generalize, a horribly inefficient implementation of above (assuming unique maximum indices):`python def _argmaxstackeddim(dastacked, ind): keepdims = dastacked.indexes['keepdims'].names values = dastacked.keepdims.values[ind] coords = {keepdim:[val] for keepdim,val in zip(keepdims,values)} result = dastacked.sel(keepdims=values)\ .pipe(argmax)\ .expand_dims(keepdims)\ .assign_coords(**coords) return result def argmax(da, dim=None): daname = "" if da.name is None else da.name name = daname+"_argmax" if dim is None: maxda = da.where(da == da.max(),drop=True) dims = list(maxda.dims) dimmaxvals = [maxda.coords[dim].values[0] for dim in dims] result = xr.DataArray(dimmaxvals, dims='argmaxdim', coords={'argmaxdim':dims}, name = name) return result else: if isinstance(dim,str): dim = [dim] keepdims = [d for d in da.dims if d not in dim] dastacked = da.stack(keepdims = keepdims) slices = [_argmaxstackeddim(dastacked,i) for i in range(len(dastacked.keepdims))] return xr.merge(slices)[name] ```	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728
338397633	https://github.com/pydata/xarray/issues/1388#issuecomment-338397633	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDMzODM5NzYzMw==	fujiisoup 6815844	2017-10-21T13:51:04Z	2017-10-21T13:51:04Z	MEMBER	I am thinking again how `argmin` should work with our new vectorizing indexing #1639 . It would be great if `arr.isel(arr.argmin(dim)) == arr.min(dim)` could be satisfied even with a multi-dimensional array, although the behavior is different from `numpy.argmin`. (Maybe our current `min` should be replaced by `arr.isel(arr.argmin(dim))` so that it preserves the coordinates.) (We discussed the name for this new method in #1469 but here I just use `argmin` for the simplicity.) For example with a three dimensional array with `dims=['x', 'y', 'z']`, such as `arr = xr.DataArray(np.random.randn(4, 3, 2), dims=['x', 'y', 'z'])` I am thinking that... + `arr.argmin()` would return a `xr.Dataset` which contains 'x', 'y', 'z' as its `data_vars`. 1. `ds = arr.argmin(dims=None)` case: - `ds['x']`, `ds['y']`, `ds['z']` would be 0d-integers. 2. `ds = arr.argmin(dims=['x', 'y'])` case: - `ds['x']`, `ds['y']`, `ds['z']` would be 1d-integer-arrays. - The dimension of these three arrays would be 'z_argmin', where `ds['z_argmin'] == arr['z']`. 3. `ds = arr.argmin(dims='x')` case: - `ds['x']`, `ds['y']`, `ds['z']` would be 2d-integer-arrays. - The dimensions of these three arrays are 'y_argmin' and 'z_argmin', where `ds['y_argmin'] == arr['y']` and `ds['z_argmin'] == arr['z']`. The above proposal for ii (and iii) is not quite clean, as if it is used as an argument of `isel`, it appends a new coordinate 'z_argmin', which is just a duplicate of 'arr['z']', i.e. `arr.isel(**arr.argmin(dims=['x', 'y']))['z_argmin'] == arr['z']`. Any thoughts are welcome.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728
309280411	https://github.com/pydata/xarray/issues/1388#issuecomment-309280411	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDMwOTI4MDQxMQ==	fujiisoup 6815844	2017-06-18T14:18:40Z	2017-06-19T12:56:03Z	MEMBER	I'm working to fix this and I would like to make some design decisions; What should `max()` look like? I guess this method should work also for multi-dimensional data. To satisfy the `arr.isel_points(arr.argmin_indices(dim)) == arg.min(dim)` relation, the result array should have proper coordinates? Multiple `dim` arguments Currently, doc** says `argmin` accepts multiple axes, but `np.argmin` does not. Can we limit `argmin`'s arguments only `str` not sequence of `str`s? Edit: Multi-dimensional array to `isel_points` Currently, `isel_points` only accepts 1-dimensional array, while the result of `argmin_indexes` can be multi-dimensional, e.g. `python xr.DataArray(np.random.randn(4, 3, 2), dims=['x', 'y', 'z']).argmin_indexes(dims=['x'])` Do we need special treatment for this (maybe in `isel_points`) or just raise an Error (current behavior)?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728
298256260	https://github.com/pydata/xarray/issues/1388#issuecomment-298256260	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDI5ODI1NjI2MA==	shoyer 1217238	2017-04-30T20:50:49Z	2017-04-30T20:50:49Z	MEMBER	I agree that `arr[arr.argmin(dim)] == arr.min(dim)` is a useful invariant, but currently xarray's indexing works a little differently from NumPy. Probably `arr.isel_points(*arr.argmin(dim)) == arg.min(dim)` is the better invariant for now, and in the future `arr[arr.argmin(dim)]` will change to work consistently (#974). The main downside of returning a tuple or dict from `argmin()` is that it makes the common case of taking the max/min over one dimension a little harder. So possibly it would be better to write two methods*: `argmin` would work like it does currently (returning an xarray object), but error if reducing over multiple dimensions. `argmin_indices` would return a dict suitable for use in indexing.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728
298253809	https://github.com/pydata/xarray/issues/1388#issuecomment-298253809	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDI5ODI1MzgwOQ==	lamorton 23484003	2017-04-30T20:08:25Z	2017-04-30T20:08:25Z	NONE	Well, `xarray` at least agrees with `numpy`'s implementation of that function, but that's not to say it is 'correct.' It would be nice if `numpy.argmin` worked intuitively. That aside, it seems to me that applying `min()` to a `xr.DataArray` should return a reduced array with length 1 in each dimension; then you could just query this object and find the coordinate/dimension values. Perhaps then `argmin()` would just get a tuple of axis indices, such that `arr[*arr.argmin()] == arr.min()` would hold. The next question is, what happens if you start supplying coordinate/dimension optional arguments to `argmin`? It doesn't make sense to minimize over a coordinate, so only dimensions should be accepted. This should result in a tuple of lists, the way `numpy.where` does. Does that seem reasonable?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728
297845435	https://github.com/pydata/xarray/issues/1388#issuecomment-297845435	https://api.github.com/repos/pydata/xarray/issues/1388	MDEyOklzc3VlQ29tbWVudDI5Nzg0NTQzNQ==	shoyer 1217238	2017-04-27T21:33:06Z	2017-04-27T21:33:06Z	MEMBER	Agreed, the current implementation of `argmin()` only gives the correct result when given one dimension. It's not entirely obvious what `argmin()` should yield when done to multiple dimensions, certainly this is not very useful. I would prefer raising an error to this current behavior.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	argmin / argmax behavior doesn't match documentation 224878728

Advanced export

JSON shape: default, array, newline-delimited, object

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);