
issue_comments: 362717912


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/1388#issuecomment-362717912 https://api.github.com/repos/pydata/xarray/issues/1388 362717912 MDEyOklzc3VlQ29tbWVudDM2MjcxNzkxMg== 244887 2018-02-02T21:50:05Z 2018-02-02T21:50:05Z CONTRIBUTOR

I recently ran into the various argmax/idxmax (and corresponding argmin/idxmin) issues in a project I have been working on. Besides agreeing that the docs should be updated where appropriate, here are my two or three cents:

  • As someone new to xarray, I like the idea of having both argmax/argmin and argmax_indices/argmin_indices, with the former returning the coordinate indices and the latter the underlying numpy (positional) indices, analogous to numpy.argmax/numpy.argmin. This makes the migration from numpy ndarray data plus a collection of associated index arrays obvious (a common path into the xarray world, I think).
  • I can also see that idxmax/idxmin might be a better name, given that coordinates can be multi-indexed. If both argmax and idxmax methods are retained, it would probably be good to have the docs cross-reference each other.
  • In any case, to respond to @fujiisoup's proposal above, I like the idea of retaining the dimension names in the output and adding a dimension to hold the argmax dims, but I think it might make more sense to output a DataArray. By way of example, if I had something like:

```python
import numpy as np
import xarray as xr

size = (2,2,2,2)
dims = list("wxyz")
data = np.random.rand(*size)
# One label per position along each dimension, e.g. 'w_0', 'w_1'
coords = {dim:["{0}_{1}".format(dim,s) for s in range(s)] for dim,s in zip(dims,size)}
da = xr.DataArray(data, dims=dims, coords=coords)

>>> da
<xarray.DataArray (w: 2, x: 2, y: 2, z: 2)>
array([[[[ 0.149945,  0.230338],
         [ 0.626969,  0.299918]],

        [[ 0.351764,  0.286436],
         [ 0.130604,  0.982152]]],


       [[[ 0.262667,  0.950426],
         [ 0.76655 ,  0.681631]],

        [[ 0.635468,  0.735071],
         [ 0.901116,  0.601303]]]])
Coordinates:
  * w        (w) <U3 'w_0' 'w_1'
  * x        (x) <U3 'x_0' 'x_1'
  * y        (y) <U3 'y_0' 'y_1'
  * z        (z) <U3 'z_0' 'z_1'
```

I would like to get something like the following

```python
>>> argmax(da)
<xarray.DataArray '_argmax' (argmaxdim: 4)>
array(['w_0', 'x_1', 'y_1', 'z_1'],
      dtype='<U3')
Coordinates:
  * argmaxdim  (argmaxdim) <U1 'w' 'x' 'y' 'z'

>>> argmax(da, dim=list("wy"))
<xarray.DataArray '_argmax' (x: 2, z: 2, argmaxdim: 2)>
array([[['w_1', 'y_1'],
        ['w_1', 'y_0']],

       [['w_1', 'y_1'],
        ['w_0', 'y_1']]], dtype=object)
Coordinates:
  * x          (x) object 'x_0' 'x_1'
  * z          (z) object 'z_0' 'z_1'
  * argmaxdim  (argmaxdim) <U1 'w' 'y'
```

where the unreduced dims and the argmax dims each appear in the order shown above. For reference, in case these examples aren't enough to generalize from, here is a horribly inefficient implementation of the above (assuming unique maximum indices):

```python
def _argmaxstackeddim(dastacked, ind):
    # Select one element of the stacked "keepdims" index, take the argmax of
    # what remains, then restore the kept dimensions as length-1 dims.
    keepdims = dastacked.indexes['keepdims'].names
    values = dastacked.keepdims.values[ind]
    coords = {keepdim:[val] for keepdim,val in zip(keepdims,values)}
    result = dastacked.sel(keepdims=values)\
                      .pipe(argmax)\
                      .expand_dims(keepdims)\
                      .assign_coords(**coords)
    return result

def argmax(da, dim=None):
    daname = "" if da.name is None else da.name
    name = daname+"_argmax"
    if dim is None:
        # Reduce over all dims: keep only the maximum and read off its labels.
        maxda = da.where(da == da.max(),drop=True)
        dims = list(maxda.dims)
        dimmaxvals = [maxda.coords[dim].values[0] for dim in dims]
        result = xr.DataArray(dimmaxvals, dims='argmaxdim', coords={'argmaxdim':dims}, name = name)
        return result
    else:
        if isinstance(dim,str):
            dim = [dim]
        # Stack the dims that are not reduced, then solve one slice at a time.
        keepdims = [d for d in da.dims if d not in dim]
        dastacked = da.stack(keepdims = keepdims)
        slices = [_argmaxstackeddim(dastacked,i) for i in range(len(dastacked.keepdims))]
        return xr.merge(slices)[name]
```
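To make the distinction between positional (numpy) indices and coordinate labels concrete, here is a minimal numpy-only sketch of the same reduction that avoids the per-slice loop. It is not part of xarray and not the implementation above: the helper names `argmax_positions` and `positions_to_labels` are hypothetical, the array values are copied from the example output so the result can be compared with the desired output shown earlier, and ties resolve to the first maximum, as in numpy.argmax.

```python
import numpy as np

# Values copied from the example DataArray above; axis order is (w, x, y, z).
data = np.array([[[[0.149945, 0.230338], [0.626969, 0.299918]],
                  [[0.351764, 0.286436], [0.130604, 0.982152]]],
                 [[[0.262667, 0.950426], [0.76655, 0.681631]],
                  [[0.635468, 0.735071], [0.901116, 0.601303]]]])
dims = list("wxyz")
labels = {d: ["{0}_{1}".format(d, i) for i in range(n)]
          for d, n in zip(dims, data.shape)}


def argmax_positions(data, dims, reduce_dims):
    """Hypothetical helper: positional argmax over several axes at once.

    Returns (kept_dims, positions). positions has shape
    kept_shape + (len(reduce_dims),) and holds integer indices.
    """
    reduce_axes = [dims.index(d) for d in reduce_dims]
    kept_dims = [d for d in dims if d not in reduce_dims]
    kept_axes = [dims.index(d) for d in kept_dims]
    # Put kept axes first and reduced axes last, then flatten the reduced ones.
    moved = np.transpose(data, kept_axes + reduce_axes)
    kept_shape = moved.shape[:len(kept_axes)]
    reduce_shape = moved.shape[len(kept_axes):]
    flat = moved.reshape(kept_shape + (-1,))
    # argmax over the flattened axis; ties resolve to the first occurrence.
    flat_idx = flat.argmax(axis=-1)
    # Convert flat positions back into one index per reduced dimension.
    unravelled = np.unravel_index(flat_idx, reduce_shape)
    return kept_dims, np.stack(unravelled, axis=-1)


def positions_to_labels(positions, reduce_dims, labels):
    """Hypothetical helper: map integer positions to coordinate labels."""
    columns = [np.asarray(labels[d])[positions[..., k]]
               for k, d in enumerate(reduce_dims)]
    return np.stack(columns, axis=-1)


# Reduce over every dimension: one position (and one label) per dimension.
_, pos_all = argmax_positions(data, dims, dims)
print(pos_all.tolist())
print(positions_to_labels(pos_all, dims, labels).tolist())

# Reduce over w and y only: the kept dimensions x and z remain in the result.
kept, pos_wy = argmax_positions(data, dims, ["w", "y"])
print(kept)
print(positions_to_labels(pos_wy, ["w", "y"], labels).tolist())
```

The integer array from `argmax_positions` corresponds to what an argmax_indices-style method would return, and the label array to the coordinate-based variant; wrapping either in a DataArray with an extra `argmaxdim` dimension would give the output shape proposed above.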

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  224878728