html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3810#issuecomment-973623524,https://api.github.com/repos/pydata/xarray/issues/3810,973623524,IC_kwDOAMm_X846CFDk,25071375,2021-11-19T01:00:11Z,2021-11-19T15:09:10Z,CONTRIBUTOR,"Is it possible to add the option of modifying what happens when there is a tie in the rank? (If you want I can create a separate issue for this)

I think this can be done using the scipy `rankdata` function instead of the bottleneck rank (though I think adding a `method` option to the bottleneck package is also possible).

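To make the tie handling concrete, here is a minimal sketch of how the `method` argument of `scipy.stats.rankdata` treats a pair of tied values. The input values are made up for illustration, and the ranks are cast to float only so that the printed output looks the same for every method:

```python
from scipy.stats import rankdata

values = [10, 20, 20, 30]  # the two 20s are tied

# Rank the same data once per tie-breaking method.
ranks = {
    method: rankdata(values, method=method).astype(float).tolist()
    for method in ('average', 'min', 'max', 'dense', 'ordinal')
}

for method, result in ranks.items():
    print(method, result)
# average [1.0, 2.5, 2.5, 4.0]
# min [1.0, 2.0, 2.0, 4.0]
# max [1.0, 3.0, 3.0, 4.0]
# dense [1.0, 2.0, 2.0, 3.0]
# ordinal [1.0, 2.0, 3.0, 4.0]
```
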
Small example:
```py
import dask.array
import xarray
from scipy.stats import rankdata

arr = xarray.DataArray(
    dask.array.random.random((11, 10), chunks=(3, 2)),
    coords={'a': list(range(11)), 'b': list(range(10))}
)

def rank(x: xarray.DataArray, dim: str, method: str):
    # This option generates fewer tasks; I don't know why

    axis = x.dims.index(dim)
    return xarray.DataArray(
        dask.array.apply_along_axis(
            rankdata,
            axis,
            x.data,
            dtype=float,
            shape=(x.sizes[dim], ),
            method=method
        ),
        coords=x.coords,
        dims=x.dims
    )


def rank2(x: xarray.DataArray, dim: str, method: str):
    from scipy.stats import rankdata
    
    axis = x.dims.index(dim)
    return xarray.apply_ufunc(
        rankdata,
        x.chunk({dim: x.sizes[dim]}),
        dask='parallelized',
        kwargs={'method': method, 'axis': axis},
        meta=x.data._meta
    )

arr_rank1 = rank(arr, 'a', 'ordinal')
arr_rank2 = rank2(arr, 'a', 'ordinal')

assert arr_rank1.equals(arr_rank2)
```

```py
import numpy as np
from scipy.stats import rankdata

# This can probably work for ranking arrays with NaN values
def _nanrankdata1(a, method):
    y = np.empty(a.shape, dtype=np.float64)
    y.fill(np.nan)
    idx = ~np.isnan(a)
    y[idx] = rankdata(a[idx], method=method)
    return y

```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592738965,https://api.github.com/repos/pydata/xarray/issues/3810,592738965,MDEyOklzc3VlQ29tbWVudDU5MjczODk2NQ==,5635139,2020-02-28T21:33:35Z,2020-02-28T21:33:35Z,MEMBER,"Yeah, unfortunately I'm fairly confident about this; have a go with moderately large arrays for `sum` and you'll quickly see the performance cliff ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592737661,https://api.github.com/repos/pydata/xarray/issues/3810,592737661,MDEyOklzc3VlQ29tbWVudDU5MjczNzY2MQ==,7441788,2020-02-28T21:29:58Z,2020-02-28T21:31:31Z,CONTRIBUTOR,"Note that with the `apply_ufunc` implementation we're only reshaping `dims`-sized `ndarray`s, not (necessarily) the whole DataArray, so maybe it's not too bad? It might be better to first sort `dims` to be in the same order as `self.dims`. i.e. `dims = [dim_ for dim_ in self.dims if dim_ in dims]`. But I'm just speculating.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592721162,https://api.github.com/repos/pydata/xarray/issues/3810,592721162,MDEyOklzc3VlQ29tbWVudDU5MjcyMTE2Mg==,5635139,2020-02-28T20:47:33Z,2020-02-28T20:47:33Z,MEMBER,"Great -- that's cool and a good implementation of `apply_ufunc`. As above, we wouldn't want to replace `rank` with that given the reshaping (we'd need a function that computes over multiple dimensions)

We could use something similar for groupbys though?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592715925,https://api.github.com/repos/pydata/xarray/issues/3810,592715925,MDEyOklzc3VlQ29tbWVudDU5MjcxNTkyNQ==,7441788,2020-02-28T20:33:43Z,2020-02-28T20:35:57Z,CONTRIBUTOR,"A few minor tweaks needed:
```
In [20]: import bottleneck        

In [21]: xr.apply_ufunc( 
    ...:     lambda x: bottleneck.rankdata(x).reshape(x.shape), 
    ...:     d, 
    ...:     input_core_dims=[['xyz', 'abc']], 
    ...:     output_core_dims=[['xyz', 'abc']], 
    ...:     vectorize=True 
    ...: ).transpose(*d.dims)
Out[21]: 
<xarray.DataArray (abc: 4, xyz: 3)>
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.],
       [10., 11., 12.]])
Dimensions without coordinates: abc, xyz
```

Despite what the docs say, `bottleneck.{nan}rankdata(a)` returns a 1-dimensional ndarray, not an array with the same shape as `a`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592708353,https://api.github.com/repos/pydata/xarray/issues/3810,592708353,MDEyOklzc3VlQ29tbWVudDU5MjcwODM1Mw==,5635139,2020-02-28T20:13:51Z,2020-02-28T20:13:51Z,MEMBER,Could you try running that?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592672463,https://api.github.com/repos/pydata/xarray/issues/3810,592672463,MDEyOklzc3VlQ29tbWVudDU5MjY3MjQ2Mw==,7441788,2020-02-28T18:51:18Z,2020-02-28T18:52:29Z,CONTRIBUTOR,"What's wrong with the following? (Still need to deal with `pct` and `keep_attrs`.)
 ````
apply_ufunc(
    bottleneck.{nan}rankdata,
    self,
    input_core_dims=[dims],
    output_core_dims=[dims],
    vectorize=True
)
 ````

Per https://kwgoodman.github.io/bottleneck-doc/reference.html#bottleneck.rankdata, ""The default (axis=None) is to rank the elements of the flattened array.""","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592665711,https://api.github.com/repos/pydata/xarray/issues/3810,592665711,MDEyOklzc3VlQ29tbWVudDU5MjY2NTcxMQ==,5635139,2020-02-28T18:34:44Z,2020-02-28T18:34:44Z,MEMBER,"Yes, we can always reshape as a way of running numerical operations over multiple dimensions. But reshaping can be an expensive operation, so doing it as part of a numerical operation can cause surprises. (if you're interested, try running a sum over multiple dimensions and comparing to a reshape + a sum over the single reshaped dimension). 

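A rough sketch of that comparison in plain NumPy, assuming a non-contiguous input so that the reshape has to copy the data. The array shape and repeat count are arbitrary, the function names are only illustrative, and the timings will vary by machine:

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
# Swapping the first two axes gives a non-contiguous view whose last two
# axes cannot be merged without copying.
a = rng.random((200, 40, 300)).transpose(1, 0, 2)


def sum_multi(x):
    # Reduce over both trailing axes directly.
    return x.sum(axis=(1, 2))


def sum_reshaped(x):
    # Flatten the trailing axes first, then reduce over the single new axis.
    return x.reshape(x.shape[0], -1).sum(axis=1)


# The reshape does not share memory with the input, i.e. it made a copy.
copied = not np.may_share_memory(a, a.reshape(a.shape[0], -1))

same_result = np.allclose(sum_multi(a), sum_reshaped(a))
t_multi = timeit.timeit(lambda: sum_multi(a), number=5)
t_reshaped = timeit.timeit(lambda: sum_reshaped(a), number=5)
print(f'reshape copied: {copied}, same result: {same_result}')
print(f'multi-axis sum: {t_multi:.4f}s, reshape + sum: {t_reshaped:.4f}s')
```
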
Instead, users can do this themselves, giving them context and control.

Reshaping is OK to do in `groupby` though (I think), so adding `rank` to groupby would be one way of accomplishing this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592654794,https://api.github.com/repos/pydata/xarray/issues/3810,592654794,MDEyOklzc3VlQ29tbWVudDU5MjY1NDc5NA==,7441788,2020-02-28T18:06:57Z,2020-02-28T18:06:57Z,CONTRIBUTOR,"Assuming `dims` is a non-empty list of dimensions, the following code seems to work:
```
    temp_dim = '__temp_dim__'
    return da.stack(**{temp_dim: dims}).\
        rank(temp_dim, pct=pct, keep_attrs=keep_attrs).\
        unstack(temp_dim).transpose(*da.dims).\
        drop_vars([dim_ for dim_ in dims if dim_ not in da.coords])
```
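
For comparison, the same stack / rank / unstack idea on plain NumPy arrays might look like the sketch below. It assumes `scipy.stats.rankdata` (with its `axis` argument) in place of bottleneck, and `rank_over_axes` is a hypothetical helper name, not an xarray or bottleneck function:

```python
import numpy as np
from scipy.stats import rankdata


def rank_over_axes(a, axes, method='average'):
    # Hypothetical helper: rank jointly over several axes of an ndarray.
    a = np.asarray(a)
    axes = tuple(ax % a.ndim for ax in axes)
    n_keep = a.ndim - len(axes)
    dest = tuple(range(n_keep, a.ndim))
    # 'Stack': move the ranked axes to the end and flatten them into one.
    moved = np.moveaxis(a, axes, dest)
    flat = moved.reshape(moved.shape[:n_keep] + (-1,))
    # Rank along the flattened axis, then 'unstack' to the original layout.
    ranked = rankdata(flat, method=method, axis=-1).reshape(moved.shape)
    return np.moveaxis(ranked, dest, axes)


a = np.arange(12.0).reshape(4, 3)
ranked = rank_over_axes(a, axes=(0, 1))
print(ranked)
```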
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592645335,https://api.github.com/repos/pydata/xarray/issues/3810,592645335,MDEyOklzc3VlQ29tbWVudDU5MjY0NTMzNQ==,5635139,2020-02-28T17:43:05Z,2020-02-28T17:43:05Z,MEMBER,"This would be great. The underlying numerical library we use, bottleneck, [doesn't support multiple dimensions](https://kwgoodman.github.io/bottleneck-doc/reference.html#bottleneck.rankdata). If there were another option, or someone wanted to write one in numbagg, that would be a welcome addition.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480