issue_comments


10 rows where issue = 572875480 sorted by updated_at descending


user 3

  • max-sixty 5
  • seth-p 4
  • josephnowak 1

author_association 2

  • CONTRIBUTOR 5
  • MEMBER 5

issue 1

  • {DataArray,Dataset}.rank() should support an optional list of dimensions · 10
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
973623524 https://github.com/pydata/xarray/issues/3810#issuecomment-973623524 https://api.github.com/repos/pydata/xarray/issues/3810 IC_kwDOAMm_X846CFDk josephnowak 25071375 2021-11-19T01:00:11Z 2021-11-19T15:09:10Z CONTRIBUTOR

Is it possible to add an option for controlling what happens when there is a tie in the rank? (If you want, I can create a separate issue for this.)

I think this can be done using SciPy's rankdata function instead of bottleneck's rank (though adding a method option on top of the bottleneck package should also be possible).
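
For reference, a quick illustration of how SciPy's tie-handling methods differ on a small array (a sketch, independent of the xarray code below):

```py
from scipy.stats import rankdata

x = [1, 2, 2, 3]
for method in ('average', 'min', 'max', 'dense', 'ordinal'):
    # e.g. 'average' gives [1, 2.5, 2.5, 4], 'dense' gives [1, 2, 2, 3],
    # and 'ordinal' breaks the tie by position, giving [1, 2, 3, 4].
    print(method, rankdata(x, method=method))
```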

Small example:

```py
import dask.array
import xarray
from scipy.stats import rankdata

arr = xarray.DataArray(
    dask.array.random.random((11, 10), chunks=(3, 2)),
    coords={'a': list(range(11)), 'b': list(range(10))}
)

def rank(x: xarray.DataArray, dim: str, method: str):
    # This option generates fewer tasks, I don't know why
    axis = x.dims.index(dim)
    return xarray.DataArray(
        dask.array.apply_along_axis(
            rankdata,
            axis,
            x.data,
            dtype=float,
            shape=(x.sizes[dim], ),
            method=method
        ),
        coords=x.coords,
        dims=x.dims
    )

def rank2(x: xarray.DataArray, dim: str, method: str):
    axis = x.dims.index(dim)
    return xarray.apply_ufunc(
        rankdata,
        x.chunk({dim: x.sizes[dim]}),
        dask='parallelized',
        kwargs={'method': method, 'axis': axis},
        meta=x.data._meta
    )

arr_rank1 = rank(arr, 'a', 'ordinal')
arr_rank2 = rank2(arr, 'a', 'ordinal')

assert arr_rank1.equals(arr_rank2)
```

Probably this can work for ranking arrays with NaN values:

```py
import numpy as np
from scipy.stats import rankdata

def _nanrankdata1(a, method):
    # Rank only the non-NaN entries; NaN positions stay NaN in the output.
    y = np.empty(a.shape, dtype=np.float64)
    y.fill(np.nan)
    idx = ~np.isnan(a)
    y[idx] = rankdata(a[idx], method=method)
    return y
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592738965 https://github.com/pydata/xarray/issues/3810#issuecomment-592738965 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjczODk2NQ== max-sixty 5635139 2020-02-28T21:33:35Z 2020-02-28T21:33:35Z MEMBER

Yeah, unfortunately I'm fairly confident about this; have a go with moderately large arrays for sum and you'll quickly see the performance cliff.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592737661 https://github.com/pydata/xarray/issues/3810#issuecomment-592737661 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjczNzY2MQ== seth-p 7441788 2020-02-28T21:29:58Z 2020-02-28T21:31:31Z CONTRIBUTOR

Note that with the apply_ufunc implementation we're only reshaping dims-sized ndarrays, not (necessarily) the whole DataArray, so maybe it's not too bad? It might be better to first sort dims to be in the same order as self.dims, i.e. `dims = [dim_ for dim_ in self.dims if dim_ in dims]`. But I'm just speculating.
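
A minimal sketch of that reordering idea combined with the apply_ufunc approach from this thread (the helper name rank_over is illustrative, not an existing xarray function):

```py
import bottleneck
import xarray as xr

def rank_over(da: xr.DataArray, dims):
    # Reorder the requested dims to match the array's own dimension order
    # before passing them as core dimensions.
    ordered = [d for d in da.dims if d in dims]
    return xr.apply_ufunc(
        lambda x: bottleneck.rankdata(x).reshape(x.shape),
        da,
        input_core_dims=[ordered],
        output_core_dims=[ordered],
        vectorize=True,
    ).transpose(*da.dims)
```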

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592721162 https://github.com/pydata/xarray/issues/3810#issuecomment-592721162 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjcyMTE2Mg== max-sixty 5635139 2020-02-28T20:47:33Z 2020-02-28T20:47:33Z MEMBER

Great -- that's cool and a good implementation of apply_ufunc. As above, we wouldn't want to replace rank with that given the reshaping (we'd need a function that computes over multiple dimensions).

We could use something similar for groupbys though?
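
For reference, a rough sketch of what rank-within-groupby can look like today with GroupBy.map (the 'label' coordinate and the data here are made up for illustration):

```py
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.random.rand(6),
    dims='time',
    coords={'time': list(range(6)),
            'label': ('time', ['a', 'a', 'b', 'b', 'b', 'a'])},
)

# Rank along 'time' separately within each group of 'label'.
ranked = da.groupby('label').map(lambda g: g.rank('time'))
```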

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592715925 https://github.com/pydata/xarray/issues/3810#issuecomment-592715925 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjcxNTkyNQ== seth-p 7441788 2020-02-28T20:33:43Z 2020-02-28T20:35:57Z CONTRIBUTOR

A few minor tweaks needed:

```
In [20]: import bottleneck

In [21]: xr.apply_ufunc(
    ...:     lambda x: bottleneck.rankdata(x).reshape(x.shape),
    ...:     d,
    ...:     input_core_dims=[['xyz', 'abc']],
    ...:     output_core_dims=[['xyz', 'abc']],
    ...:     vectorize=True
    ...: ).transpose(*d.dims)
Out[21]:
<xarray.DataArray (abc: 4, xyz: 3)>
array([[ 1.,  2.,  3.],
       [ 4.,  5.,  6.],
       [ 7.,  8.,  9.],
       [10., 11., 12.]])
Dimensions without coordinates: abc, xyz
```

Despite what the docs say, `bottleneck.{nan}rankdata(a)` returns a 1-dimensional ndarray, not an array with the same shape as `a`.
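
A quick check of that behaviour (a sketch; the shapes are the point):

```py
import bottleneck
import numpy as np

a = np.array([[3.0, 1.0], [2.0, 4.0]])

flat = bottleneck.rankdata(a)    # per the observation above: 1-D ranks of the flattened array
ranks = flat.reshape(a.shape)    # restore the original 2-D shape, as in the lambda above

print(flat.shape, ranks.shape)   # (4,) (2, 2)
```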

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592708353 https://github.com/pydata/xarray/issues/3810#issuecomment-592708353 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjcwODM1Mw== max-sixty 5635139 2020-02-28T20:13:51Z 2020-02-28T20:13:51Z MEMBER

Could you try running that?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592672463 https://github.com/pydata/xarray/issues/3810#issuecomment-592672463 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjY3MjQ2Mw== seth-p 7441788 2020-02-28T18:51:18Z 2020-02-28T18:52:29Z CONTRIBUTOR

What's wrong with the following? (Still need to deal with pct and keep_attrs.)

```
apply_ufunc(
    bottleneck.{nan}rankdata,
    self,
    input_core_dims=[dims],
    output_core_dims=[dims],
    vectorize=True
)
```

Per https://kwgoodman.github.io/bottleneck-doc/reference.html#bottleneck.rankdata, "The default (axis=None) is to rank the elements of the flattened array."

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592665711 https://github.com/pydata/xarray/issues/3810#issuecomment-592665711 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjY2NTcxMQ== max-sixty 5635139 2020-02-28T18:34:44Z 2020-02-28T18:34:44Z MEMBER

Yes, we can always reshape as a way of running numerical operations over multiple dimensions. But reshaping can be an expensive operation, so doing it as part of a numerical operation can cause surprises. (If you're interested, try running a sum over multiple dimensions and compare it to a reshape plus a sum over the single reshaped dimension; see the sketch below.)

Instead, users can do this themselves, giving them context and control.

Reshaping is OK to do in groupby though (I think), so adding rank to groupby would be one way of accomplishing this.
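
A minimal sketch of the sum comparison suggested above (sizes and timings are illustrative only):

```py
import numpy as np
import xarray as xr
from timeit import timeit

da = xr.DataArray(np.random.rand(200, 200, 200), dims=('x', 'y', 'z'))

# Sum over two dimensions directly vs. stacking them into one dimension
# (a reshape) and summing over the single stacked dimension.
direct = timeit(lambda: da.sum(['x', 'y']), number=20)
stacked = timeit(lambda: da.stack(xy=['x', 'y']).sum('xy'), number=20)

print(f"direct: {direct:.3f}s  stack+sum: {stacked:.3f}s")
```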

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592654794 https://github.com/pydata/xarray/issues/3810#issuecomment-592654794 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjY1NDc5NA== seth-p 7441788 2020-02-28T18:06:57Z 2020-02-28T18:06:57Z CONTRIBUTOR

Assuming dims is a non-empty list of dimensions, the following code seems to work:

```
temp_dim = '__temp_dim__'
return da.stack(**{temp_dim: dims}).\
    rank(temp_dim, pct=pct, keep_attrs=keep_attrs).\
    unstack(temp_dim).transpose(*da.dims).\
    drop_vars([dim_ for dim_ in dims if dim_ not in da.coords])
```
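
Wrapped up as a standalone helper with a small usage example (the name rank_multi is illustrative; this just packages the snippet above):

```py
import numpy as np
import xarray as xr

def rank_multi(da: xr.DataArray, dims, pct=False, keep_attrs=False):
    # Stack the requested dims into one temporary dimension, rank along it,
    # then unstack and restore the original dimension order.
    temp_dim = '__temp_dim__'
    return (
        da.stack(**{temp_dim: dims})
        .rank(temp_dim, pct=pct, keep_attrs=keep_attrs)
        .unstack(temp_dim)
        .transpose(*da.dims)
        .drop_vars([d for d in dims if d not in da.coords])
    )

da = xr.DataArray(np.random.rand(3, 4), dims=('x', 'y'))
ranked = rank_multi(da, ['x', 'y'])  # joint rank over both dimensions
```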

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480
592645335 https://github.com/pydata/xarray/issues/3810#issuecomment-592645335 https://api.github.com/repos/pydata/xarray/issues/3810 MDEyOklzc3VlQ29tbWVudDU5MjY0NTMzNQ== max-sixty 5635139 2020-02-28T17:43:05Z 2020-02-28T17:43:05Z MEMBER

This would be great. The underlying numerical library we use, bottleneck, doesn't support multiple dimensions. If there were another option, or someone wanted to write one in numbagg, that would be a welcome addition.
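
For context, a plain NumPy/SciPy sketch of the semantics such a function would need (it still reshapes internally, which is exactly the cost discussed above, so it is illustrative rather than a candidate implementation):

```py
import numpy as np
from scipy.stats import rankdata

def rank_over_axes(a: np.ndarray, axes) -> np.ndarray:
    # Move the requested axes to the end, rank their values jointly,
    # then restore the original axis order.
    dest = list(range(-len(axes), 0))
    moved = np.moveaxis(a, axes, dest)
    flat = moved.reshape(moved.shape[:-len(axes)] + (-1,))
    ranks = rankdata(flat, axis=-1).reshape(moved.shape)
    return np.moveaxis(ranks, dest, axes)

a = np.random.rand(3, 4, 5)
r = rank_over_axes(a, (1, 2))  # rank jointly over the last two axes
```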

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  {DataArray,Dataset}.rank() should support an optional list of dimensions 572875480


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);