

issue_comments


10 rows where author_association = "CONTRIBUTOR" and issue = 374025325 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
944328081 https://github.com/pydata/xarray/issues/2511#issuecomment-944328081 https://api.github.com/repos/pydata/xarray/issues/2511 IC_kwDOAMm_X844SU2R bzah 16700639 2021-10-15T14:03:21Z 2021-10-15T14:03:21Z CONTRIBUTOR

I'll drop a PR; it might be easier to try and play with this than with a piece of code lost in an issue.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
931430066 https://github.com/pydata/xarray/issues/2511#issuecomment-931430066 https://api.github.com/repos/pydata/xarray/issues/2511 IC_kwDOAMm_X843hH6y bzah 16700639 2021-09-30T15:30:02Z 2021-10-06T09:48:19Z CONTRIBUTOR

Okay, I could redo my test. If I manually call `compute()` before doing `isel(...)`, my whole computation takes about 5.65 seconds. However, if I try it with my naive patch, it takes 32.34 seconds.

I'm sorry I cannot share my code as is; the relevant portion is buried in the middle of many other things. I'll try to put together a minimal version of it to share with you.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
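
For context, here is a minimal, self-contained sketch of the compute-before-`isel()` pattern described in the comment above; the array shape and dimension names are illustrative and not taken from the original code.

```python
import numpy as np
import xarray as xr

# Small chunked DataArray standing in for the real data (shapes are illustrative).
data = xr.DataArray(
    np.random.rand(12, 100, 100),
    dims=("time", "x", "y"),
).chunk(dict(time=-1, x=50, y=50))

idx = data.argmax("time")      # lazy, dask-backed integer indexer
idx = idx.compute()            # materialise it first ...
result = data.isel(time=idx)   # ... so isel() receives a NumPy-backed indexer
print(result.dims)             # ('x', 'y')
```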
930153816 https://github.com/pydata/xarray/issues/2511#issuecomment-930153816 https://api.github.com/repos/pydata/xarray/issues/2511 IC_kwDOAMm_X843cQVY bzah 16700639 2021-09-29T13:02:15Z 2021-10-06T09:46:10Z CONTRIBUTOR

@pl-marasco OK, that's strange. I should have saved my use case :/ I will try to reproduce it and provide a gist of it soon.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
932229595 https://github.com/pydata/xarray/issues/2511#issuecomment-932229595 https://api.github.com/repos/pydata/xarray/issues/2511 IC_kwDOAMm_X843kLHb bzah 16700639 2021-10-01T13:29:32Z 2021-10-01T13:29:32Z CONTRIBUTOR

@pl-marasco Thanks for the example! With it I get the same result as you: it takes the same time with the patch as with `compute()`.

However, I could construct an example giving very different results. It is quite close to my original code:

```python
import time

import numpy as np
import pandas as pd
import xarray as xr

time_start = time.perf_counter()
COORDS = dict(
    time=pd.date_range("2042-01-01", periods=200, freq=pd.DateOffset(days=1)),
)
da = xr.DataArray(
    np.random.rand(200 * 3500 * 350).reshape((200, 3500, 350)),
    dims=('time', 'x', 'y'),
    coords=COORDS,
).chunk(dict(time=-1, x=100, y=100))

resampled = da.resample(time="MS")

for label, sample in resampled:
    # sample = sample.compute()
    idx = sample.argmax('time')
    sample.isel(time=idx)

time_elapsed = time.perf_counter() - time_start
print(time_elapsed, " secs")
```

(Basically I want, for each month, the first event occurring in it.)

Without the patch and with `sample = sample.compute()` uncommented, it takes 5.7 seconds. With the patch, it takes 53.9 seconds.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
922942743 https://github.com/pydata/xarray/issues/2511#issuecomment-922942743 https://api.github.com/repos/pydata/xarray/issues/2511 IC_kwDOAMm_X843Av0X bzah 16700639 2021-09-20T13:45:56Z 2021-09-20T13:45:56Z CONTRIBUTOR

I wrote a very naive fix; it works but seems to perform really slowly, and I would appreciate some feedback (I'm a beginner with Dask). Basically, I added `k = dask.array.asarray(k, dtype=np.int64)` to do the exact same thing as with NumPy. I can create a PR if that is easier to review.

The patch:

```python
class VectorizedIndexer(ExplicitIndexer):
    """Tuple for vectorized indexing.

    All elements should be slice or N-dimensional np.ndarray objects with an
    integer dtype and the same number of dimensions. Indexing follows proposed
    rules for np.ndarray.vindex, which matches NumPy's advanced indexing rules
    (including broadcasting) except sliced axes are always moved to the end:
    https://github.com/numpy/numpy/pull/6256
    """

    __slots__ = ()

    def __init__(self, key):
        if not isinstance(key, tuple):
            raise TypeError(f"key must be a tuple: {key!r}")

        new_key = []
        ndim = None
        for k in key:
            if isinstance(k, slice):
                k = as_integer_slice(k)
            elif isinstance(k, np.ndarray) or isinstance(k, dask.array.Array):
                if not np.issubdtype(k.dtype, np.integer):
                    raise TypeError(
                        f"invalid indexer array, does not have integer dtype: {k!r}"
                    )
                if ndim is None:
                    ndim = k.ndim
                elif ndim != k.ndim:
                    ndims = [k.ndim for k in key if isinstance(k, np.ndarray)]
                    raise ValueError(
                        "invalid indexer key: ndarray arguments "
                        f"have different numbers of dimensions: {ndims}"
                    )
                if isinstance(k, dask.array.Array):
                    k = dask.array.asarray(k, dtype=np.int64)
                else:
                    k = np.asarray(k, dtype=np.int64)
            else:
                raise TypeError(
                    f"unexpected indexer type for {type(self).__name__}: {k!r}"
                )
            new_key.append(k)

        super().__init__(new_key)
```

{
    "total_count": 2,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 2,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
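
As a standalone illustration of the core change in the patch above: the indexer is cast to `int64` lazily with `dask.array.asarray` when it is a dask array, and eagerly with `np.asarray` otherwise. This is a minimal sketch only; the `coerce_indexer` helper is illustrative and not part of xarray.

```python
import numpy as np
import dask.array

def coerce_indexer(k):
    # Mirrors the branch added in the patch above (illustrative helper, not xarray API).
    if isinstance(k, dask.array.Array):
        return dask.array.asarray(k, dtype=np.int64)   # stays a lazy dask array
    return np.asarray(k, dtype=np.int64)               # materialised NumPy array

lazy = dask.array.from_array(np.array([2, 0, 1]), chunks=2)
print(type(coerce_indexer(lazy)).__name__)        # Array   (still lazy)
print(type(coerce_indexer([2, 0, 1])).__name__)   # ndarray
```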
563330352 https://github.com/pydata/xarray/issues/2511#issuecomment-563330352 https://api.github.com/repos/pydata/xarray/issues/2511 MDEyOklzc3VlQ29tbWVudDU2MzMzMDM1Mg== rafa-guedes 7799184 2019-12-09T16:53:38Z 2019-12-09T16:53:38Z CONTRIBUTOR

I'm having a similar issue; here is an example:

```python
import numpy as np
import dask.array as da
import xarray as xr

darr = xr.DataArray(data=[0.2, 0.4, 0.6], coords={"z": range(3)}, dims=("z",))
good_indexer = xr.DataArray(
    data=np.random.randint(0, 3, 8).reshape(4, 2).astype(int),
    coords={"y": range(4), "x": range(2)},
    dims=("y", "x"),
)
bad_indexer = xr.DataArray(
    data=da.random.randint(0, 3, 8).reshape(4, 2).astype(int),
    coords={"y": range(4), "x": range(2)},
    dims=("y", "x"),
)

In [5]: darr
Out[5]:
<xarray.DataArray (z: 3)>
array([0.2, 0.4, 0.6])
Coordinates:
  * z        (z) int64 0 1 2

In [6]: good_indexer
Out[6]:
<xarray.DataArray (y: 4, x: 2)>
array([[0, 1],
       [2, 2],
       [1, 2],
       [1, 0]])
Coordinates:
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1

In [7]: bad_indexer
Out[7]:
<xarray.DataArray 'reshape-417766b2035dcb1227ddde8505297039' (y: 4, x: 2)>
dask.array<reshape, shape=(4, 2), dtype=int64, chunksize=(4, 2), chunktype=numpy.ndarray>
Coordinates:
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1

In [8]: darr[good_indexer]
Out[8]:
<xarray.DataArray (y: 4, x: 2)>
array([[0.2, 0.4],
       [0.6, 0.6],
       [0.4, 0.6],
       [0.4, 0.2]])
Coordinates:
    z        (y, x) int64 0 1 2 2 1 2 1 0
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1

In [9]: darr[bad_indexer]

TypeError                                 Traceback (most recent call last)
<ipython-input-8-2a57c1a2eade> in <module>
----> 1 darr[bad_indexer]

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/dataarray.py in __getitem__(self, key)
--> 640             return self.isel(indexers=self._item_key_to_dict(key))

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/dataarray.py in isel(self, indexers, drop, **indexers_kwargs)
-> 1014         ds = self._to_temp_dataset().isel(drop=drop, indexers=indexers)

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/dataset.py in isel(self, indexers, drop, **indexers_kwargs)
-> 1922                     name, var, self.indexes[name], var_indexers

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/indexes.py in isel_variable_and_index(name, variable, index, indexers)
---> 81     new_variable = variable.isel(indexers)

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/variable.py in isel(self, indexers, **indexers_kwargs)
-> 1054         return self[key]

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/variable.py in __getitem__(self, key)
--> 702         dims, indexer, new_order = self._broadcast_indexes(key)

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/variable.py in _broadcast_indexes(self, key)
--> 559                     return self._broadcast_indexes_vectorized(key)

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/variable.py in _broadcast_indexes_vectorized(self, key)
--> 687         return out_dims, VectorizedIndexer(tuple(out_key)), new_order

~/.virtualenvs/py3/local/lib/python3.7/site-packages/xarray/core/indexing.py in __init__(self, key)
--> 449                     f"unexpected indexer type for {type(self).__name__}: {k!r}"

TypeError: unexpected indexer type for VectorizedIndexer: dask.array<reshape, shape=(4, 2), dtype=int64, chunksize=(4, 2), chunktype=numpy.ndarray>

In [10]: xr.__version__
Out[10]: '0.14.1'

In [11]: import dask; dask.__version__
Out[11]: '2.9.0'
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
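
A small sketch of a workaround for the example above, assuming the indexer can be materialised: detect that the indexer is dask-backed and compute it before indexing (variable names are illustrative, not from the original comment).

```python
import dask.array as da
import numpy as np
import xarray as xr

darr = xr.DataArray([0.2, 0.4, 0.6], coords={"z": range(3)}, dims=("z",))
lazy_indexer = xr.DataArray(
    da.random.randint(0, 3, size=(4, 2)),   # dask-backed integer indexer
    coords={"y": range(4), "x": range(2)},
    dims=("y", "x"),
)

# A dask-backed DataArray reports its chunks; compute it before indexing.
if lazy_indexer.chunks is not None:
    lazy_indexer = lazy_indexer.compute()

print(darr[lazy_indexer].dims)   # ('y', 'x'), same as with an in-memory indexer
```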
525634152 https://github.com/pydata/xarray/issues/2511#issuecomment-525634152 https://api.github.com/repos/pydata/xarray/issues/2511 MDEyOklzc3VlQ29tbWVudDUyNTYzNDE1Mg== ulijh 13190237 2019-08-28T08:12:13Z 2019-08-28T08:12:13Z CONTRIBUTOR

I think the problem is somewhere here:

https://github.com/pydata/xarray/blob/aaeea6250b89e3605ee1d1a160ad50d6ed657c7e/xarray/core/utils.py#L85-L103

I don't think pandas.Index can hold lazy arrays. Could there be a way around this by exploiting dask.dataframe's indexing methods?

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
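
To illustrate the point about pandas.Index above: building an Index goes through an `np.asarray`-style conversion, which forces the dask graph to be computed, so the values cannot stay lazy. A small sketch, illustrative only:

```python
import numpy as np
import pandas as pd
import dask.array as da

lazy = da.arange(6, chunks=3)     # a lazy dask array

# Converting to NumPy triggers computation; the result is an in-memory ndarray.
materialised = np.asarray(lazy)
print(type(materialised).__name__)   # ndarray

# pandas indexes wrap in-memory arrays, so the values are no longer lazy here.
idx = pd.Index(materialised)
print(idx)                           # an ordinary Index of the computed values
```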
522986699 https://github.com/pydata/xarray/issues/2511#issuecomment-522986699 https://api.github.com/repos/pydata/xarray/issues/2511 MDEyOklzc3VlQ29tbWVudDUyMjk4NjY5OQ== ulijh 13190237 2019-08-20T12:15:18Z 2019-08-20T18:52:49Z CONTRIBUTOR

Even though the example from above does work, sadly, the following does not:

```python
import xarray as xr
import dask.array as da
import numpy as np

da = xr.DataArray(np.random.rand(3 * 4 * 5).reshape((3, 4, 5))).chunk(dict(dim_0=1))
idcs = da.argmax('dim_2')
da[dict(dim_2=idcs)]
```

results in

```python
TypeError                                 Traceback (most recent call last)
<ipython-input-4-3542cdd6d61c> in <module>
----> 1 da[dict(dim_2=idcs)]

~/src/xarray/xarray/core/dataarray.py in __getitem__(self, key)
--> 606             return self.isel(indexers=self._item_key_to_dict(key))

~/src/xarray/xarray/core/dataarray.py in isel(self, indexers, drop, **indexers_kwargs)
--> 988         ds = self._to_temp_dataset().isel(drop=drop, indexers=indexers)

~/src/xarray/xarray/core/dataset.py in isel(self, indexers, drop, **indexers_kwargs)
-> 1903                 new_var = var.isel(indexers=var_indexers)

~/src/xarray/xarray/core/variable.py in isel(self, indexers, **indexers_kwargs)
--> 986         return self[tuple(key)]

~/src/xarray/xarray/core/variable.py in __getitem__(self, key)
--> 677         dims, indexer, new_order = self._broadcast_indexes(key)

~/src/xarray/xarray/core/variable.py in _broadcast_indexes(self, key)
--> 534                 return self._broadcast_indexes_vectorized(key)

~/src/xarray/xarray/core/variable.py in _broadcast_indexes_vectorized(self, key)
--> 662         return out_dims, VectorizedIndexer(tuple(out_key)), new_order

~/src/xarray/xarray/core/indexing.py in __init__(self, key)
--> 462                     type(self).__name__, k

TypeError: unexpected indexer type for VectorizedIndexer: dask.array<arg_agg-aggregate, shape=(3, 4), dtype=int64, chunksize=(1, 4)>
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
498178025 https://github.com/pydata/xarray/issues/2511#issuecomment-498178025 https://api.github.com/repos/pydata/xarray/issues/2511 MDEyOklzc3VlQ29tbWVudDQ5ODE3ODAyNQ== ulijh 13190237 2019-06-03T09:13:49Z 2019-06-03T09:13:49Z CONTRIBUTOR

As of version 0.12 indexing with dask arrays works out of the box... I think this can be closed now.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
433304954 https://github.com/pydata/xarray/issues/2511#issuecomment-433304954 https://api.github.com/repos/pydata/xarray/issues/2511 MDEyOklzc3VlQ29tbWVudDQzMzMwNDk1NA== ulijh 13190237 2018-10-26T06:48:54Z 2018-10-26T06:48:54Z CONTRIBUTOR

It seems to work fine with the following change, but it has a lot of duplicated code...

```diff
diff --git a/xarray/core/indexing.py b/xarray/core/indexing.py
index d51da471..9fe93581 100644
--- a/xarray/core/indexing.py
+++ b/xarray/core/indexing.py
@@ -7,6 +7,7 @@ from datetime import timedelta
 
 import numpy as np
 import pandas as pd
+import dask.array as da
 
 from . import duck_array_ops, nputils, utils
 from .pycompat import (
@@ -420,6 +421,19 @@ class VectorizedIndexer(ExplicitIndexer):
                                      'have different numbers of dimensions: {}'
                                      .format(ndims))
                 k = np.asarray(k, dtype=np.int64)
+            elif isinstance(k, dask_array_type):
+                if not np.issubdtype(k.dtype, np.integer):
+                    raise TypeError('invalid indexer array, does not have '
+                                    'integer dtype: {!r}'.format(k))
+                if ndim is None:
+                    ndim = k.ndim
+                elif ndim != k.ndim:
+                    ndims = [k.ndim for k in key
+                             if isinstance(k, (np.ndarray,) + dask_array_type)]
+                    raise ValueError('invalid indexer key: ndarray arguments '
+                                     'have different numbers of dimensions: {}'
+                                     .format(ndims))
+                k = da.array(k, dtype=np.int64)
             else:
                 raise TypeError('unexpected indexer type for {}: {!r}'
                                 .format(type(self).__name__, k))
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);