html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/5873#issuecomment-1472135607,https://api.github.com/repos/pydata/xarray/issues/5873,1472135607,IC_kwDOAMm_X85XvwG3,16700639,2023-03-16T14:54:51Z,2023-03-16T14:54:51Z,CONTRIBUTOR,"Hey, thanks @dcherian for taking over and merging this PR! (And sorry for not being active on it myself for the past year...)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1028755077
https://github.com/pydata/xarray/pull/6059#issuecomment-1033607850,https://api.github.com/repos/pydata/xarray/issues/6059,1033607850,IC_kwDOAMm_X849m5qq,16700639,2022-02-09T10:30:40Z,2022-02-09T10:40:08Z,CONTRIBUTOR,"@mathause This PR goes beyond what is currently implemented in numpy. For now, all weighted quantile PRs on numpy are more or less based on the ""linear"" method (method 7) and none have been merged. I plan to work on integrating weights with the other interpolation methods but don't have the time right now. I'll probably pick up some ideas from here.

As for the _numerics_ here, everything looks good to me. The only limitations I can see are:

- This only handles sampling weights, which is fine I guess.
- Some interpolation methods are missing; they can be added later.
- ~A `nan_weighted_quantile` could also be interesting to add~","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1076265104
https://github.com/pydata/xarray/pull/6059#issuecomment-1021975396,https://api.github.com/repos/pydata/xarray/issues/6059,1021975396,IC_kwDOAMm_X8486htk,16700639,2022-01-26T08:32:56Z,2022-02-07T16:57:56Z,CONTRIBUTOR,"FYI, the weighted quantiles topic will be discussed in today's numpy triage meeting (17:00 UTC).
I'm not a maintainer but I'm sure you are welcome to join in if you are interested. Meeting information: https://hackmd.io/68i_JvOYQfy9ERiHgXMPvg","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1076265104
https://github.com/pydata/xarray/pull/6068#issuecomment-991632287,https://api.github.com/repos/pydata/xarray/issues/6068,991632287,IC_kwDOAMm_X847Gxuf,16700639,2021-12-11T12:49:15Z,2021-12-11T12:49:15Z,CONTRIBUTOR,"The `list` type is only used by this handler. Its usage is never documented and it's not even part of the [dask `rechunk` API](https://docs.dask.org/en/stable/generated/dask.array.rechunk.html#dask.array.rechunk) either. It would make sense to deprecate it, I guess.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1077040836
https://github.com/pydata/xarray/pull/6068#issuecomment-991611195,https://api.github.com/repos/pydata/xarray/issues/6068,991611195,IC_kwDOAMm_X847Gsk7,16700639,2021-12-11T11:39:43Z,2021-12-11T11:39:57Z,CONTRIBUTOR,"I also noticed the tuple types `Tuple[int, ...]` and `Tuple[Tuple[int, ...], ...]` are accepted and handled at the DataArray level but not at the Dataset level. The handler is:

```
if isinstance(chunks, (tuple, list)):
    chunks = dict(zip(self.dims, chunks))
```

Would it make sense to move this into Dataset to have the same API for both?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1077040836
https://github.com/pydata/xarray/issues/2511#issuecomment-944328081,https://api.github.com/repos/pydata/xarray/issues/2511,944328081,IC_kwDOAMm_X844SU2R,16700639,2021-10-15T14:03:21Z,2021-10-15T14:03:21Z,CONTRIBUTOR,"I'll drop a PR; it might be easier to try and play with this than with a piece of code lost in an issue.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,374025325
https://github.com/pydata/xarray/issues/2511#issuecomment-931430066,https://api.github.com/repos/pydata/xarray/issues/2511,931430066,IC_kwDOAMm_X843hH6y,16700639,2021-09-30T15:30:02Z,2021-10-06T09:48:19Z,CONTRIBUTOR,"Okay, I could redo my test. If I manually call `compute()` before doing `isel(......)`, my whole computation takes about **5.65 seconds**. However, if I try with my naive patch it takes **32.34 seconds**. I'm sorry I cannot share my code as is; the relevant portion is really in the middle of many things. I'll try to get a minimalist version of it to share with you.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,374025325
https://github.com/pydata/xarray/issues/2511#issuecomment-930153816,https://api.github.com/repos/pydata/xarray/issues/2511,930153816,IC_kwDOAMm_X843cQVY,16700639,2021-09-29T13:02:15Z,2021-10-06T09:46:10Z,CONTRIBUTOR,"@pl-marasco Ok, that's strange.
I should have saved my use case :/ I will try to reproduce it and will provide a gist of it soon.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,374025325
https://github.com/pydata/xarray/issues/2511#issuecomment-932229595,https://api.github.com/repos/pydata/xarray/issues/2511,932229595,IC_kwDOAMm_X843kLHb,16700639,2021-10-01T13:29:32Z,2021-10-01T13:29:32Z,CONTRIBUTOR,"@pl-marasco Thanks for the example! With it I get the same result as you: it takes the same time with the patch as with compute. However, I could construct an example giving very different results. It is quite close to my original code:

```
time_start = time.perf_counter()

COORDS = dict(
    time=pd.date_range(""2042-01-01"", periods=200, freq=pd.DateOffset(days=1)),
)

da = xr.DataArray(
    np.random.rand(200 * 3500 * 350).reshape((200, 3500, 350)),
    dims=('time', 'x', 'y'),
    coords=COORDS,
).chunk(dict(time=-1, x=100, y=100))

resampled = da.resample(time=""MS"")

for label, sample in resampled:
    # sample = sample.compute()
    idx = sample.argmax('time')
    sample.isel(time=idx)

time_elapsed = time.perf_counter() - time_start
print(time_elapsed, "" secs"")
```

(Basically I want, for each month, the first event occurring in it.) Without the patch and uncommenting `sample = sample.compute()`, it takes 5.7 secs. With the patch it takes 53.9 seconds.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,374025325
https://github.com/pydata/xarray/issues/2511#issuecomment-922942743,https://api.github.com/repos/pydata/xarray/issues/2511,922942743,IC_kwDOAMm_X843Av0X,16700639,2021-09-20T13:45:56Z,2021-09-20T13:45:56Z,CONTRIBUTOR,"I wrote a very naive fix; it works but seems to perform **really** slowly. I would appreciate some feedback (I'm a beginner with Dask).
Basically, I added `k = dask.array.asarray(k, dtype=np.int64)` to do the exact same thing as with numpy. _I can create a PR if it's better to review this_

The patch:

```
class VectorizedIndexer(ExplicitIndexer):
    """"""Tuple for vectorized indexing.

    All elements should be slice or N-dimensional np.ndarray objects with an
    integer dtype and the same number of dimensions. Indexing follows proposed
    rules for np.ndarray.vindex, which matches NumPy's advanced indexing rules
    (including broadcasting) except sliced axes are always moved to the end:
    https://github.com/numpy/numpy/pull/6256
    """"""

    __slots__ = ()

    def __init__(self, key):
        if not isinstance(key, tuple):
            raise TypeError(f""key must be a tuple: {key!r}"")

        new_key = []
        ndim = None
        for k in key:
            if isinstance(k, slice):
                k = as_integer_slice(k)
            elif isinstance(k, (np.ndarray, dask.array.Array)):
                if not np.issubdtype(k.dtype, np.integer):
                    raise TypeError(
                        f""invalid indexer array, does not have integer dtype: {k!r}""
                    )
                if ndim is None:
                    ndim = k.ndim
                elif ndim != k.ndim:
                    ndims = [k.ndim for k in key if isinstance(k, np.ndarray)]
                    raise ValueError(
                        ""invalid indexer key: ndarray arguments ""
                        f""have different numbers of dimensions: {ndims}""
                    )
                if isinstance(k, dask.array.Array):
                    k = dask.array.asarray(k, dtype=np.int64)
                else:
                    k = np.asarray(k, dtype=np.int64)
            else:
                raise TypeError(
                    f""unexpected indexer type for {type(self).__name__}: {k!r}""
                )
            new_key.append(k)

        super().__init__(new_key)
```
","{""total_count"": 2, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 2, ""eyes"": 0}",,374025325