html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/5278#issuecomment-856340566,https://api.github.com/repos/pydata/xarray/issues/5278,856340566,MDEyOklzc3VlQ29tbWVudDg1NjM0MDU2Ng==,7441788,2021-06-08T00:02:48Z,2021-06-08T00:04:28Z,CONTRIBUTOR,"> > almost no attention is paid to minimizing memory consumption (whether through in-place operations, or more generally minimizing temporary memory usage). > > I think we'd be open to fixing this when it doesn't compromise readability. Can you open a new issue with some particularly bad examples? I wouldn't necessarily say that it's particularly bad, but see the discussion following https://github.com/pydata/xarray/pull/2922#issuecomment-601496897. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,879033384 https://github.com/pydata/xarray/issues/5278#issuecomment-835597488,https://api.github.com/repos/pydata/xarray/issues/5278,835597488,MDEyOklzc3VlQ29tbWVudDgzNTU5NzQ4OA==,7441788,2021-05-09T00:40:08Z,2021-05-09T00:40:08Z,CONTRIBUTOR,"I'm not familiar at all with the various numpy interfaces, so I can't offer any input implementation-wise. But as a user, being able to do operations in place (via `out` or otherwise) is extremely useful when dealing with large arrays under memory constraints. 
In fact my one ""philosophical"" beef with xarray is that it seems almost no attention is paid to minimizing memory consumption (whether through in-place operations, or more generally minimizing temporary memory usage).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,879033384 https://github.com/pydata/xarray/issues/5278#issuecomment-834629871,https://api.github.com/repos/pydata/xarray/issues/5278,834629871,MDEyOklzc3VlQ29tbWVudDgzNDYyOTg3MQ==,7441788,2021-05-07T17:11:01Z,2021-05-07T17:11:14Z,CONTRIBUTOR,"> What is the case for having `out` kwargs? It lets you reuse memory you already have. In particular for a simple operation like clip, you can do it in-place: `da.clip(..., out=da.values)`. Very useful if you deal with lots of data and memory is a concern.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,879033384 https://github.com/pydata/xarray/issues/5261#issuecomment-834415185,https://api.github.com/repos/pydata/xarray/issues/5261,834415185,MDEyOklzc3VlQ29tbWVudDgzNDQxNTE4NQ==,7441788,2021-05-07T13:51:46Z,2021-05-07T13:53:08Z,CONTRIBUTOR,"I'm wondering if one could just have a generic implementation of `DataArray.func(*args, **kwargs)` for any unrecognized ""`func`"" that calls `np.func(self, *args, **kwargs)` (perhaps conditional on some global `numpy_fallthrough` option)? 
And similarly for `Dataset.func(*args, **kwargs)`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,876394165 https://github.com/pydata/xarray/issues/4363#issuecomment-695165172,https://api.github.com/repos/pydata/xarray/issues/4363,695165172,MDEyOklzc3VlQ29tbWVudDY5NTE2NTE3Mg==,7441788,2020-09-19T05:00:50Z,2020-09-19T05:00:50Z,CONTRIBUTOR,"Indeed, this is reported in https://github.com/pandas-dev/pandas/issues/35466#issuecomment-678407125 and https://github.com/pandas-dev/pandas/issues/35830. Also https://github.com/pandas-dev/pandas/pull/35478.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,683657289 https://github.com/pydata/xarray/pull/4292#issuecomment-693117098,https://api.github.com/repos/pydata/xarray/issues/4292,693117098,MDEyOklzc3VlQ29tbWVudDY5MzExNzA5OA==,7441788,2020-09-16T01:34:08Z,2020-09-16T01:34:08Z,CONTRIBUTOR,Does this fix #4363?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,669307837 https://github.com/pydata/xarray/issues/4044#issuecomment-625530671,https://api.github.com/repos/pydata/xarray/issues/4044,625530671,MDEyOklzc3VlQ29tbWVudDYyNTUzMDY3MQ==,7441788,2020-05-07T22:33:46Z,2020-05-07T22:33:46Z,CONTRIBUTOR,"@TomNicholas , yes, thank you.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,614149170 https://github.com/pydata/xarray/pull/2922#issuecomment-601885539,https://api.github.com/repos/pydata/xarray/issues/2922,601885539,MDEyOklzc3VlQ29tbWVudDYwMTg4NTUzOQ==,7441788,2020-03-20T19:57:54Z,2020-03-20T20:00:20Z,CONTRIBUTOR,"All good points: > What could be done, though is to only do da = da.fillna(0.0) if da contains NaNs. 
Good idea, though I don't know what the performance hit would be of the extra check (in the case that da does contain NaNs, so the check is for naught). > I assume so. I don't know what kind of temporary variables np.einsum creates. Also np.einsum is wrapped in xr.apply_ufunc so all kinds of magic is going on. Well, `(da * weights)` will be at least as large as `da`. I'm not certain, but I don't think np.einsum creates huge temporary arrays. > Do you want to leave it away for performance reasons? Because it was a deliberate decision to not support NaNs in the weights and I don't think this is going to change. Yes. You can continue not supporting NaNs in the weights, yet not explicitly check that there are no NaNs (optionally, if the caller assures you that there are no NaNs). > None of your suggested functions support NaNs so they won't work. Correct. These have nothing to do with the NaNs issue. For profiling memory usage, I use `psutil.Process(os.getpid()).memory_info().rss` for current usage and `resource.getrusage(resource.RUSAGE_SELF).ru_maxrss` for peak usage (on Linux).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437765416 https://github.com/pydata/xarray/pull/2922#issuecomment-601709733,https://api.github.com/repos/pydata/xarray/issues/2922,601709733,MDEyOklzc3VlQ29tbWVudDYwMTcwOTczMw==,7441788,2020-03-20T13:47:39Z,2020-03-20T16:31:14Z,CONTRIBUTOR,"@mathause, have you considered using these functions? - [np.average()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.average.html) to calculate weighted `mean()`. - [np.cov()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.cov.html) to calculate weighted `cov()`, `var()`, and `std()`. - [sp.stats.cumfreq()](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.cumfreq.html) to calculate weighted `median()` (I haven't thought this through). 
- [sp.spatial.distance.correlation()](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.correlation.html) to calculate weighted `corrcoef()`. (Of course one could also calculate this from weighted `cov()` (see above), but one would first need to mask the two arrays simultaneously.) - [sklearn.utils.extmath.weighted_mode()](https://scikit-learn.org/stable/modules/generated/sklearn.utils.extmath.weighted_mode.html) to calculate weighted `mode()`. - [gmisclib.weighted_percentile.{wp,wtd_median}()](http://kochanski.org/gpk/code/speechresearch/gmisclib/gmisclib.weighted_percentile-module.html) to calculate weighted `quantile()` and `median()`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437765416 https://github.com/pydata/xarray/pull/2922#issuecomment-601708110,https://api.github.com/repos/pydata/xarray/issues/2922,601708110,MDEyOklzc3VlQ29tbWVudDYwMTcwODExMA==,7441788,2020-03-20T13:44:03Z,2020-03-20T13:52:06Z,CONTRIBUTOR,"@mathause, ideally `dot()` would support `skipna`, so you could eliminate the `da = da.fillna(0.0)` and pass the `skipna` down the line. But alas it doesn't... `(da * weights).sum(dim=dim, skipna=skipna)` would likely make things worse, I think, as it would necessarily create a temporary array at least as large as `da`, no? Either way, this only addresses the `da = da.fillna(0.0)`, not the `mask = da.notnull()`. Also, perhaps the test `if weights.isnull().any()` in `Weighted.__init__()` should be optional? 
Maybe I'm more sensitive to this than others, but I regularly deal with 10-100GB arrays.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437765416 https://github.com/pydata/xarray/pull/2922#issuecomment-601699091,https://api.github.com/repos/pydata/xarray/issues/2922,601699091,MDEyOklzc3VlQ29tbWVudDYwMTY5OTA5MQ==,7441788,2020-03-20T13:25:21Z,2020-03-20T13:25:21Z,CONTRIBUTOR,"@max-sixty, I wish I could, but I'm afraid that I cannot submit code due to employer limitations.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437765416 https://github.com/pydata/xarray/pull/2922#issuecomment-601496897,https://api.github.com/repos/pydata/xarray/issues/2922,601496897,MDEyOklzc3VlQ29tbWVudDYwMTQ5Njg5Nw==,7441788,2020-03-20T02:11:53Z,2020-03-20T02:12:24Z,CONTRIBUTOR,"I realize this is a bit late, but I'm still concerned about memory usage, specifically in https://github.com/pydata/xarray/blob/master/xarray/core/weighted.py#L130 and https://github.com/pydata/xarray/blob/master/xarray/core/weighted.py#L143. If `da.sizes = {'dim_0': 100000, 'dim_1': 100000}`, the two lines above will cause `da.weighted(weights).mean('dim_0')` to create two simultaneous temporary 100000x100000 arrays, which could be problematic. I would have implemented this using ``apply_ufunc``, so that one creates these temporary variables only on as small an array as absolutely necessary -- in this case just of size `sizes['dim_0'] = 100000`. (Much as I would like to, I'm afraid I'm not able to contribute code.) 
Of course this won't help in the case one is summing over all dimensions, but might as well minimize memory usage in some cases even if not in all.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437765416 https://github.com/pydata/xarray/issues/3829#issuecomment-594682466,https://api.github.com/repos/pydata/xarray/issues/3829,594682466,MDEyOklzc3VlQ29tbWVudDU5NDY4MjQ2Ng==,7441788,2020-03-04T17:27:09Z,2020-03-04T17:27:09Z,CONTRIBUTOR,"@keewis, thanks for the suggestions. Both seem reasonable. In your first example, if you wanted to prohibit `obj.weighted.sum(dim)`, you could just check for `self._weight` in `sum()`. Though I suppose it would be nice to be able to have the interpreter enforce the requirement and not have to do an explicit check in every method.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,575564170 https://github.com/pydata/xarray/issues/3820#issuecomment-594128647,https://api.github.com/repos/pydata/xarray/issues/3820,594128647,MDEyOklzc3VlQ29tbWVudDU5NDEyODY0Nw==,7441788,2020-03-03T19:34:39Z,2020-03-03T19:34:39Z,CONTRIBUTOR,"Note that inferring dimensions from coords when it is a list of tuples does still work (with no deprecation warning): ``` In [1]: import numpy as np, xarray as xr In [2]: xr.DataArray(np.zeros((2, 2)), coords=[('x', [1, 2]), ('y', [1, 2])]) Out[2]: array([[0., 0.], [0., 0.]]) Coordinates: * x (x) int64 1 2 * y (y) int64 1 2 ```","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,574097799 https://github.com/pydata/xarray/issues/3810#issuecomment-592737661,https://api.github.com/repos/pydata/xarray/issues/3810,592737661,MDEyOklzc3VlQ29tbWVudDU5MjczNzY2MQ==,7441788,2020-02-28T21:29:58Z,2020-02-28T21:31:31Z,CONTRIBUTOR,"Note that with the `apply_ufunc` implementation we're 
only reshaping `dims`-sized `ndarray`s, not (necessarily) the whole DataArray, so maybe it's not too bad? It might be better to first sort `dims` to be in the same order as `self.dims`. i.e. `dims = [dim_ for dim_ in self.dims if dim_ in dims]`. But I'm just speculating.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480 https://github.com/pydata/xarray/issues/3810#issuecomment-592715925,https://api.github.com/repos/pydata/xarray/issues/3810,592715925,MDEyOklzc3VlQ29tbWVudDU5MjcxNTkyNQ==,7441788,2020-02-28T20:33:43Z,2020-02-28T20:35:57Z,CONTRIBUTOR,"A few minor tweaks needed: ``` In [20]: import bottleneck In [21]: xr.apply_ufunc( ...: lambda x: bottleneck.rankdata(x).reshape(x.shape), ...: d, ...: input_core_dims=[['xyz', 'abc']], ...: output_core_dims=[['xyz', 'abc']], ...: vectorize=True ...: ).transpose(*d.dims) Out[21]: array([[ 1., 2., 3.], [ 4., 5., 6.], [ 7., 8., 9.], [10., 11., 12.]]) Dimensions without coordinates: abc, xyz ``` Despite what the docs say, `bottleneck.{nan}rankdata(a)` returns a 1-dimensional ndarray, not an array with the same shape as `a`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480 https://github.com/pydata/xarray/issues/3810#issuecomment-592672463,https://api.github.com/repos/pydata/xarray/issues/3810,592672463,MDEyOklzc3VlQ29tbWVudDU5MjY3MjQ2Mw==,7441788,2020-02-28T18:51:18Z,2020-02-28T18:52:29Z,CONTRIBUTOR,"What's wrong with the following? (Still need to deal with `pct` and `keep_attrs`.) 
```` apply_ufunc( bottleneck.{nan}rankdata, self, input_core_dims=[dims], output_core_dims=[dims], vectorize=True ) ```` Per https://kwgoodman.github.io/bottleneck-doc/reference.html#bottleneck.rankdata, ""The default (axis=None) is to rank the elements of the flattened array.""","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480 https://github.com/pydata/xarray/issues/3810#issuecomment-592654794,https://api.github.com/repos/pydata/xarray/issues/3810,592654794,MDEyOklzc3VlQ29tbWVudDU5MjY1NDc5NA==,7441788,2020-02-28T18:06:57Z,2020-02-28T18:06:57Z,CONTRIBUTOR,"Assuming `dims` is a non-empty list of dimensions, the following code seems to work: ``` temp_dim = '__temp_dim__' return da.stack(**{temp_dim: dims}).\ rank(temp_dim, pct=pct, keep_attrs=keep_attrs).\ unstack(temp_dim).transpose(*da.dims).\ drop_vars([dim_ for dim_ in dims if dim_ not in da.coords]) ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480 https://github.com/pydata/xarray/issues/2017#issuecomment-592151913,https://api.github.com/repos/pydata/xarray/issues/2017,592151913,MDEyOklzc3VlQ29tbWVudDU5MjE1MTkxMw==,7441788,2020-02-27T20:04:44Z,2020-02-27T20:04:44Z,CONTRIBUTOR,I'm afraid I'm not able to submit a PR. 
Sorry.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,309098246 https://github.com/pydata/xarray/issues/2017#issuecomment-592033172,https://api.github.com/repos/pydata/xarray/issues/2017,592033172,MDEyOklzc3VlQ29tbWVudDU5MjAzMzE3Mg==,7441788,2020-02-27T15:55:23Z,2020-02-27T15:55:23Z,CONTRIBUTOR,"I think the only necessary changes are (a) delete the `if method != ""__call__""` check (https://github.com/pydata/xarray/blob/master/xarray/core/arithmetic.py#L49), and (b) in the `apply_ufunc()` call, replace `ufunc` with `getattr(ufunc, method)` (https://github.com/pydata/xarray/blob/master/xarray/core/arithmetic.py#L71).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,309098246 https://github.com/pydata/xarray/issues/2017#issuecomment-592027630,https://api.github.com/repos/pydata/xarray/issues/2017,592027630,MDEyOklzc3VlQ29tbWVudDU5MjAyNzYzMA==,7441788,2020-02-27T15:38:05Z,2020-02-27T15:38:05Z,CONTRIBUTOR,This issue is still relevant.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,309098246 https://github.com/pydata/xarray/issues/3736#issuecomment-582613810,https://api.github.com/repos/pydata/xarray/issues/3736,582613810,MDEyOklzc3VlQ29tbWVudDU4MjYxMzgxMA==,7441788,2020-02-05T21:09:43Z,2020-02-05T21:09:43Z,CONTRIBUTOR,This is fixed in Pandas 1.0.1.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,558204984 https://github.com/pydata/xarray/issues/3736#issuecomment-580897245,https://api.github.com/repos/pydata/xarray/issues/3736,580897245,MDEyOklzc3VlQ29tbWVudDU4MDg5NzI0NQ==,7441788,2020-01-31T20:24:30Z,2020-01-31T20:30:22Z,CONTRIBUTOR,Pandas bug: https://github.com/pandas-dev/pandas/issues/31501,"{""total_count"": 0, ""+1"": 0, 
""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,558204984 https://github.com/pydata/xarray/issues/1635#issuecomment-545261919,https://api.github.com/repos/pydata/xarray/issues/1635,545261919,MDEyOklzc3VlQ29tbWVudDU0NTI2MTkxOQ==,7441788,2019-10-23T04:35:37Z,2019-10-23T04:35:37Z,CONTRIBUTOR,I think this issue is still relevant.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,266133430 https://github.com/pydata/xarray/issues/3236#issuecomment-523374500,https://api.github.com/repos/pydata/xarray/issues/3236,523374500,MDEyOklzc3VlQ29tbWVudDUyMzM3NDUwMA==,7441788,2019-08-21T09:22:35Z,2019-08-21T09:24:43Z,CONTRIBUTOR,"I was thinking a `tuple`/`list` (corresponding to `args`) of `dict`s (dim -> value) containing the non-input_core_dims being evaluated. (If it weren't for `exclude_dims`, if I understand it correctly, I think one would need only a single `dict` (dim -> value).) 
I would be fine with this being an optional kwarg to the actual `func`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,483028482 https://github.com/pydata/xarray/issues/1266#issuecomment-500217864,https://api.github.com/repos/pydata/xarray/issues/1266,500217864,MDEyOklzc3VlQ29tbWVudDUwMDIxNzg2NA==,7441788,2019-06-09T14:50:17Z,2019-06-09T14:50:17Z,CONTRIBUTOR,I think this is still an issue.,"{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,207317762 https://github.com/pydata/xarray/issues/1666#issuecomment-438749995,https://api.github.com/repos/pydata/xarray/issues/1666,438749995,MDEyOklzc3VlQ29tbWVudDQzODc0OTk5NQ==,7441788,2018-11-14T17:35:17Z,2018-11-14T17:38:31Z,CONTRIBUTOR,"Also, the following code seems to accomplish the same as the above: ``` def apply_func_rolling(func, *args, **kwargs): # determine rolling parameters, and remove them from kwargs apply_func_kwargs = {'input_core_dims', 'output_core_dims', 'vectorize', 'join', 'dataset_join', 'keep_attrs', 'exclude_dims', 'dataset_fill_value', 'kwargs', 'dask', 'output_dtypes', 'output_sizes'} min_periods = kwargs.pop('min_periods', None) center = kwargs.pop('center', False) dim = xr.core.utils.either_dict_or_kwargs(kwargs.pop('dim', None), {k: v for k, v in kwargs.items() if k not in apply_func_kwargs}, 'apply_func_rolling') if len(dim) != 1: raise ValueError(""precisely one rolling dimension must be specified"") rolling_dim = list(dim.keys())[0] kwargs.pop(rolling_dim) temp_rolling_dim = '__temp__{}__'.format(rolling_dim) # change input_core_dims rolling_dim values to temp_rolling_dim input_core_dims = kwargs.get('input_core_dims', None) if input_core_dims: kwargs['input_core_dims'] = [[(temp_rolling_dim if (dim_ == rolling_dim) else dim_) for dim_ in dims_] for dims_ in input_core_dims] # change exclude_dims rolling_dim values to 
temp_rolling_dim exclude_dims = kwargs.get('exclude_dims', None) if exclude_dims: kwargs['exclude_dims'] = [[(temp_rolling_dim if (dim_ == rolling_dim) else dim_) for dim_ in dims_] for dims_ in exclude_dims] # call apply_func() with rolling-constructed objects return xr.apply_ufunc(func, *[(arg.rolling(dim=dim, min_periods=min_periods, center=center). construct(temp_rolling_dim) if (rolling_dim in arg.dims) else arg) for arg in args], **kwargs) apply_func_rolling(lambda a, b, w: ..., variables, observations, weights, date=N, input_core_dims=[['date', 'dim1', 'dim2', 'var'], ['date', 'dim1', 'dim2'], ['date', 'dim1', 'dim2']], output_core_dims=[['var']], vectorize=True) ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904 https://github.com/pydata/xarray/issues/1666#issuecomment-438372589,https://api.github.com/repos/pydata/xarray/issues/1666,438372589,MDEyOklzc3VlQ29tbWVudDQzODM3MjU4OQ==,7441788,2018-11-13T17:56:17Z,2018-11-13T20:16:57Z,CONTRIBUTOR,"> construct method does not allocate that large array in memory. It uses the strided trick and therefore consumes only the order of 1000x1000x1000. Ah. I didn't realize that. Good to know. What I'm actually looking to do is a rolling weighted regression. I have three DataArrays: - observations, dims=('date', 'dim1', 'dim2') - variables, dims=('date', 'dim1', 'dim2', 'var') - weights, dims=('date', 'dim1', 'dim2') I want to calculate a regression_coefficients DataArray with dims=('date', 'var'), where for each date it has the weighted regression coefficients calculated over the trailing N dates (over 'dim1' and 'dim2'). One way would be to put the three DataArrays in a Dataset, and then use a newly-defined `Dataset.rolling().apply()`. Another way would be to use an enhanced version of `apply_ufunc()` that can take `Rolling` objects. 
But now that I know that `DataArrayRolling.construct()` won't kill my machine, I'll try `apply_ufunc()` with the three `DataArrayRolling.construct()` objects. I'd welcome other suggestions. OK, I seem to have got my problem working using: ``` apply_ufunc(lambda a, b, w: ..., variables.rolling(date=N).construct('temp_date'), observations.rolling(date=N).construct('temp_date'), weights.rolling(date=N).construct('temp_date'), input_core_dims=[['temp_date', 'dim1', 'dim2', 'var'], ['temp_date', 'dim1', 'dim2'], ['temp_date', 'dim1', 'dim2']], output_core_dims=[['var']], vectorize=True) ``` Still, I wonder if there isn't a more ""natural"" way of accomplishing this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904 https://github.com/pydata/xarray/issues/1666#issuecomment-438351586,https://api.github.com/repos/pydata/xarray/issues/1666,438351586,MDEyOklzc3VlQ29tbWVudDQzODM1MTU4Ng==,7441788,2018-11-13T17:06:38Z,2018-11-13T17:06:38Z,CONTRIBUTOR,"I think there are actually a couple different ways `Rolling.apply` could work, but this seems like one possible way: ``` from xarray.core.utils import maybe_wrap_array from xarray.core.combine import concat def rolling_apply(rolling, func, *args, **kwargs): applied = [maybe_wrap_array(label, func(arr, *args, **kwargs)) for label, arr in rolling] combined = concat(applied, dim=rolling.obj.coords[rolling.dim]) return combined ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904 https://github.com/pydata/xarray/issues/1666#issuecomment-438345386,https://api.github.com/repos/pydata/xarray/issues/1666,438345386,MDEyOklzc3VlQ29tbWVudDQzODM0NTM4Ng==,7441788,2018-11-13T16:50:08Z,2018-11-13T16:52:13Z,CONTRIBUTOR,The problem I have with `Rolling.construct` is the same that I have with `Rolling.reduce`: it's very (potentially) memory-inefficient. E.g. 
consider a 1000x1000x1000 array for which I want to apply a rolling window of length 500 along the final dimension; I believe `Rolling.construct/reduce` will construct a 1000x1000x1000x500 array. This can quickly get out of hand.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904 https://github.com/pydata/xarray/issues/1666#issuecomment-438344608,https://api.github.com/repos/pydata/xarray/issues/1666,438344608,MDEyOklzc3VlQ29tbWVudDQzODM0NDYwOA==,7441788,2018-11-13T16:48:09Z,2018-11-13T16:48:09Z,CONTRIBUTOR,"Separately, maybe `apply_ufunc` should accept `Rolling` objects?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904 https://github.com/pydata/xarray/issues/1666#issuecomment-438322013,https://api.github.com/repos/pydata/xarray/issues/1666,438322013,MDEyOklzc3VlQ29tbWVudDQzODMyMjAxMw==,7441788,2018-11-13T16:06:24Z,2018-11-13T16:06:24Z,CONTRIBUTOR,I think what is needed are `DataArrayRolling.apply` and `DatasetRolling.apply` (like `DataArrayGroupBy.apply` and `DatasetGroupBy.apply`). The problem with the `reduce` methods is that they are memory-inefficient.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904 https://github.com/pydata/xarray/issues/1077#issuecomment-436015893,https://api.github.com/repos/pydata/xarray/issues/1077,436015893,MDEyOklzc3VlQ29tbWVudDQzNjAxNTg5Mw==,7441788,2018-11-05T20:03:48Z,2018-11-05T20:03:48Z,CONTRIBUTOR,"This code isn't particularly pretty, and I'm not sure if it handles all cases, but it enables serialization of MultiIndex indices by calling `ds.mi.encode_multiindices()` before serializing and `ds.mi.decode_multiindices()` after deserializing. 
``` @xr.register_dataset_accessor('mi') class MiscDatasetAccessor(): def __init__(self, xarray_obj): self._obj = xarray_obj def encode_multiindices(self): result = self._obj for name, index in list(self._obj.indexes.items()): if isinstance(index, pd.MultiIndex): temp_name = '__' + name new_coords = {'{}__{}'.format(temp_name, level_name): level_values.rename(None) for level_name, level_values in zip(index.names, index.levels)} new_coords[temp_name] = xr.DataArray(index.labels, dims=('{}__names__'.format(temp_name), '{}__num__'.format(temp_name)), coords={'{}__names__'.format(temp_name): index.names, '{}__num__'.format(temp_name): list(range(len(index)))}, attrs={'__is_multiindex': 1}) result = result.drop(name).assign_coords(**new_coords) return result def decode_multiindices(self): result = self._obj for temp_name, da in list(self._obj.coords.items()): if temp_name.startswith('__') and da.attrs.get('__is_multiindex', False): name = temp_name[2:] level_names = da.coords['{}__names__'.format(temp_name)].values levels = [result.coords['{}__{}'.format(temp_name, level_name)].values for level_name in level_names] labels = da.values result = result.assign_coords(**{name: pd.MultiIndex(levels=levels, labels=labels, names=level_names)}) result = result.drop(['{}__{}'.format(temp_name, level_name) for level_name in level_names] + list(da.dims) + [temp_name]) return result ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187069161 https://github.com/pydata/xarray/issues/2170#issuecomment-407159487,https://api.github.com/repos/pydata/xarray/issues/2170,407159487,MDEyOklzc3VlQ29tbWVudDQwNzE1OTQ4Nw==,7441788,2018-07-23T18:39:52Z,2018-07-23T18:39:52Z,CONTRIBUTOR,"I second this request. 
The following may not be optimal, but seems to work for me as a `keepdims=True` version of `reduce()`: ``` def dim_preserving_reduce(self, func, dim=None, axis=None, label=None, keep_attrs=False, **kwargs): if axis is not None: dim = np.take(self._obj.dims, axis, mode='wrap') dims = dim if isinstance(dim, (list, tuple)) else [dim] dims_coords = {dim: [lab] for dim, lab in zip(dims, (label if isinstance(label, list) else [label]))} return self._obj.reduce(func, dim=dims, keep_attrs=keep_attrs, **kwargs). \ expand_dims(dims, axis=[self._obj.dims.index(dim) for dim in dims]). \ assign_coords(**dims_coords) ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,325436508 https://github.com/pydata/xarray/pull/2293#issuecomment-406639293,https://api.github.com/repos/pydata/xarray/issues/2293,406639293,MDEyOklzc3VlQ29tbWVudDQwNjYzOTI5Mw==,7441788,2018-07-20T15:40:23Z,2018-07-20T15:40:23Z,CONTRIBUTOR,I added a note to whats-new.rst.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,341664808 https://github.com/pydata/xarray/pull/2293#issuecomment-406485754,https://api.github.com/repos/pydata/xarray/issues/2293,406485754,MDEyOklzc3VlQ29tbWVudDQwNjQ4NTc1NA==,7441788,2018-07-20T04:27:51Z,2018-07-20T04:28:35Z,CONTRIBUTOR,"One slight oddity is that ``formatting.format_array_flat(np.arange(4), 0)`` returns ``0 ... 3`` even though ``0 1 2 3`` would take up the same number of characters. 
I'm not inclined to add a special case for this, but let me know if you think I should.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,341664808 https://github.com/pydata/xarray/pull/2293#issuecomment-406390564,https://api.github.com/repos/pydata/xarray/issues/2293,406390564,MDEyOklzc3VlQ29tbWVudDQwNjM5MDU2NA==,7441788,2018-07-19T19:38:24Z,2018-07-19T19:38:24Z,CONTRIBUTOR,"@shoyer, I think I've implemented all your suggestions. Let me know what you think. (I haven't yet updated whats-new.rst.)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,341664808 https://github.com/pydata/xarray/issues/1186#issuecomment-405700114,https://api.github.com/repos/pydata/xarray/issues/1186,405700114,MDEyOklzc3VlQ29tbWVudDQwNTcwMDExNA==,7441788,2018-07-17T19:28:49Z,2018-07-17T19:28:49Z,CONTRIBUTOR,I included sample output in https://github.com/pydata/xarray/pull/2293#issuecomment-405369643 and https://github.com/pydata/xarray/pull/2293/files#diff-f82411dbe6aa53e3b6a5d9c2b601094c.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,197709208 https://github.com/pydata/xarray/pull/2285#issuecomment-405369880,https://api.github.com/repos/pydata/xarray/issues/2285,405369880,MDEyOklzc3VlQ29tbWVudDQwNTM2OTg4MA==,7441788,2018-07-16T20:23:50Z,2018-07-16T20:23:50Z,CONTRIBUTOR,Replaced with https://github.com/pydata/xarray/pull/2293.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,341149017 https://github.com/pydata/xarray/pull/2293#issuecomment-405369643,https://api.github.com/repos/pydata/xarray/issues/2293,405369643,MDEyOklzc3VlQ29tbWVudDQwNTM2OTY0Mw==,7441788,2018-07-16T20:23:02Z,2018-07-16T20:23:02Z,CONTRIBUTOR,"Sample output: ``` (base) 
C:\Users\Seth\github\xarray>ipython Python 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 16:13:55) [MSC v.1900 64 bit (AMD64)] Type 'copyright', 'credits' or 'license' for more information IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help. In [1]: import xarray as xr In [2]: words = ""This is the time for all good men to come to the aid of their country"".split(' ') In [3]: for i in range(0, len(words) + 1): ...: print(""-------------------------------------------------------------------------------"") ...: print(xr.DataArray(words[:i], dims=('foo',), coords={'foo': words[:i]})) ...: ------------------------------------------------------------------------------- array([], dtype=float64) Coordinates: * foo (foo) float64 ------------------------------------------------------------------------------- array(['This'], dtype=' array(['This', 'is'], dtype=' array(['This', 'is', 'the'], dtype=' array(['This', 'is', 'the', 'time'], dtype=' array(['This', 'is', 'the', 'time', 'for'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of', 'their'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 
'to', 'come', 'to', 'the', 'aid', 'of', 'their', 'country'], dtype='ipython Python 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 16:13:55) [MSC v.1900 64 bit (AMD64)] Type 'copyright', 'credits' or 'license' for more information IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help. In [1]: import xarray as xr In [2]: words = ""This is the time for all good men to come to the aid of their country"".split(' ') In [3]: for i in range(0, len(words) + 1): ...: print(""-------------------------------------------------------------------------------"") ...: print(xr.DataArray(words[:i], dims=('foo',), coords={'foo': words[:i]})) ...: ------------------------------------------------------------------------------- array([], dtype=float64) Coordinates: * foo (foo) float64 ------------------------------------------------------------------------------- array(['This'], dtype=' array(['This', 'is'], dtype=' array(['This', 'is', 'the'], dtype=' array(['This', 'is', 'the', 'time'], dtype=' array(['This', 'is', 'the', 'time', 'for'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of'], dtype=' array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of', 'their'], dtype=' array(['This', 'is', 'the', 
'time', 'for', 'all', 'good', 'men', 'to', 'come', 'to', 'the', 'aid', 'of', 'their', 'country'], dtype='