html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/5278#issuecomment-856340566,https://api.github.com/repos/pydata/xarray/issues/5278,856340566,MDEyOklzc3VlQ29tbWVudDg1NjM0MDU2Ng==,7441788,2021-06-08T00:02:48Z,2021-06-08T00:04:28Z,CONTRIBUTOR,"> > almost no attention is paid to minimizing memory consumption (whether through in-place operations, or more generally minimizing temporary memory usage).
>
> I think we'd be open to fixing this when it doesn't compromise readability. Can you open a new issue with some particularly bad examples?
I wouldn't necessarily say that it's particularly bad, but see the discussion following https://github.com/pydata/xarray/pull/2922#issuecomment-601496897.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,879033384
https://github.com/pydata/xarray/issues/5278#issuecomment-835597488,https://api.github.com/repos/pydata/xarray/issues/5278,835597488,MDEyOklzc3VlQ29tbWVudDgzNTU5NzQ4OA==,7441788,2021-05-09T00:40:08Z,2021-05-09T00:40:08Z,CONTRIBUTOR,"I'm not familiar at all with the various numpy interfaces, so I can't offer any input implementation-wise. But as a user, being able to do operations in place (via `out` or otherwise) is extremely useful when dealing with large arrays under memory constraints. In fact my one ""philosophical"" beef with xarray is that it seems almost no attention is paid to minimizing memory consumption (whether through in-place operations, or more generally minimizing temporary memory usage).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,879033384
https://github.com/pydata/xarray/issues/5278#issuecomment-834629871,https://api.github.com/repos/pydata/xarray/issues/5278,834629871,MDEyOklzc3VlQ29tbWVudDgzNDYyOTg3MQ==,7441788,2021-05-07T17:11:01Z,2021-05-07T17:11:14Z,CONTRIBUTOR,"> What is the case for having `out` kwargs?
It lets you reuse memory you already have. In particular for a simple operation like clip, you can do it in-place: `da.clip(..., out=da.values)`. Very useful if you deal with lots of data and memory is a concern.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,879033384
https://github.com/pydata/xarray/issues/5261#issuecomment-834415185,https://api.github.com/repos/pydata/xarray/issues/5261,834415185,MDEyOklzc3VlQ29tbWVudDgzNDQxNTE4NQ==,7441788,2021-05-07T13:51:46Z,2021-05-07T13:53:08Z,CONTRIBUTOR,"I'm wondering if one could just have a generic implementation of `DataArray.func(*args, **kwargs)` for any unrecognized ""`func`"" that calls `np.func(self, *args, **kwargs)` (perhaps conditional on some global `numpy_fallthrough` option)? And similarly for `Dataset.func(*args, **kwargs)`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,876394165
https://github.com/pydata/xarray/issues/4363#issuecomment-695165172,https://api.github.com/repos/pydata/xarray/issues/4363,695165172,MDEyOklzc3VlQ29tbWVudDY5NTE2NTE3Mg==,7441788,2020-09-19T05:00:50Z,2020-09-19T05:00:50Z,CONTRIBUTOR,"Indeed, this is reported in https://github.com/pandas-dev/pandas/issues/35466#issuecomment-678407125 and https://github.com/pandas-dev/pandas/issues/35830. Also https://github.com/pandas-dev/pandas/pull/35478.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,683657289
https://github.com/pydata/xarray/pull/4292#issuecomment-693117098,https://api.github.com/repos/pydata/xarray/issues/4292,693117098,MDEyOklzc3VlQ29tbWVudDY5MzExNzA5OA==,7441788,2020-09-16T01:34:08Z,2020-09-16T01:34:08Z,CONTRIBUTOR,Does this fix #4363?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,669307837
https://github.com/pydata/xarray/issues/4044#issuecomment-625530671,https://api.github.com/repos/pydata/xarray/issues/4044,625530671,MDEyOklzc3VlQ29tbWVudDYyNTUzMDY3MQ==,7441788,2020-05-07T22:33:46Z,2020-05-07T22:33:46Z,CONTRIBUTOR,"@TomNicholas , yes, thank you.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,614149170
https://github.com/pydata/xarray/pull/2922#issuecomment-601885539,https://api.github.com/repos/pydata/xarray/issues/2922,601885539,MDEyOklzc3VlQ29tbWVudDYwMTg4NTUzOQ==,7441788,2020-03-20T19:57:54Z,2020-03-20T20:00:20Z,CONTRIBUTOR,"All good points:
> What could be done, though is to only do da = da.fillna(0.0) if da contains NaNs.
Good idea, though I don't know what the performance hit would be of the extra check (in the case that da does contain NaNs, so the check is for naught).
> I assume so. I don't know what kind of temporary variables np.einsum creates. Also np.einsum is wrapped in xr.apply_ufunc so all kinds of magic is going on.
Well, `(da * weights)` will be at least as large as `da`. I'm not certain, but I don't think np.einsum creates huge temporary arrays.
> Do you want to leave it away for performance reasons? Because it was a deliberate decision to not support NaNs in the weights and I don't think this is going to change.
Yes. You can continue not supporting NaNs in the weights, yet not explicitly check that there are no NaNs (optionally, if the caller assures you that there are no NaNs).
> None of your suggested functions support NaNs so they won't work.
Correct. These have nothing to do with the NaNs issue.
For profiling memory usage, I use `psutil.Process(os.getpid()).memory_info().rss` for current usage and `resource.getrusage(resource.RUSAGE_SELF).ru_maxrss` for peak usage (on Linux).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437765416
https://github.com/pydata/xarray/pull/2922#issuecomment-601709733,https://api.github.com/repos/pydata/xarray/issues/2922,601709733,MDEyOklzc3VlQ29tbWVudDYwMTcwOTczMw==,7441788,2020-03-20T13:47:39Z,2020-03-20T16:31:14Z,CONTRIBUTOR,"@mathause, have you considered using these functions?
- [np.average()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.average.html) to calculate weighted `mean()`.
- [np.cov()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.cov.html) to calculate weighted `cov()`, `var()`, and `std()`.
- [sp.stats.cumfreq()](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.cumfreq.html) to calculate weighted `median()` (I haven't thought this through).
- [sp.spatial.distance.correlation()](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.correlation.html) to calculate weighted `corrcoef()`. (Of course one could also calculate this from weighted `cov()` (see above), but first need to mask the two arrays simultaneously.)
- [sklearn.utils.extmath.weighted_mode()](https://scikit-learn.org/stable/modules/generated/sklearn.utils.extmath.weighted_mode.html) to calculate weighted `mode()`.
- [gmisclib.weighted_percentile.{wp,wtd_median}()](http://kochanski.org/gpk/code/speechresearch/gmisclib/gmisclib.weighted_percentile-module.html) to calculate weighted `quantile()` and `median()`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437765416
https://github.com/pydata/xarray/pull/2922#issuecomment-601708110,https://api.github.com/repos/pydata/xarray/issues/2922,601708110,MDEyOklzc3VlQ29tbWVudDYwMTcwODExMA==,7441788,2020-03-20T13:44:03Z,2020-03-20T13:52:06Z,CONTRIBUTOR,"@mathause, ideally `dot()` would support `skipna`, so you could eliminate the `da = da.fillna(0.0)` and pass the `skipna` down the line. But alas it doesn't...
`(da * weights).sum(dim=dim, skipna=skipna)` would likely make things worse, I think, as it would necessarily create a temporary array at least as large as `da`, no?
Either way, this only addresses the `da = da.fillna(0.0)`, not the `mask = da.notnull()`.
Also, perhaps the test `if weights.isnull().any()` in `Weighted.__init__()` should be optional?
Maybe I'm more sensitive to this than others, but I regularly deal with 10-100GB arrays.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437765416
https://github.com/pydata/xarray/pull/2922#issuecomment-601699091,https://api.github.com/repos/pydata/xarray/issues/2922,601699091,MDEyOklzc3VlQ29tbWVudDYwMTY5OTA5MQ==,7441788,2020-03-20T13:25:21Z,2020-03-20T13:25:21Z,CONTRIBUTOR,"@max-sixty, I wish I could, but I'm afraid that I cannot submit code due to employer limitations.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437765416
https://github.com/pydata/xarray/pull/2922#issuecomment-601496897,https://api.github.com/repos/pydata/xarray/issues/2922,601496897,MDEyOklzc3VlQ29tbWVudDYwMTQ5Njg5Nw==,7441788,2020-03-20T02:11:53Z,2020-03-20T02:12:24Z,CONTRIBUTOR,"I realize this is a bit late, but I'm still concerned about memory usage, specifically in https://github.com/pydata/xarray/blob/master/xarray/core/weighted.py#L130 and https://github.com/pydata/xarray/blob/master/xarray/core/weighted.py#L143.
If `da.sizes = {'dim_0': 100000, 'dim_1': 100000}`, the two lines above will cause `da.weighted(weights).mean('dim_0')` to create two simultaneous temporary 100000x100000 arrays, which could be problematic.
I would have implemented this using ``apply_ufunc``, so that one creates these temporary variables only on as small an array as absolutely necessary -- in this case just of size `sizes['dim_0'] = 100000`. (Much as I would like to, I'm afraid I'm not able to contribute code.) Of course this won't help in the case one is summing over all dimensions, but might as well minimize memory usage in some cases even if not in all.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,437765416
https://github.com/pydata/xarray/issues/3829#issuecomment-594682466,https://api.github.com/repos/pydata/xarray/issues/3829,594682466,MDEyOklzc3VlQ29tbWVudDU5NDY4MjQ2Ng==,7441788,2020-03-04T17:27:09Z,2020-03-04T17:27:09Z,CONTRIBUTOR,"@keewis, thanks for the suggestions. Both seem reasonable.
In your first example, if you wanted to prohibit `obj.weighted.sum(dim)`, you could just check for `self._weight` in `sum()`. Though I suppose it would be nice to be able to have the interpreter enforce the requirement and not have to do an explicit check in every method.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,575564170
https://github.com/pydata/xarray/issues/3820#issuecomment-594128647,https://api.github.com/repos/pydata/xarray/issues/3820,594128647,MDEyOklzc3VlQ29tbWVudDU5NDEyODY0Nw==,7441788,2020-03-03T19:34:39Z,2020-03-03T19:34:39Z,CONTRIBUTOR,"Note that inferring dimensions from coords when it is a list of tuples does still work (with no deprecation warning):
```
In [1]: import numpy as np, xarray as xr
In [2]: xr.DataArray(np.zeros((2, 2)), coords=[('x', [1, 2]), ('y', [1, 2])])
Out[2]:
array([[0., 0.],
[0., 0.]])
Coordinates:
* x (x) int64 1 2
* y (y) int64 1 2
```","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,574097799
https://github.com/pydata/xarray/issues/3810#issuecomment-592737661,https://api.github.com/repos/pydata/xarray/issues/3810,592737661,MDEyOklzc3VlQ29tbWVudDU5MjczNzY2MQ==,7441788,2020-02-28T21:29:58Z,2020-02-28T21:31:31Z,CONTRIBUTOR,"Note that with the `apply_ufunc` implementation we're only reshaping `dims`-sized `ndarray`s, not (necessarily) the whole DataArray, so maybe it's not too bad? It might be better to first sort `dims` to be in the same order as `self.dims`. i.e. `dims = [dim_ for dim_ in self.dims if dim_ in dims]`. But I'm just speculating.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592715925,https://api.github.com/repos/pydata/xarray/issues/3810,592715925,MDEyOklzc3VlQ29tbWVudDU5MjcxNTkyNQ==,7441788,2020-02-28T20:33:43Z,2020-02-28T20:35:57Z,CONTRIBUTOR,"A few minor tweaks needed:
```
In [20]: import bottleneck
In [21]: xr.apply_ufunc(
    ...:     lambda x: bottleneck.rankdata(x).reshape(x.shape),
    ...:     d,
    ...:     input_core_dims=[['xyz', 'abc']],
    ...:     output_core_dims=[['xyz', 'abc']],
    ...:     vectorize=True
    ...: ).transpose(*d.dims)
Out[21]:
array([[ 1., 2., 3.],
[ 4., 5., 6.],
[ 7., 8., 9.],
[10., 11., 12.]])
Dimensions without coordinates: abc, xyz
```
Despite what the docs say, `bottleneck.{nan}rankdata(a)` returns a 1-dimensional ndarray, not an array with the same shape as `a`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592672463,https://api.github.com/repos/pydata/xarray/issues/3810,592672463,MDEyOklzc3VlQ29tbWVudDU5MjY3MjQ2Mw==,7441788,2020-02-28T18:51:18Z,2020-02-28T18:52:29Z,CONTRIBUTOR,"What's wrong with the following? (Still need to deal with `pct` and `keep_attrs`.)
````
apply_ufunc(
    bottleneck.{nan}rankdata,
    self,
    input_core_dims=[dims],
    output_core_dims=[dims],
    vectorize=True
)
````
Per https://kwgoodman.github.io/bottleneck-doc/reference.html#bottleneck.rankdata, ""The default (axis=None) is to rank the elements of the flattened array.""","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/3810#issuecomment-592654794,https://api.github.com/repos/pydata/xarray/issues/3810,592654794,MDEyOklzc3VlQ29tbWVudDU5MjY1NDc5NA==,7441788,2020-02-28T18:06:57Z,2020-02-28T18:06:57Z,CONTRIBUTOR,"Assuming `dims` is a non-empty list of dimensions, the following code seems to work:
```
temp_dim = '__temp_dim__'
return da.stack(**{temp_dim: dims}).\
    rank(temp_dim, pct=pct, keep_attrs=keep_attrs).\
    unstack(temp_dim).transpose(*da.dims).\
    drop_vars([dim_ for dim_ in dims if dim_ not in da.coords])
```
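For readers without xarray at hand, the same idea can be sketched in plain NumPy: collapse the dimensions being ranked into one temporary axis, rank along it, then restore the original layout. This is only an illustration under stated assumptions: `rank_over_axes` is a hypothetical helper, and it assigns ordinal ranks (ties broken by position) rather than the averaged ranks that bottleneck produces.

```python
import numpy as np

def rank_over_axes(a, axes):
    # Hypothetical helper: 1-based ordinal ranks computed jointly over `axes`.
    axes = tuple(axes)
    tail = tuple(range(-len(axes), 0))
    # Move the ranked axes to the end and collapse them into one axis
    # (the NumPy analogue of stacking `dims` into a temporary dimension).
    moved = np.moveaxis(a, axes, tail)
    flat = moved.reshape(moved.shape[:-len(axes)] + (-1,))
    # argsort of argsort yields 0-based ordinal ranks along the last axis.
    ranks = flat.argsort(axis=-1).argsort(axis=-1) + 1
    # Undo the collapse and put the axes back where they were
    # (the analogue of unstack followed by transpose).
    return np.moveaxis(ranks.reshape(moved.shape), tail, axes)

a = np.array([[3, 1], [2, 0]])
print(rank_over_axes(a, (0, 1)))  # [[4 2], [3 1]]
```

Any axes not listed in `axes` are treated as batch axes, so each slice along them is ranked independently.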
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,572875480
https://github.com/pydata/xarray/issues/2017#issuecomment-592151913,https://api.github.com/repos/pydata/xarray/issues/2017,592151913,MDEyOklzc3VlQ29tbWVudDU5MjE1MTkxMw==,7441788,2020-02-27T20:04:44Z,2020-02-27T20:04:44Z,CONTRIBUTOR,I'm afraid I'm not able to submit a PR. Sorry.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,309098246
https://github.com/pydata/xarray/issues/2017#issuecomment-592033172,https://api.github.com/repos/pydata/xarray/issues/2017,592033172,MDEyOklzc3VlQ29tbWVudDU5MjAzMzE3Mg==,7441788,2020-02-27T15:55:23Z,2020-02-27T15:55:23Z,CONTRIBUTOR,"I think the only necessary changes are (a) delete the `if method != ""__call__""` check (https://github.com/pydata/xarray/blob/master/xarray/core/arithmetic.py#L49), and (b) in the `apply_ufunc()` call, replace `ufunc` with `getattr(ufunc, method)` (https://github.com/pydata/xarray/blob/master/xarray/core/arithmetic.py#L71).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,309098246
https://github.com/pydata/xarray/issues/2017#issuecomment-592027630,https://api.github.com/repos/pydata/xarray/issues/2017,592027630,MDEyOklzc3VlQ29tbWVudDU5MjAyNzYzMA==,7441788,2020-02-27T15:38:05Z,2020-02-27T15:38:05Z,CONTRIBUTOR,This issue is still relevant.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,309098246
https://github.com/pydata/xarray/issues/3736#issuecomment-582613810,https://api.github.com/repos/pydata/xarray/issues/3736,582613810,MDEyOklzc3VlQ29tbWVudDU4MjYxMzgxMA==,7441788,2020-02-05T21:09:43Z,2020-02-05T21:09:43Z,CONTRIBUTOR,This is fixed in Pandas 1.0.1.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,558204984
https://github.com/pydata/xarray/issues/3736#issuecomment-580897245,https://api.github.com/repos/pydata/xarray/issues/3736,580897245,MDEyOklzc3VlQ29tbWVudDU4MDg5NzI0NQ==,7441788,2020-01-31T20:24:30Z,2020-01-31T20:30:22Z,CONTRIBUTOR,Pandas bug: https://github.com/pandas-dev/pandas/issues/31501,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,558204984
https://github.com/pydata/xarray/issues/1635#issuecomment-545261919,https://api.github.com/repos/pydata/xarray/issues/1635,545261919,MDEyOklzc3VlQ29tbWVudDU0NTI2MTkxOQ==,7441788,2019-10-23T04:35:37Z,2019-10-23T04:35:37Z,CONTRIBUTOR,I think this issue is still relevant.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,266133430
https://github.com/pydata/xarray/issues/3236#issuecomment-523374500,https://api.github.com/repos/pydata/xarray/issues/3236,523374500,MDEyOklzc3VlQ29tbWVudDUyMzM3NDUwMA==,7441788,2019-08-21T09:22:35Z,2019-08-21T09:24:43Z,CONTRIBUTOR,"I was thinking a `tuple`/`list` (corresponding to `args`) of `dict`s (dim -> value) containing the non-input_core_dims being evaluated. (If it weren't for `exclude_dims`, if I understand it correctly, I think one would need only a single `dict` (dim -> value).)
I would be fine with this being an optional kwarg to the actual `func`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,483028482
https://github.com/pydata/xarray/issues/1266#issuecomment-500217864,https://api.github.com/repos/pydata/xarray/issues/1266,500217864,MDEyOklzc3VlQ29tbWVudDUwMDIxNzg2NA==,7441788,2019-06-09T14:50:17Z,2019-06-09T14:50:17Z,CONTRIBUTOR,I think this is still an issue.,"{""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,207317762
https://github.com/pydata/xarray/issues/1666#issuecomment-438749995,https://api.github.com/repos/pydata/xarray/issues/1666,438749995,MDEyOklzc3VlQ29tbWVudDQzODc0OTk5NQ==,7441788,2018-11-14T17:35:17Z,2018-11-14T17:38:31Z,CONTRIBUTOR,"Also, the following code seems to accomplish the same as the above:
```
def apply_func_rolling(func, *args, **kwargs):
    # determine rolling parameters, and remove them from kwargs
    apply_func_kwargs = {'input_core_dims', 'output_core_dims', 'vectorize', 'join', 'dataset_join',
                         'keep_attrs', 'exclude_dims', 'dataset_fill_value', 'kwargs', 'dask',
                         'output_dtypes', 'output_sizes'}
    min_periods = kwargs.pop('min_periods', None)
    center = kwargs.pop('center', False)
    dim = xr.core.utils.either_dict_or_kwargs(kwargs.pop('dim', None),
                                              {k: v for k, v in kwargs.items()
                                               if k not in apply_func_kwargs},
                                              'apply_func_rolling')
    if len(dim) != 1:
        raise ValueError(""precisely one rolling dimension must be specified"")
    rolling_dim = list(dim.keys())[0]
    kwargs.pop(rolling_dim)
    temp_rolling_dim = '__temp__{}__'.format(rolling_dim)
    # change input_core_dims rolling_dim values to temp_rolling_dim
    input_core_dims = kwargs.get('input_core_dims', None)
    if input_core_dims:
        kwargs['input_core_dims'] = [[(temp_rolling_dim if (dim_ == rolling_dim) else dim_)
                                      for dim_ in dims_] for dims_ in input_core_dims]
    # change exclude_dims rolling_dim values to temp_rolling_dim
    exclude_dims = kwargs.get('exclude_dims', None)
    if exclude_dims:
        kwargs['exclude_dims'] = [[(temp_rolling_dim if (dim_ == rolling_dim) else dim_)
                                   for dim_ in dims_] for dims_ in exclude_dims]
    # call apply_ufunc() with rolling-constructed objects
    return xr.apply_ufunc(func,
                          *[(arg.rolling(dim=dim, min_periods=min_periods, center=center).
                             construct(temp_rolling_dim) if (rolling_dim in arg.dims) else arg)
                            for arg in args],
                          **kwargs)
apply_func_rolling(lambda a, b, w: ...,
                   variables, observations, weights,
                   date=N,
                   input_core_dims=[['date', 'dim1', 'dim2', 'var'],
                                    ['date', 'dim1', 'dim2'],
                                    ['date', 'dim1', 'dim2']],
                   output_core_dims=[['var']],
                   vectorize=True)
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904
https://github.com/pydata/xarray/issues/1666#issuecomment-438372589,https://api.github.com/repos/pydata/xarray/issues/1666,438372589,MDEyOklzc3VlQ29tbWVudDQzODM3MjU4OQ==,7441788,2018-11-13T17:56:17Z,2018-11-13T20:16:57Z,CONTRIBUTOR,"> construct method does not allocate that large array in memory. It uses the strided trick and therefore consumes only the order of 1000x1000x1000.
Ah. I didn't realize that. Good to know.
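For anyone else who was unsure what the strided trick buys you, here is a minimal NumPy sketch of the idea. It is not xarray's actual implementation: it assumes NumPy 1.20 or later for `sliding_window_view`, and unlike `construct` it does not pad the edges, so the window axis is shorter than the input.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(10.0)
# Each row is one length-4 window over `a`; no data is copied.
windows = sliding_window_view(a, window_shape=4)

print(windows.shape)                 # (7, 4)
print(np.shares_memory(a, windows))  # True: the windows alias the original buffer
print(windows.strides)               # (8, 8): both axes step one float64 at a time
```

The memory only blows up if a later operation forces a copy of the windowed view, e.g. anything that needs a contiguous array.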
What I'm actually looking to do is a rolling weighted regression. I have three DataArrays:
- observations, dims=('date', 'dim1', 'dim2')
- variables, dims=('date', 'dim1', 'dim2', 'var')
- weights, dims=('date', 'dim1', 'dim2')
I want to calculate a regression_coefficients DataArray with dims=('date', 'var'), where for each date it has the weighted regression coefficients calculated over the trailing N dates (over 'dim1' and 'dim2'). One way would be to put the three DataArrays in a Dataset, and then use a newly-defined `Dataset.rolling().apply()`. Another way would be to use an enhanced version of `apply_ufunc()` that can take `Rolling` objects. But now that I know that `DataArrayRolling.construct()` won't kill my machine, I'll try `apply_ufunc()` with the three `DataArrayRolling.construct()` objects. I'd welcome other suggestions.
OK, I seem to have got my problem working using:
```
apply_ufunc(lambda a, b, w: ...,
            variables.rolling(date=N).construct('temp_date'),
            observations.rolling(date=N).construct('temp_date'),
            weights.rolling(date=N).construct('temp_date'),
            input_core_dims=[['temp_date', 'dim1', 'dim2', 'var'],
                             ['temp_date', 'dim1', 'dim2'],
                             ['temp_date', 'dim1', 'dim2']],
            output_core_dims=[['var']],
            vectorize=True)
```
Still, I wonder if there isn't a more ""natural"" way of accomplishing this.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904
https://github.com/pydata/xarray/issues/1666#issuecomment-438351586,https://api.github.com/repos/pydata/xarray/issues/1666,438351586,MDEyOklzc3VlQ29tbWVudDQzODM1MTU4Ng==,7441788,2018-11-13T17:06:38Z,2018-11-13T17:06:38Z,CONTRIBUTOR,"I think there are actually a couple different ways `Rolling.apply` could work, but this seems like one possible way:
```
from xarray.core.utils import maybe_wrap_array
from xarray.core.combine import concat
def rolling_apply(rolling, func, *args, **kwargs):
    applied = [maybe_wrap_array(label, func(arr, *args, **kwargs)) for label, arr in rolling]
    combined = concat(applied, dim=rolling.obj.coords[rolling.dim])
    return combined
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904
https://github.com/pydata/xarray/issues/1666#issuecomment-438345386,https://api.github.com/repos/pydata/xarray/issues/1666,438345386,MDEyOklzc3VlQ29tbWVudDQzODM0NTM4Ng==,7441788,2018-11-13T16:50:08Z,2018-11-13T16:52:13Z,CONTRIBUTOR,The problem I have with `Rolling.construct` is the same that I have with `Rolling.reduce`: it's potentially very memory-inefficient. E.g. consider a 1000x1000x1000 array to which I want to apply a rolling window of length 500 along the final dimension; I believe `Rolling.construct/reduce` will construct a 1000x1000x1000x500 array. This can quickly get out of hand.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904
https://github.com/pydata/xarray/issues/1666#issuecomment-438344608,https://api.github.com/repos/pydata/xarray/issues/1666,438344608,MDEyOklzc3VlQ29tbWVudDQzODM0NDYwOA==,7441788,2018-11-13T16:48:09Z,2018-11-13T16:48:09Z,CONTRIBUTOR,"Separately, maybe `apply_ufunc` should accept `Rolling` objects?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904
https://github.com/pydata/xarray/issues/1666#issuecomment-438322013,https://api.github.com/repos/pydata/xarray/issues/1666,438322013,MDEyOklzc3VlQ29tbWVudDQzODMyMjAxMw==,7441788,2018-11-13T16:06:24Z,2018-11-13T16:06:24Z,CONTRIBUTOR,I think what is needed are `DataArrayRolling.apply` and `DatasetRolling.apply` (like `DataArrayGroupBy.apply` and `DatasetGroupBy.apply`). The problem with the `reduce` methods is that they are memory-inefficient.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,269297904
https://github.com/pydata/xarray/issues/1077#issuecomment-436015893,https://api.github.com/repos/pydata/xarray/issues/1077,436015893,MDEyOklzc3VlQ29tbWVudDQzNjAxNTg5Mw==,7441788,2018-11-05T20:03:48Z,2018-11-05T20:03:48Z,CONTRIBUTOR,"This code isn't particularly pretty, and I'm not sure if it handles all cases, but it enables serialization of MultiIndex indices by calling `ds.mi.encode_multiindices()` before serializing and `ds.mi.decode_multiindices()` after deserializing.
```
@xr.register_dataset_accessor('mi')
class MiscDatasetAccessor():
    def __init__(self, xarray_obj):
        self._obj = xarray_obj
    def encode_multiindices(self):
        result = self._obj
        for name, index in list(self._obj.indexes.items()):
            if isinstance(index, pd.MultiIndex):
                temp_name = '__' + name
                new_coords = {'{}__{}'.format(temp_name, level_name): level_values.rename(None)
                              for level_name, level_values in zip(index.names, index.levels)}
                new_coords[temp_name] = xr.DataArray(index.labels,
                                                     dims=('{}__names__'.format(temp_name),
                                                           '{}__num__'.format(temp_name)),
                                                     coords={'{}__names__'.format(temp_name): index.names,
                                                             '{}__num__'.format(temp_name): list(range(len(index)))},
                                                     attrs={'__is_multiindex': 1})
                result = result.drop(name).assign_coords(**new_coords)
        return result
    def decode_multiindices(self):
        result = self._obj
        for temp_name, da in list(self._obj.coords.items()):
            if temp_name.startswith('__') and da.attrs.get('__is_multiindex', False):
                name = temp_name[2:]
                level_names = da.coords['{}__names__'.format(temp_name)].values
                levels = [result.coords['{}__{}'.format(temp_name, level_name)].values for level_name in level_names]
                labels = da.values
                result = result.assign_coords(**{name: pd.MultiIndex(levels=levels, labels=labels, names=level_names)})
                result = result.drop(['{}__{}'.format(temp_name, level_name) for level_name in level_names] +
                                     list(da.dims) + [temp_name])
        return result
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187069161
https://github.com/pydata/xarray/issues/2170#issuecomment-407159487,https://api.github.com/repos/pydata/xarray/issues/2170,407159487,MDEyOklzc3VlQ29tbWVudDQwNzE1OTQ4Nw==,7441788,2018-07-23T18:39:52Z,2018-07-23T18:39:52Z,CONTRIBUTOR,"I second this request.
The following may not be optimal, but seems to work for me as a `keepdims=True` version of `reduce()`:
```
def dim_preserving_reduce(self, func, dim=None, axis=None, label=None, keep_attrs=False, **kwargs):
    if axis is not None:
        dim = np.take(self._obj.dims, axis, mode='wrap')
    dims = dim if isinstance(dim, (list, tuple)) else [dim]
    dims_coords = {dim: [lab] for dim, lab in zip(dims, (label if isinstance(label, list) else [label]))}
    return self._obj.reduce(func, dim=dims, keep_attrs=keep_attrs, **kwargs). \
        expand_dims(dims, axis=[self._obj.dims.index(dim) for dim in dims]). \
        assign_coords(**dims_coords)
```
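For comparison, this is the NumPy `keepdims=True` behaviour that the snippet above tries to emulate. It is a small sketch on a made-up array; the xarray version additionally attaches a coordinate label to each length-1 dimension.

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4)

# Without keepdims the reduced axis disappears...
dropped = a.sum(axis=1)
print(dropped.shape)  # (3,)

# ...whereas keepdims=True retains it with length 1,
kept = a.sum(axis=1, keepdims=True)
print(kept.shape)     # (3, 1)

# so the result broadcasts against the input without any reshaping.
normalized = a / kept
```

Being able to broadcast the reduced result straight back against the original is the main reason I want this.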
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,325436508
https://github.com/pydata/xarray/pull/2293#issuecomment-406639293,https://api.github.com/repos/pydata/xarray/issues/2293,406639293,MDEyOklzc3VlQ29tbWVudDQwNjYzOTI5Mw==,7441788,2018-07-20T15:40:23Z,2018-07-20T15:40:23Z,CONTRIBUTOR,I added a note to whats-new.rst.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,341664808
https://github.com/pydata/xarray/pull/2293#issuecomment-406485754,https://api.github.com/repos/pydata/xarray/issues/2293,406485754,MDEyOklzc3VlQ29tbWVudDQwNjQ4NTc1NA==,7441788,2018-07-20T04:27:51Z,2018-07-20T04:28:35Z,CONTRIBUTOR,"One slight oddity is that ``formatting.format_array_flat(np.arange(4), 0)`` returns ``0 ... 3`` even though ``0 1 2 3`` would take up the same number of characters. I'm not inclined to add a special case for this, but let me know if you think I should.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,341664808
https://github.com/pydata/xarray/pull/2293#issuecomment-406390564,https://api.github.com/repos/pydata/xarray/issues/2293,406390564,MDEyOklzc3VlQ29tbWVudDQwNjM5MDU2NA==,7441788,2018-07-19T19:38:24Z,2018-07-19T19:38:24Z,CONTRIBUTOR,"@shoyer, I think I've implemented all your suggestions. Let me know what you think. (I haven't yet updated whats-new.rst.)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,341664808
https://github.com/pydata/xarray/issues/1186#issuecomment-405700114,https://api.github.com/repos/pydata/xarray/issues/1186,405700114,MDEyOklzc3VlQ29tbWVudDQwNTcwMDExNA==,7441788,2018-07-17T19:28:49Z,2018-07-17T19:28:49Z,CONTRIBUTOR,I included sample output in https://github.com/pydata/xarray/pull/2293#issuecomment-405369643 and https://github.com/pydata/xarray/pull/2293/files#diff-f82411dbe6aa53e3b6a5d9c2b601094c.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,197709208
https://github.com/pydata/xarray/pull/2285#issuecomment-405369880,https://api.github.com/repos/pydata/xarray/issues/2285,405369880,MDEyOklzc3VlQ29tbWVudDQwNTM2OTg4MA==,7441788,2018-07-16T20:23:50Z,2018-07-16T20:23:50Z,CONTRIBUTOR,Replaced with https://github.com/pydata/xarray/pull/2293.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,341149017
https://github.com/pydata/xarray/pull/2293#issuecomment-405369643,https://api.github.com/repos/pydata/xarray/issues/2293,405369643,MDEyOklzc3VlQ29tbWVudDQwNTM2OTY0Mw==,7441788,2018-07-16T20:23:02Z,2018-07-16T20:23:02Z,CONTRIBUTOR,"Sample output:
```
(base) C:\Users\Seth\github\xarray>ipython
Python 3.6.5 | packaged by conda-forge | (default, Apr 6 2018, 16:13:55) [MSC v.1900 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 6.4.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: import xarray as xr
In [2]: words = ""This is the time for all good men to come to the aid of their country"".split(' ')
In [3]: for i in range(0, len(words) + 1):
...: print(""-------------------------------------------------------------------------------"")
...: print(xr.DataArray(words[:i], dims=('foo',), coords={'foo': words[:i]}))
...:
-------------------------------------------------------------------------------
array([], dtype=float64)
Coordinates:
* foo (foo) float64
-------------------------------------------------------------------------------
array(['This'], dtype='
array(['This', 'is'], dtype='
array(['This', 'is', 'the'], dtype='
array(['This', 'is', 'the', 'time'], dtype='
array(['This', 'is', 'the', 'time', 'for'], dtype='
array(['This', 'is', 'the', 'time', 'for', 'all'], dtype='
array(['This', 'is', 'the', 'time', 'for', 'all', 'good'], dtype='
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men'], dtype='
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to'],
dtype='
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come'],
dtype='
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come',
'to'], dtype='
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come',
'to', 'the'], dtype='
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come',
'to', 'the', 'aid'], dtype='
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come',
'to', 'the', 'aid', 'of'], dtype='
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come',
'to', 'the', 'aid', 'of', 'their'], dtype='
array(['This', 'is', 'the', 'time', 'for', 'all', 'good', 'men', 'to', 'come',
'to', 'the', 'aid', 'of', 'their', 'country'], dtype='