html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2525#issuecomment-447545224,https://api.github.com/repos/pydata/xarray/issues/2525,447545224,MDEyOklzc3VlQ29tbWVudDQ0NzU0NTIyNA==,6815844,2018-12-15T07:28:13Z,2018-12-15T07:28:13Z,MEMBER,"Thinking about its API.
I like a `rolling`-like API. One idea in my mind is
```python
ds.coarsen(x=2, y=2, side='left', trim_excess=True).mean()
```
To apply a customized callable other than `np.mean` to a particular coordinate, it would probably be
```python
ds.coarsen(x=2, y=2, side='left', trim_excess=True).mean(coordinate_apply={'surface_area': np.sum})
```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-439766587,https://api.github.com/repos/pydata/xarray/issues/2525,439766587,MDEyOklzc3VlQ29tbWVudDQzOTc2NjU4Nw==,1197350,2018-11-19T04:13:37Z,2018-11-19T04:13:37Z,MEMBER,"> What would the coordinates look like?
>
> 1. apply `func` also for coordinate
> 2. always apply `mean` to coordinate

If I think about my applications, I would probably always want to apply `mean` to dimension coordinates, but would like to be able to choose for non-dimension coordinates.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-435272976,https://api.github.com/repos/pydata/xarray/issues/2525,435272976,MDEyOklzc3VlQ29tbWVudDQzNTI3Mjk3Ng==,2448579,2018-11-02T05:11:36Z,2018-11-02T05:11:36Z,MEMBER,"I like `coarsen` because it's a verb like resample, groupby.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-435268965,https://api.github.com/repos/pydata/xarray/issues/2525,435268965,MDEyOklzc3VlQ29tbWVudDQzNTI2ODk2NQ==,6815844,2018-11-02T04:37:35Z,2018-11-02T04:37:35Z,MEMBER,"+1 for `block`
What would the coordinates look like?
1. apply `func` also for coordinate
2. always apply `mean` to coordinate
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-435213658,https://api.github.com/repos/pydata/xarray/issues/2525,435213658,MDEyOklzc3VlQ29tbWVudDQzNTIxMzY1OA==,1217238,2018-11-01T22:51:55Z,2018-11-01T22:51:55Z,MEMBER,"skimage implements `block_reduce` via the `view_as_blocks` utility function: https://github.com/scikit-image/scikit-image/blob/62e29cd89dc858d8fb9d3578034a2f456f298ed3/skimage/util/shape.py#L9-L103
But given that it doesn't actually duplicate any elements and needs a C-order array to work, I think it's actually just equivalent to using `reshape` + `transpose`, e.g., `B = A.reshape(4, 1, 2, 2, 3, 2).transpose([0, 2, 4, 1, 3, 5])` reproduces `skimage.util.view_as_blocks(A, (1, 2, 2))` from the docstring example.
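To make that equivalence concrete, here is a small NumPy check (the `(4, 4, 6)` array shape mirrors the `view_as_blocks` docstring example; the explicit-slice comparison is my own sketch):
```python
import numpy as np

# Split a (4, 4, 6) C-order array into (1, 2, 2) blocks via reshape + transpose.
A = np.arange(4 * 4 * 6).reshape(4, 4, 6)
B = A.reshape(4, 1, 2, 2, 3, 2).transpose([0, 2, 4, 1, 3, 5])

# Block indices come first, then positions within each block:
# B[i, j, k] is the (1, 2, 2) block starting at A[i, 2*j, 2*k].
assert B.shape == (4, 2, 3, 1, 2, 2)
assert np.array_equal(B[1, 0, 2], A[1:2, 0:2, 4:6])
```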
So the super-simple version of block-reduce looks like:
```python
import numpy as np

def block_reduce(image, block_size, func=np.sum):
    # TODO: input validation
    # TODO: consider copying padding from skimage
    blocked_shape = []
    for existing_size, size in zip(image.shape, block_size):
        blocked_shape.extend([existing_size // size, size])
    blocked = np.reshape(image, tuple(blocked_shape))
    return func(blocked, axis=tuple(range(1, blocked.ndim, 2)))
```
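As a quick sanity check (repeating the function so the snippet runs standalone; the 4x4 example array is my own):
```python
import numpy as np

def block_reduce(image, block_size, func=np.sum):
    # Same reshape-based reduction as the sketch above, repeated
    # so this snippet runs on its own.
    blocked_shape = []
    for existing_size, size in zip(image.shape, block_size):
        blocked_shape.extend([existing_size // size, size])
    blocked = np.reshape(image, tuple(blocked_shape))
    return func(blocked, axis=tuple(range(1, blocked.ndim, 2)))

x = np.arange(16.).reshape(4, 4)
print(block_reduce(x, (2, 2), func=np.mean))
# Each 2x2 block collapses to its mean:
# [[ 2.5  4.5]
#  [10.5 12.5]]
```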
This would work on dask arrays out of the box but it's probably worth benchmarking whether you'd get better performance doing the operation chunk-wise (e.g., with `map_blocks`).","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-435192382,https://api.github.com/repos/pydata/xarray/issues/2525,435192382,MDEyOklzc3VlQ29tbWVudDQzNTE5MjM4Mg==,1217238,2018-11-01T21:24:15Z,2018-11-01T21:24:15Z,MEMBER,"OK, so maybe `da.block({'lat': 2, 'lon': 2}).mean()` would be a good way to spell this, if that's not too confusing with `.chunk()`? Other possible method names: `groupby_block`, `blocked`?
We could call this something like `coarsen()` or `block_reduce()` with a `how='mean'` or maybe `func=mean` argument, but I like the consistency with resample/rolling/groupby.
We can save the full coordinate-based version for a later addition to `.resample()`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-434705757,https://api.github.com/repos/pydata/xarray/issues/2525,434705757,MDEyOklzc3VlQ29tbWVudDQzNDcwNTc1Nw==,1217238,2018-10-31T14:22:07Z,2018-10-31T14:22:07Z,MEMBER,"block_reduce from skimage is indeed a small function using strides/reshape,
if I remember correctly. We should certainly copy or implement it ourselves
rather than adding a scikit-image dependency.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-434589377,https://api.github.com/repos/pydata/xarray/issues/2525,434589377,MDEyOklzc3VlQ29tbWVudDQzNDU4OTM3Nw==,6815844,2018-10-31T07:36:41Z,2018-10-31T07:36:41Z,MEMBER,"`block_reduce` sounds nice, but I am a little hesitant to add a soft dependency on scikit-image only for this function...
It is using the stride trick, as we are doing in `rolling.construct`. Maybe we can implement it ourselves.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-434480457,https://api.github.com/repos/pydata/xarray/issues/2525,434480457,MDEyOklzc3VlQ29tbWVudDQzNDQ4MDQ1Nw==,1197350,2018-10-30T21:41:17Z,2018-10-30T21:41:25Z,MEMBER,"> I would lean towards a coordinate based representation since it's a little more usable/certain to be correct.
I feel that this could become too complex in the case of irregularly spaced coordinates. I slightly favor the index-based approach (as in my function above), which one calls like
```python
aggregate_da(da, {'lat': 2, 'lon': 2})
```
If we do that, we can just use scikit-image's `block_reduce` function, which is vectorized and works great with `apply_ufunc`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-434477550,https://api.github.com/repos/pydata/xarray/issues/2525,434477550,MDEyOklzc3VlQ29tbWVudDQzNDQ3NzU1MA==,1217238,2018-10-30T21:31:18Z,2018-10-30T21:31:18Z,MEMBER,"I'm +1 for adding this feature in some form as well.
From an API perspective, should the window size be specified in terms of integers or coordinates?
- `rolling` is integer based
- `resample` is coordinate based
I would lean towards a coordinate based representation since it's a little more usable/certain to be correct. It might even make sense to still call this `resample`, though obviously the time options would no longer apply. Also, we would almost certainly want a faster underlying implementation than what we currently use for `resample()`.
The API for resampling to a 2x2 degree latitude/longitude grid could look something like: `da.resample(lat=2, lon=2).mean()`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-434294356,https://api.github.com/repos/pydata/xarray/issues/2525,434294356,MDEyOklzc3VlQ29tbWVudDQzNDI5NDM1Ng==,1197350,2018-10-30T13:10:16Z,2018-10-30T13:10:39Z,MEMBER,"FYI, I do this often in my work with this sort of function:
```python
import numpy as np
import xarray as xr
from skimage.measure import block_reduce

def aggregate_da(da, agg_dims, suf='_agg'):
    input_core_dims = list(agg_dims)
    n_agg = len(input_core_dims)
    core_block_size = tuple([agg_dims[k] for k in input_core_dims])
    block_size = (da.ndim - n_agg) * (1,) + core_block_size
    output_core_dims = [dim + suf for dim in input_core_dims]
    output_sizes = {(dim + suf): da.shape[da.get_axis_num(dim)] // agg_dims[dim]
                    for dim in input_core_dims}
    output_dtypes = da.dtype
    da_out = xr.apply_ufunc(block_reduce, da, kwargs={'block_size': block_size},
                            input_core_dims=[input_core_dims],
                            output_core_dims=[output_core_dims],
                            output_sizes=output_sizes,
                            output_dtypes=[output_dtypes],
                            dask='parallelized')
    for dim in input_core_dims:
        new_coord = block_reduce(da[dim].data, (agg_dims[dim],), func=np.mean)
        da_out.coords[dim + suf] = (dim + suf, new_coord)
    return da_out
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-434294114,https://api.github.com/repos/pydata/xarray/issues/2525,434294114,MDEyOklzc3VlQ29tbWVudDQzNDI5NDExNA==,1197350,2018-10-30T13:09:25Z,2018-10-30T13:09:25Z,MEMBER,"This is being discussed in #1192 under a different name.
Yes, we need this feature.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758
https://github.com/pydata/xarray/issues/2525#issuecomment-434261896,https://api.github.com/repos/pydata/xarray/issues/2525,434261896,MDEyOklzc3VlQ29tbWVudDQzNDI2MTg5Ng==,6815844,2018-10-30T11:17:17Z,2018-10-30T11:17:17Z,MEMBER,"This is from a [thread at SO](https://stackoverflow.com/questions/52886703/xarray-multidimensional-binning-array-reduction-on-sample-dataset-of-4-x4-to/52981916?noredirect=1#comment93001872_52981916).
Does anyone have an opinion on adding a `bin` (or `rolling_bin`) method to compute the binning?
For the above example, currently we need to do
```python
dsa.rolling(x=2).construct('tmp').isel(x=slice(1, None, 2)).mean('tmp')
```
which is a little complex.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,375126758