html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/1192#issuecomment-433510805,https://api.github.com/repos/pydata/xarray/issues/1192,433510805,MDEyOklzc3VlQ29tbWVudDQzMzUxMDgwNQ==,14314623,2018-10-26T18:59:07Z,2018-10-26T18:59:07Z,CONTRIBUTOR,"I should add that I would be happy to work on an implementation, but probably need a good amount of pointers. Here is the implementation that I have been using (only works with dask.arrays at this point). Should have posted that earlier to avoid @rabernat s zingers over here. ```python def aggregate(da, blocks, func=np.nanmean, debug=False): """""" Performs efficient block averaging in one or multiple dimensions. Only works on regular grid dimensions. Parameters ---------- da : xarray DataArray (must be a dask array!) blocks : list List of tuples containing the dimension and interval to aggregate over func : function Aggregation function.Defaults to numpy.nanmean Returns ------- da_agg : xarray Data Aggregated array Examples -------- >>> from xarrayutils import aggregate >>> import numpy as np >>> import xarray as xr >>> import matplotlib.pyplot as plt >>> %matplotlib inline >>> import dask.array as da >>> x = np.arange(-10,10) >>> y = np.arange(-10,10) >>> xx,yy = np.meshgrid(x,y) >>> z = xx**2-yy**2 >>> a = xr.DataArray(da.from_array(z, chunks=(20, 20)), coords={'x':x,'y':y}, dims=['y','x']) >>> print a dask.array Coordinates: * y (y) int64 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 * x (x) int64 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 >>> blocks = [('x',2),('y',5)] >>> a_coarse = aggregate(a,blocks,func=np.mean) >>> print a_coarse dask.array Coordinates: * y (y) int64 -10 0 * x (x) int64 -10 -8 -6 -4 -2 0 2 4 6 8 Attributes: Coarsened with: Coarsenblocks: [('x', 2), ('y', 10)] """""" # Check if the input is a dask array (I might want to convert this # automaticlaly in the future) if not isinstance(da.data, Array): raise RuntimeError('data array data must be a dask array') # Check data type of blocks # TODO write test if (not all(isinstance(n[0], str) for n in blocks) or not all(isinstance(n[1], int) for n in blocks)): print('blocks input', str(blocks)) raise RuntimeError(""block dimension must be dtype(str), \ e.g. ('lon',4)"") # Check if the given array has the dimension specified in blocks try: block_dict = dict((da.get_axis_num(x), y) for x, y in blocks) except ValueError: raise RuntimeError(""'blocks' contains non matching dimension"") # Check the size of the excess in each aggregated axis blocks = [(a[0], a[1], da.shape[da.get_axis_num(a[0])] % a[1]) for a in blocks] # for now default to trimming the excess da_coarse = coarsen(func, da.data, block_dict, trim_excess=True) # for now default to only the dims new_coords = dict([]) # for cc in da.coords.keys(): warnings.warn(""WARNING: only dimensions are carried over as coordinates"") for cc in list(da.dims): new_coords[cc] = da.coords[cc] for dd in blocks: if dd[0] in list(da.coords[cc].dims): new_coords[cc] = \ new_coords[cc].isel( **{dd[0]: slice(0, -(1 + dd[2]), dd[1])}) attrs = {'Coarsened with': str(func), 'Coarsenblocks': str(blocks)} da_coarse = xr.DataArray(da_coarse, dims=da.dims, coords=new_coords, name=da.name, attrs=attrs) return da_coarse ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,198742089 https://github.com/pydata/xarray/issues/1192#issuecomment-433160023,https://api.github.com/repos/pydata/xarray/issues/1192,433160023,MDEyOklzc3VlQ29tbWVudDQzMzE2MDAyMw==,14314623,2018-10-25T18:35:57Z,2018-10-25T18:35:57Z,CONTRIBUTOR,"Is this feature still being considered? A big +1 from me. I wrote my own function to achieve this (using dask.array.coarsen), but I was planning to implement a similar functionality in [xgcm](https://github.com/xgcm/xgcm/issues/103), and it would be ideal if we could use an upstream implementation from xarray. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,198742089 https://github.com/pydata/xarray/issues/1192#issuecomment-305176003,https://api.github.com/repos/pydata/xarray/issues/1192,305176003,MDEyOklzc3VlQ29tbWVudDMwNTE3NjAwMw==,3217406,2017-05-31T12:45:18Z,2017-05-31T12:45:18Z,CONTRIBUTOR,"The reason I ask is that, ideally, ``coarsen`` would work exactly the same with ``dask.array`` and ``np.ndarray`` data. By using both serial and parallel coarsen methods from ``dask``, we are adding a dependency but we are ensuring forward compatibility. @shoyer, what's your preference? (1) replicate serial coarsen into ``xarray`` or (2) point to ``dask`` coarsen methods?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,198742089 https://github.com/pydata/xarray/issues/1192#issuecomment-305169201,https://api.github.com/repos/pydata/xarray/issues/1192,305169201,MDEyOklzc3VlQ29tbWVudDMwNTE2OTIwMQ==,3217406,2017-05-31T12:00:11Z,2017-05-31T12:00:11Z,CONTRIBUTOR,If it's part of ``dask`` then it would be almost trivial to implement in ``xarray``. @mrocklin Can we assume that ``dask/array/chunk.py::coarsen`` is part of the public API?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,198742089 https://github.com/pydata/xarray/issues/1192#issuecomment-270439515,https://api.github.com/repos/pydata/xarray/issues/1192,270439515,MDEyOklzc3VlQ29tbWVudDI3MDQzOTUxNQ==,3217406,2017-01-04T17:59:08Z,2017-01-04T17:59:08Z,CONTRIBUTOR,"The ``dask`` implementation has the following API: ``dask.array.coarsen(reduction, x, axes, trim_excess=False)`` so a proposed ``xarray`` API could look like: ``xarray.coarsen(reduction, x, axes, chunks=None, trim_excess=False)``, resulting in the following implementation: 1. If the underlying data to ``x`` is ``dask.array``, yields x.chunks(chunks).array.coarsen(reduction, axes, trim_excess) 2. Else, copy the ``block_reduce`` function. Does that fit with the ``xarray`` API?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,198742089