
issues: 403367810


id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
403367810 MDU6SXNzdWU0MDMzNjc4MTA= 2713 xarray.DataArray.mean() can't calculate weighted mean 10720577 closed 0     2 2019-01-25T23:08:01Z 2019-01-26T02:50:07Z 2019-01-26T02:49:53Z CONTRIBUTOR      

Code Sample, a copy-pastable example if possible

Currently xarray.DataArray.mean() and xarray.Dataset.mean() cannot calculate weighted means. I think it would be useful if they had an API similar to numpy.average: https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.average.html
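For reference, here is a minimal sketch of the numpy.average call I have in mind as the model for the API. The array values and variable names are made up for illustration; only `np.average` and its `axis`, `weights`, and `returned` parameters come from NumPy itself.

```python
import numpy as np

# Illustrative data: 2 rows, 3 columns.
values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# One weight per column; the length must match the reduced axis.
column_weights = np.array([1.0, 1.0, 2.0])

# Weighted mean along axis 1: sum(values * weights) / sum(weights) per row.
row_means = np.average(values, axis=1, weights=column_weights)

# With returned=True, numpy.average also returns the sum of the weights.
row_means_again, weight_sums = np.average(
    values, axis=1, weights=column_weights, returned=True
)
```

Note that numpy.average does not skip NaNs, which is one reason the helper below handles them explicitly.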

Here is the code I currently use to get the weighted mean of an xarray.DataArray.

```python
import numpy as np
import xarray as xr


def weighted_mean(data_da, dim, weights):
    r"""Computes the weighted mean.

    We can only do the actual weighted mean over the dimensions that
    ``data_da`` and ``weights`` share, so for dimensions in ``dim`` that aren't
    included in ``weights`` we must take the unweighted mean.

    This function skips NaNs, i.e. data points that are NaN are given NaN
    weights, so they drop out of both the numerator and the denominator.

    Args:
        data_da (xarray.DataArray):
            Data to compute a weighted mean for.
        dim (str | list[str]):
            dimension(s) of the dataarray to reduce over
        weights (xarray.DataArray):
            a 1-D dataarray the same length as the weighted dim, with dimension
            name equal to that of the weighted dim. Must be nonnegative.
    Returns:
        (xarray.DataArray):
            The mean over the given dimension. So it will contain all
            dimensions of the input that are not in ``dim``.
    Raises:
        (IndexError):
            If ``weights.dims`` is not a subset of ``dim``.
        (ValueError):
            If ``weights`` has values that are negative or infinite.
    """
    # Normalize `dim` to a list of dimension names.
    if isinstance(dim, str):
        dim = [dim]
    else:
        dim = list(dim)

    if not set(weights.dims) <= set(dim):
        dim_err_msg = (
            "`weights.dims` must be a subset of `dim`. {} are dimensions in "
            "`weights`, but not in `dim`."
        ).format(set(weights.dims) - set(dim))
        raise IndexError(dim_err_msg)
    else:
        pass  # `weights.dims` is a subset of `dim`

    # `np.isinf` works directly on a DataArray and returns a DataArray.
    if (weights < 0).any() or np.isinf(weights).any():
        negative_weight_err_msg = "Weight must be nonnegative and finite"
        raise ValueError(negative_weight_err_msg)
    else:
        pass  # `weights` are nonnegative and finite

    # Dimensions that are reduced with weights (shared by `dim` and `weights`).
    weight_dims = [
        weight_dim for weight_dim in dim if weight_dim in weights.dims
    ]

    # Mask the weights wherever the data is NaN so those points are skipped.
    if np.isnan(data_da).any():
        expanded_weights, _ = xr.broadcast(weights, data_da)
        weights_with_nans = expanded_weights.where(~np.isnan(data_da))
    else:
        weights_with_nans = weights

    mean_da = ((data_da * weights_with_nans).sum(weight_dims, skipna=True)
               / weights_with_nans.sum(weight_dims))
    # Remaining dimensions in `dim` get a plain, unweighted mean.
    other_dims = list(set(dim) - set(weight_dims))
    return mean_da.mean(other_dims, skipna=True)
```

If no one is already working on this and if it seems useful, then I would be happy to work on this.
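To make the NaN handling in the function above concrete, here is the same arithmetic for a single 1-D case. This is a NumPy-only sketch with made-up values and hypothetical variable names, not a proposed implementation: weights at NaN positions are masked out, so they contribute to neither the weighted sum nor the total weight.

```python
import numpy as np

# Illustrative values: the middle data point is missing.
data = np.array([1.0, np.nan, 3.0])
weights = np.array([1.0, 2.0, 3.0])

# Set the weight to NaN wherever the data is NaN.
valid = ~np.isnan(data)
masked_weights = np.where(valid, weights, np.nan)

# NaN-skipping sums: the missing point drops out of both terms.
weighted_sum = np.nansum(data * masked_weights)   # 1*1 + 3*3
weight_total = np.nansum(masked_weights)          # 1 + 3
nan_aware_mean = weighted_sum / weight_total
```

Without the masking step the denominator would still include the weight of the missing point, which would bias the mean towards zero.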

Thank you, Martin

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2713/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue
