
issues


2 rows where repo = 13221727, type = "issue" and user = 10720577 sorted by updated_at descending

Issue #2710: xarray.DataArray.expand_dims() can only expand dimension for a point coordinate

  • id: 403326458 · node_id: MDU6SXNzdWU0MDMzMjY0NTg=
  • user: pletchm (10720577) · author_association: CONTRIBUTOR
  • state: closed (completed) · locked: no · comments: 14
  • created_at: 2019-01-25T20:46:05Z · updated_at: 2020-02-20T15:35:22Z · closed_at: 2020-02-20T15:35:22Z
  • repo: xarray (13221727) · type: issue

Current expand_dims functionality

Apparently, `expand_dims` can only create a dimension for a point coordinate, i.e. it promotes a scalar coordinate into a 1D coordinate. Here is an example:

```python
>>> import numpy as np
>>> import xarray as xr
>>> coords = {"b": range(5), "c": range(3)}
>>> da = xr.DataArray(np.ones([5, 3]), coords=coords, dims=list(coords.keys()))
>>> da
<xarray.DataArray (b: 5, c: 3)>
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
Coordinates:
  * b        (b) int64 0 1 2 3 4
  * c        (c) int64 0 1 2
>>> da["a"] = 0  # create a point coordinate
>>> da
<xarray.DataArray (b: 5, c: 3)>
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
Coordinates:
  * b        (b) int64 0 1 2 3 4
  * c        (c) int64 0 1 2
    a        int64 0
>>> da.expand_dims("a")  # create a new dimension "a" for the point coordinate
<xarray.DataArray (a: 1, b: 5, c: 3)>
array([[[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]]])
Coordinates:
  * b        (b) int64 0 1 2 3 4
  * c        (c) int64 0 1 2
  * a        (a) int64 0
```

Problem description

I want to be able to do two more things with `expand_dims`, or maybe a related/similar method:

1. broadcast the data across 1 or more new dimensions
2. expand an existing dimension to include 1 or more new coordinates

Here is the code I currently use to accomplish this:

```python
from collections import OrderedDict

import numpy as np
import xarray as xr


def expand_dimensions(data, fill_value=np.nan, **new_coords):
    """Expand (or add if it doesn't yet exist) the data array to fill in new
    coordinates across multiple dimensions.

    If a dimension doesn't exist in the dataarray yet, then the result will be
    `data`, broadcasted across this dimension.

    >>> da = xr.DataArray([1, 2, 3], dims="a", coords=[[0, 1, 2]])
    >>> expand_dimensions(da, b=[1, 2, 3, 4, 5])
    <xarray.DataArray (a: 3, b: 5)>
    array([[ 1.,  1.,  1.,  1.,  1.],
           [ 2.,  2.,  2.,  2.,  2.],
           [ 3.,  3.,  3.,  3.,  3.]])
    Coordinates:
      * a        (a) int64 0 1 2
      * b        (b) int64 1 2 3 4 5

    Or, if a keyword is already a dimension in `data`, then any coordinate
    values in `new_coords` that are not yet in `data` along that dimension
    will be added, and the values corresponding to those new coordinates will
    be `fill_value`.

    >>> da = xr.DataArray([1, 2, 3], dims="a", coords=[[0, 1, 2]])
    >>> expand_dimensions(da, a=[1, 2, 3, 4, 5])
    <xarray.DataArray (a: 6)>
    array([ 1.,  2.,  3., nan, nan, nan])
    Coordinates:
      * a        (a) int64 0 1 2 3 4 5

    Args:
        data (xarray.DataArray):
            Data that needs dimensions expanded.
        fill_value (scalar, xarray.DataArray, optional):
            If expanding new coords this is the value of the new datum.
            Defaults to `np.nan`.
        **new_coords (list[int | str]):
            The keywords are arbitrary dimensions and the values are
            coordinates of those dimensions that the data will include after
            it has been expanded.
    Returns:
        xarray.DataArray:
            Data that had its dimensions expanded to include the new
            coordinates.
    """
    ordered_coord_dict = OrderedDict(new_coords)
    # Build an all-zeros array whose dims/coords are exactly the requested
    # ones, then broadcast `data` against it; labels missing from `data`
    # come out as NaN and are filled with `fill_value`.
    shape_da = xr.DataArray(
        np.zeros(list(map(len, ordered_coord_dict.values()))),
        coords=ordered_coord_dict,
        dims=ordered_coord_dict.keys())
    expanded_data = xr.broadcast(data, shape_da)[0].fillna(fill_value)
    return expanded_data
```

Here's an example of broadcasting data across a new dimension:

coords = {"b": range(5), "c": range(3)} da = xr.DataArray(np.ones([5, 3]), coords=coords, dims=list(coords.keys())) expand_dimensions(da, a=[0, 1, 2]) <xarray.DataArray (b: 5, c: 3, a: 3)> array([[[1., 1., 1.], [1., 1., 1.], [1., 1., 1.]],

   [[1., 1., 1.],
    [1., 1., 1.],
    [1., 1., 1.]],

   [[1., 1., 1.],
    [1., 1., 1.],
    [1., 1., 1.]],

   [[1., 1., 1.],
    [1., 1., 1.],
    [1., 1., 1.]],

   [[1., 1., 1.],
    [1., 1., 1.],
    [1., 1., 1.]]])

Coordinates: * b (b) int64 0 1 2 3 4 * c (c) int64 0 1 2 * a (a) int64 0 1 2 Here's an example of expanding an existing dimension to include new coordinates:

```python
>>> expand_dimensions(da, b=[5, 6])
<xarray.DataArray (b: 7, c: 3)>
array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [nan, nan, nan],
       [nan, nan, nan]])
Coordinates:
  * b        (b) int64 0 1 2 3 4 5 6
  * c        (c) int64 0 1 2
```
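For comparison, the second case can also be expressed with `reindex`, which is existing xarray API and inserts NaN for coordinate labels not already present (a minimal sketch using the `da` from above; note it only covers existing dimensions, not broadcasting across new ones):

```python
# Expand the existing "b" dimension from labels 0..4 to 0..6; the two new
# rows are filled with NaN by default. reindex does not handle case 1,
# i.e. broadcasting across a brand-new dimension.
da.reindex(b=list(range(7)))
```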

Final Note

If no one else is already working on this, and if it seems like a useful addition to xarray, then I would be more than happy to work on this. Please let me know.

Thank you, Martin

Issue #2713: xarray.DataArray.mean() can't calculate weighted mean

  • id: 403367810 · node_id: MDU6SXNzdWU0MDMzNjc4MTA=
  • user: pletchm (10720577) · author_association: CONTRIBUTOR
  • state: closed (completed) · locked: no · comments: 2
  • created_at: 2019-01-25T23:08:01Z · updated_at: 2019-01-26T02:50:07Z · closed_at: 2019-01-26T02:49:53Z
  • repo: xarray (13221727) · type: issue

Code Sample, a copy-pastable example if possible

Currently xarray.DataArray.mean() and xarray.Dataset.mean() cannot calculate weighted means. I think it would be useful if they had an API similar to numpy.average: https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.average.html
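For reference, this is the numpy.average behavior being pointed to (a small illustration with made-up values):

```python
import numpy as np

data = np.array([[1.0, 2.0],
                 [3.0, 4.0]])
weights = np.array([0.25, 0.75])

# Weighted mean along axis 1: sum(w * x) / sum(w) for each row
np.average(data, axis=1, weights=weights)  # -> array([1.75, 3.75])
```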

Here is the code I currently use to get the weighted mean of an xarray.DataArray:

```python
import numpy as np
import xarray as xr


def weighted_mean(data_da, dim, weights):
    r"""Computes the weighted mean.

    We can only do the actual weighted mean over the dimensions that
    ``data_da`` and ``weights`` share, so for dimensions in ``dim`` that
    aren't included in ``weights`` we must take the unweighted mean.

    This function skips NaNs, i.e. data points that are NaN have
    corresponding NaN weights.

    Args:
        data_da (xarray.DataArray):
            Data to compute a weighted mean for.
        dim (str | list[str]):
            Dimension(s) of the dataarray to reduce over.
        weights (xarray.DataArray):
            A 1-D dataarray the same length as the weighted dim, with
            dimension name equal to that of the weighted dim. Must be
            nonnegative.
    Returns:
        (xarray.DataArray):
            The mean over the given dimension. So it will contain all
            dimensions of the input that are not in ``dim``.
    Raises:
        (IndexError):
            If ``weights.dims`` is not a subset of ``dim``.
        (ValueError):
            If ``weights`` has values that are negative or infinite.
    """
    if isinstance(dim, str):
        dim = [dim]
    else:
        dim = list(dim)

    if not set(weights.dims) <= set(dim):
        dim_err_msg = (
            "`weights.dims` must be a subset of `dim`. {} are dimensions in "
            "`weights`, but not in `dim`."
        ).format(set(weights.dims) - set(dim))
        raise IndexError(dim_err_msg)

    # xr.ufuncs was the standard way to apply NumPy ufuncs to xarray
    # objects when this was written.
    if (weights < 0).any() or xr.ufuncs.isinf(weights).any():
        raise ValueError("Weight must be nonnegative and finite")

    weight_dims = [
        weight_dim for weight_dim in dim if weight_dim in weights.dims
    ]

    # Give NaN data points NaN weights so they drop out of both the
    # weighted sum and the normalizing sum of weights.
    if np.isnan(data_da).any():
        expanded_weights, _ = xr.broadcast(weights, data_da)
        weights_with_nans = expanded_weights.where(~np.isnan(data_da))
    else:
        weights_with_nans = weights

    # Weighted mean over the shared dims, then an unweighted mean over
    # whatever is left in `dim`.
    mean_da = ((data_da * weights_with_nans).sum(weight_dims, skipna=True)
               / weights_with_nans.sum(weight_dims))
    other_dims = list(set(dim) - set(weight_dims))
    return mean_da.mean(other_dims, skipna=True)
```
If no one is already working on this and if it seems useful, then I would be happy to work on this.

Thank you, Martin


Table schema
CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);
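The filtered view at the top of the page ("repo = 13221727, type = "issue" and user = 10720577 sorted by updated_at descending") corresponds to a simple query against this schema. Here is a sketch using Python's sqlite3 module; the database filename is hypothetical, since the page doesn't name the file:

```python
import sqlite3

conn = sqlite3.connect("github.db")  # hypothetical filename
rows = conn.execute(
    "SELECT number, title, state, updated_at FROM issues "
    "WHERE repo = ? AND type = 'issue' AND [user] = ? "
    "ORDER BY updated_at DESC",
    (13221727, 10720577),
).fetchall()
# -> [(2710, 'xarray.DataArray.expand_dims() ...', 'closed', '2020-02-20T15:35:22Z'),
#     (2713, 'xarray.DataArray.mean() ...', 'closed', '2019-01-26T02:50:07Z')]
```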