issues: 134921284

id: 134921284
node_id: MDU6SXNzdWUxMzQ5MjEyODQ=
number: 770
title: Internal refactor: create a generic function for applying ufuncs-like functions to xarray objects
user: 1217238
state: closed
locked: 0
comments: 4
created_at: 2016-02-19T17:18:53Z
updated_at: 2017-10-20T16:44:47Z
closed_at: 2017-10-20T16:44:47Z
author_association: MEMBER
(empty fields: assignee, milestone, active_lock_reason, draft, pull_request, performed_via_github_app)

It would be awesome to have a generic function for making functions that act like NumPy's generalized universal functions "xarray aware".

What would xarray.apply_ufunc(func, objs, join='inner', agg_dims=None, drop_dims=None, kwargs=None) do?

1. If one or more of the provided objects are Dataset or GroupBy instances, dispatch to specialized loops that call the remainder of apply_ufunc repeatedly.
2. Align all objects along shared labels using the indicated join (for some operations, e.g., where, a left join is appropriate rather than an inner join).
3. Broadcast all objects against each other to expand dimensionality along all dimensions except (optionally) those listed in agg_dims/drop_dims. Dimensions listed in drop_dims should be moved to the end, for consistency with gufunc signatures.
4. Transform agg_dims (if provided) into an axis argument using get_axis_num and insert it into kwargs.
5. Apply func to the data attribute of each array to calculate the result using the provided kwargs. The result is expected to have all the same dimensions as the provided arrays, except any listed in the agg_dims and drop_dims arguments.
6. Merge all coordinate data together (i.e., with an n-ary version of the Coordinate.merge method) and add these to the result array.
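Steps 3 to 5 can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the proposed implementation: arrays are represented as hypothetical `(dims, data)` pairs rather than xarray objects, the names `broadcast_by_name` and `apply_named` are made up for this sketch, and alignment, coordinates and `drop_dims` are left out.

```python
import numpy as np


def broadcast_by_name(objs):
    """Broadcast (dims, data) pairs against each other by dimension name."""
    # Union of all dimension names, in first-seen order.
    all_dims = []
    for dims, _ in objs:
        for d in dims:
            if d not in all_dims:
                all_dims.append(d)

    arrays = []
    for dims, data in objs:
        dims = list(dims)
        data = np.asarray(data)
        # Reorder the existing axes to follow all_dims ...
        order = [dims.index(d) for d in all_dims if d in dims]
        data = data.transpose(order)
        # ... then insert a length-1 axis for every missing dimension.
        index = tuple(slice(None) if d in dims else np.newaxis for d in all_dims)
        arrays.append(data[index])
    return all_dims, np.broadcast_arrays(*arrays)


def apply_named(func, objs, agg_dims=None, kwargs=None):
    """Hypothetical stand-in for steps 3-5 of the proposed apply_ufunc."""
    kwargs = dict(kwargs or {})
    if isinstance(agg_dims, str):
        agg_dims = [agg_dims]
    agg_dims = list(agg_dims or [])

    all_dims, arrays = broadcast_by_name(objs)
    if agg_dims:
        # Step 4: translate dimension names into positional axes.
        kwargs["axis"] = tuple(all_dims.index(d) for d in agg_dims)
    # Step 5: call func on the raw arrays.
    result = func(*arrays, **kwargs)
    out_dims = tuple(d for d in all_dims if d not in agg_dims)
    return out_dims, result


def weighted_mean(a, w, axis):
    return (a * w).sum(axis=axis) / w.sum(axis=axis)


values = (("x", "y"), np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]))
weights = (("y",), np.array([1.0, 1.0, 2.0]))
dims, out = apply_named(weighted_mean, [values, weights], agg_dims="y")
```

Here `weights` only has a `y` dimension, so it is expanded along `x` before `weighted_mean` sees it, and the result keeps only the `x` dimension.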

If any of the arguments are not xarray objects (e.g., they're NumPy or dask arrays), they should be skipped in operations that don't apply to them. xarray.Variable objects, for example, don't align or have coordinates.
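A rough sketch of that skipping behaviour for the alignment step, assuming a made-up `Labeled` class as a stand-in for a 1-D labeled array (it is not an xarray type) and a hypothetical `align` helper that supports only the inner and left joins mentioned above:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Labeled:
    """Hypothetical 1-D labeled array, used only for this sketch."""
    labels: list
    values: np.ndarray


def align(objs, join="inner"):
    """Align Labeled objects on their labels; pass anything else through."""
    labeled = [o for o in objs if isinstance(o, Labeled)]
    if not labeled:
        return list(objs)

    if join == "inner":
        # Keep only labels present in every labeled object.
        keep = [l for l in labeled[0].labels
                if all(l in o.labels for o in labeled[1:])]
    elif join == "left":
        # Keep the labels of the first labeled object.
        keep = list(labeled[0].labels)
    else:
        raise ValueError(f"unsupported join: {join!r}")

    out = []
    for o in objs:
        if not isinstance(o, Labeled):
            # Plain arrays have no labels, so alignment does not apply.
            out.append(o)
            continue
        lookup = dict(zip(o.labels, o.values))
        # Labels missing from this object are filled with NaN.
        values = np.array([lookup.get(l, np.nan) for l in keep], dtype=float)
        out.append(Labeled(keep, values))
    return out


a = Labeled(["a", "b", "c"], np.array([1.0, 2.0, 3.0]))
b = Labeled(["b", "c", "d"], np.array([10.0, 20.0, 30.0]))
raw = np.array([5.0, 6.0])
inner_a, inner_b, inner_raw = align([a, b, raw], join="inner")
left_a, left_b, _ = align([a, b, raw], join="left")
```

With the left join, `left_b` gains a NaN for label `a`, which is the behaviour an operation like where would want; `raw` comes back untouched in both cases.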

A concrete example of similar functionality in dask.array is atop. The closest things we currently have in xarray are the _unary_op and _binary_op staticmethods (e.g., on DataArray), but these only handle one or two arguments, don't handle aggregated dimensions and, most importantly, are difficult to apply to new operations.

Here are a few concrete examples of how this could work:

``` python
def average(array, weights, dim=None):
    # still needs a bit of work to make a NaN- and dask.array-safe
    # version of np.average
    return apply_ufunc(np.average, [array, weights], agg_dims=dim)


def where(cond, first, second=None):
    if second is None:
        # need to write where2, a function that looks at first.dtype
        # to infer the appropriate NA sentinel value
        return apply_ufunc(ops.where2, [cond, first])
    else:
        return apply_ufunc(ops.where, [cond, first, second])


def dot(self, other, dim=None):
    if dim is None:
        # contract over the dimensions shared by both arguments
        dim = set(self.dims) & set(other.dims)
    return apply_ufunc(ops.tensordot, [self, other], agg_dims=dim)
```
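The where2 helper mentioned in the comments is not specified beyond its name. One possible NumPy-only reading, assuming that integers and booleans are promoted to float so they can hold NaN and that any other dtype falls back to object, might look like this:

```python
import numpy as np


def where2(cond, first):
    """Keep `first` where `cond` is true; elsewhere use an NA sentinel
    chosen from first.dtype. A sketch, not the function from the issue."""
    first = np.asarray(first)
    kind = first.dtype.kind
    if kind in "mM":
        # datetime64 and timedelta64 have a native missing value, NaT.
        fill = np.array("NaT", dtype=first.dtype)
    elif kind in "fc":
        fill = np.nan
    elif kind in "biu":
        # Assumption: integers and booleans cannot hold NaN,
        # so promote them to float first.
        first = first.astype(float)
        fill = np.nan
    else:
        # Assumption: everything else falls back to object dtype with NaN.
        first = first.astype(object)
        fill = np.nan
    return np.where(cond, first, fill)


ints = where2([True, False, True], np.array([1, 2, 3]))
dates_in = np.array(["2016-02-19", "2017-10-20"], dtype="datetime64[D]")
dates = where2([False, True], dates_in)
```

Integer input comes back as float with NaN in the masked slot, while datetime input keeps its dtype and gets NaT.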

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/770/reactions",
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
