
issue_comments: 242950070


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/988#issuecomment-242950070 https://api.github.com/repos/pydata/xarray/issues/988 242950070 MDEyOklzc3VlQ29tbWVudDI0Mjk1MDA3MA== 1217238 2016-08-28T01:15:52Z 2016-08-28T01:16:13Z MEMBER

Let me give concrete examples of what this interface could look like.

To implement units:

``` python
from typing import List, Optional  # optional Python 3.5 type annotations

import xarray

@xarray.register_ufunc_variables_attrs_handler
def propagate_units(results: List[xarray.Variable],
                    context: xarray.UFuncContext) -> Optional[List[dict]]:
    if context.func.__name__ in ['add', 'sub']:
        # collect the distinct units across all inputs (None if an input has no units)
        units_set = set(getattr(arg, 'attrs', {}).get('units')
                        for arg in context.args)
        if len(units_set) > 1:
            raise ValueError('not all input units the same: %r' % units_set)
        units, = units_set
        return [{'units': units}]
    else:
        # or equivalently, don't return anything at all
        return [{} for _ in results]
```

Or to (partially) handle cell_methods:

``` python
@xarray.register_ufunc_variables_attrs_handler
def add_cell_methods(results, context):
    if context.func.__name__ in ['mean', 'median', 'sum', 'min', 'max', 'std']:
        # dimensions that the reduction removed
        dims = set(context.args[0].dims) - set(results[0].dims)
        cell_methods = ': '.join(dims) + ': ' + context.func.__name__
        return [{'cell_methods': cell_methods}]
```

Or to implement keep_attrs=True if a function only has one input:

``` python
@xarray.register_ufunc_variables_attrs_handler
def always_keep_attrs(results, context):
    if len(context.args) == 1:
        return [context.args[0].attrs] * len(results)
```

Every time xarray does an operation, we would call all of these registered ufunc_variables_attrs_handlers to get a list of attributes to add to each result Variable. The attrs on the resulting object (or objects, if the ufunc has multiple outputs) would be accumulated by calling the handlers in arbitrary order and merging the resulting dicts. Repeated keys in the attrs dicts returned by different handlers would result in an error.
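The merge step described above could look roughly like the following. This is only a sketch: `merge_handler_attrs` is a hypothetical helper name, and treating `None` or an empty list as "nothing to add" is an assumption based on the examples above.

```python
from typing import Callable, Dict, List, Optional

# Assumed handler signature: (results, context) -> list of attrs dicts, or None.
Handler = Callable[[list, object], Optional[List[dict]]]


def merge_handler_attrs(handlers: List[Handler], results: list,
                        context: object) -> List[dict]:
    """Call every handler and merge the attrs dicts it returns, per result."""
    merged: List[Dict[str, object]] = [{} for _ in results]
    for handler in handlers:
        attrs_list = handler(results, context)
        if not attrs_list:
            # None or an empty list means this handler has nothing to add.
            continue
        if len(attrs_list) != len(results):
            raise ValueError('handler returned %d attrs dicts for %d results'
                             % (len(attrs_list), len(results)))
        for target, attrs in zip(merged, attrs_list):
            repeated = set(target) & set(attrs)
            if repeated:
                # Two handlers set the same key: refuse to pick a winner.
                raise ValueError('repeated attrs keys from different handlers: %r'
                                 % sorted(repeated))
            target.update(attrs)
    return merged
```

Because repeated keys raise instead of being overwritten, the order in which handlers run does not affect the result.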

xarray.UFuncContext itself would be a simple struct-like class with a few attributes:

- func: the function being applied. Typically from NumPy or dask.array, but it could also be an arbitrary callable if a user calls xarray.apply_ufunc directly.
- args: positional arguments passed into the function. Possibly xarray Variable objects, numpy.ndarray objects, or scalars.
- kwargs: additional dict of keyword arguments.
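For illustration, such a struct could be written as a NamedTuple. This is a sketch of the proposal, not an existing xarray class, and the field types are assumptions.

```python
from typing import Any, Callable, Dict, NamedTuple, Tuple


class UFuncContext(NamedTuple):
    """Hypothetical struct describing the operation being applied."""
    func: Callable[..., Any]   # the function being applied
    args: Tuple[Any, ...]      # positional arguments passed to func
    kwargs: Dict[str, Any]     # keyword arguments passed to func


# A handler would read the fields by name, e.g. to dispatch on the function.
context = UFuncContext(func=max, args=([1, 2, 3],), kwargs={})
is_reduction = context.func.__name__ in ['min', 'max', 'sum']
```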

Similarly, we would have register_ufunc_dataset_attrs_handler for updating Dataset attrs.

The downside of this approach is that, unlike the way NumPy handles things, it doesn't handle conflicting implementations well. If you try to use two different libraries that register their own global attribute handlers instead of using the context manager (e.g., two different units implementations), things will break, even if the code paths using each library never touch.

So alternatively to using the registration system, we could support/encourage using a context manager, e.g.,

``` python
with xarray.ufunc_variables_attrs_handlers([always_keep_attrs, add_cell_methods]):
    # either augment or ignore other attrs handlers, possibly depending
    # on the choice of a keyword argument to ufunc_variables_attrs_handlers
    result = ds.mean()
```

It's kind of verbose, but certainly useful for libraries that want to be cautious about breaking other code. In general, it's poor behavior for libraries to unilaterally change unrelated code without an explicit opt-in. So perhaps the best approach is to encourage users to always use a context manager, e.g.,

``` python
import contextlib

@contextlib.contextmanager
def my_attrs_context():
    with xarray.ufunc_variables_attrs_handlers(
            [always_keep_attrs, add_cell_methods, ...]):
        yield

with my_attrs_context():
    result = ds.mean() - 0.5 * (ds.max() - ds.min())
```
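One way the context manager itself might be implemented is with a stack of active handler lists. The sketch below is an assumption, not xarray code: the `augment` keyword, the `_ACTIVE_HANDLERS` stack and the `active_handlers` helper are hypothetical names, and a module-level stack like this is not thread-safe.

```python
import contextlib
from typing import Callable, Iterator, List

# Hypothetical internal state: a stack whose top is the active handler list.
_ACTIVE_HANDLERS: List[List[Callable]] = [[]]


@contextlib.contextmanager
def ufunc_variables_attrs_handlers(handlers: List[Callable],
                                   augment: bool = True) -> Iterator[None]:
    """Activate `handlers` for the duration of a `with` block.

    `augment` is an assumed keyword: True extends the handlers that are
    already active, False ignores them.
    """
    base = _ACTIVE_HANDLERS[-1] if augment else []
    _ACTIVE_HANDLERS.append(base + list(handlers))
    try:
        yield
    finally:
        # Restore the previous handlers even if the body raises.
        _ACTIVE_HANDLERS.pop()


def active_handlers() -> List[Callable]:
    """Return a copy of the handlers an operation would call right now."""
    return list(_ACTIVE_HANDLERS[-1])
```

Since the handlers are only active inside the `with` block, code outside it is unaffected, which is the explicit opt-in behavior argued for above.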

So maybe a subclass-based implementation (with a custom attribute like __xarray_attrs_handler__) is the cleanest way to handle this, after all.
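To make the subclass idea concrete, here is a toy sketch. Everything in it is an assumption: `Variable` is a stand-in class rather than xarray.Variable, and the signature of `__xarray_attrs_handler__` and the `apply` method are invented for illustration.

```python
class Variable:
    """Toy stand-in for xarray.Variable with only what the sketch needs."""

    def __init__(self, data, attrs=None):
        self.data = data
        self.attrs = dict(attrs or {})

    def __xarray_attrs_handler__(self, func, args, kwargs):
        # Default behavior: results carry no attrs.
        return {}

    def apply(self, func, *args, **kwargs):
        # Compute the result, then let the (sub)class decide its attrs.
        result = type(self)(func(self.data, *args, **kwargs))
        result.attrs = self.__xarray_attrs_handler__(func, (self,) + args, kwargs)
        return result


class KeepAttrsVariable(Variable):
    """Subclass that opts in to keeping attrs on every operation."""

    def __xarray_attrs_handler__(self, func, args, kwargs):
        return dict(self.attrs)


kept = KeepAttrsVariable([1, 2, 3], {'units': 'm'}).apply(max)
dropped = Variable([1, 2, 3], {'units': 'm'}).apply(max)
```

Here the choice of attrs handling travels with the object's type, so two libraries with different conventions would only affect their own subclasses rather than global state.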

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  173612265