
issue_comments: 368066239


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/1938#issuecomment-368066239 https://api.github.com/repos/pydata/xarray/issues/1938 368066239 MDEyOklzc3VlQ29tbWVudDM2ODA2NjIzOQ== 1217238 2018-02-23T16:47:53Z 2018-02-23T16:47:53Z MEMBER

Do we need to be capable of supporting other objects for future extension? If so, we may need to start from (heavy) refactoring.

For two array backends, it didn't make sense to write an abstraction layer for this, in part because it wasn't clear what we needed. But for three examples, it probably does -- that's the point where shared use cases become clear. Undoubtedly, there will be other cases in the future where users will want to extend xarray to handle new array types (arrays with units come to mind).

For implementing these overloads/functions, there are various possible solutions. Our current ad-hoc system is similar to what @hameerabbasi suggests -- we check the type of the first argument and use that to dispatch to an appropriate function. This has the advantage of being easy to implement for a known set of types, but a single dispatch order is not very extensible -- it's impossible to anticipate every third-party class. Recently, NumPy has moved away from this (e.g., with __array_ufunc__).
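To make the limitation concrete, here is a minimal sketch of dispatching on the type of the first argument only, using the standard library's `functools.singledispatch`. The classes `EagerArray` and `LazyArray` and the function `add` are hypothetical stand-ins for illustration, not xarray's actual internals:

```python
from functools import singledispatch


class EagerArray:
    """Hypothetical stand-in for an in-memory array type."""

    def __init__(self, values):
        self.values = list(values)


class LazyArray:
    """Hypothetical stand-in for a lazy, dask-like array type."""

    def __init__(self, values):
        self.values = list(values)


@singledispatch
def add(a, b):
    # Fallback when the type of the first argument is not registered.
    raise TypeError(f"unsupported array type: {type(a).__name__}")


@add.register(EagerArray)
def _add_eager(a, b):
    # Chosen whenever the FIRST argument is an EagerArray.
    return "eager implementation"


@add.register(LazyArray)
def _add_lazy(a, b):
    # Chosen whenever the FIRST argument is a LazyArray.
    return "lazy implementation"


# The second argument is ignored for dispatch, so the result depends on order.
print(add(EagerArray([1]), LazyArray([2])))
print(add(LazyArray([2]), EagerArray([1])))
```

In this sketch the two calls pick different implementations for the same pair of operands, which is the ordering problem that dispatching on a single argument runs into once third-party types are mixed in.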

One appealing option is to make use of @mrocklin's multipledispatch library, which was originally developed for Blaze and is still in active use. Possible concerns:

1. Performance. Import times need to be fast, and I know this is something that multipledispatch can sometimes struggle with. My guess is that this wouldn't be a problem for us, since we can rely on other dispatch mechanisms for most operations (including __array_ufunc__ and Python's builtin arithmetic overrides).
2. Dispatch for stack/concatenate: How do we handle dispatching for functions that take a list of arrays? e.g., if a list of arrays contains any dask arrays, we need to use dask. Ideally, we would resolve the type of an object like [np.array(...), np.array(...), ..., da.Array(...)] to a mixed type like List[Union[np.ndarray, da.Array]], for which an override could be implemented.
3. Dispatch for the first argument(s) only: This is a minor point, but some functions don't need to be dispatched on all of their arguments, e.g., sum() only really needs to dispatch on the array types but can pass other arguments like axis directly on. I suppose we could simply annotate extra positional arguments with object, but this will get annoying for multiple optional arguments, which would all need separate implementations (if I understand multipledispatch correctly).
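The stack/concatenate concern can be sketched without any dispatch library: resolve a list of arrays to the set of element types it contains, then look that set up in a registry. This is only an illustration of the idea, not how multipledispatch or xarray work; `EagerArray`, `LazyArray`, `register_concat`, and `concatenate` are all hypothetical names:

```python
class EagerArray:
    """Hypothetical stand-in for an in-memory array type."""

    def __init__(self, values):
        self.values = list(values)


class LazyArray:
    """Hypothetical stand-in for a lazy, dask-like array type."""

    def __init__(self, values):
        self.values = list(values)


# Registry mapping a set of element types to an implementation.
_CONCAT_IMPLS = {}


def register_concat(*types):
    """Register an implementation for lists containing exactly these types."""

    def decorator(func):
        _CONCAT_IMPLS[frozenset(types)] = func
        return func

    return decorator


def concatenate(arrays):
    # Resolve the list to its "mixed type": the set of element types present.
    key = frozenset(type(a) for a in arrays)
    try:
        impl = _CONCAT_IMPLS[key]
    except KeyError:
        names = sorted(t.__name__ for t in key)
        raise TypeError(f"no concatenate implementation for {names}") from None
    return impl(arrays)


@register_concat(EagerArray)
def _concat_eager(arrays):
    return EagerArray(v for a in arrays for v in a.values)


# Any list containing a LazyArray, alone or mixed, goes to the lazy version.
@register_concat(LazyArray)
@register_concat(EagerArray, LazyArray)
def _concat_lazy(arrays):
    return LazyArray(v for a in arrays for v in a.values)


mixed = concatenate([EagerArray([1]), EagerArray([2]), LazyArray([3])])
print(type(mixed).__name__, mixed.values)
```

A set-based key ignores order and multiplicity, so one registration covers every mixed list. The trade-off is that each combination of types needs its own entry, which may not scale well as more array types are added.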

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  299668148