html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/1938#issuecomment-510953379,https://api.github.com/repos/pydata/xarray/issues/1938,510953379,MDEyOklzc3VlQ29tbWVudDUxMDk1MzM3OQ==,1217238,2019-07-12T16:40:53Z,2019-07-12T16:40:53Z,MEMBER,"We're at the point where this could be hacked together pretty quickly: 1. We need to remove the explicit casting to NumPy arrays (ala https://github.com/pydata/xarray/pull/2956). Checking for an `__array_function__` attribute is probably a good heuristic for duck arrays (it's what dask is using). 2. Internally, we need to use NumPy functions directly (if `__array_function__` is enabled) instead of our current Dask/NumPy versions. Fortunately, pretty much all this logic lives in one place, in `xarray.core.duck_array_ops`. 3. We'll need to think a little bit about indexing in particular. Right now we have special indexing wrappers for NumPy arrays and Dask arrays; we would need to decide how to handle arbitrary array objects (probably by indexing them like NumPy arrays?). Basic indexing should work either way, but indexing with arrays can be a little tricky since few duck-array types support NumPy's full semantics (which are pretty complex).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-382918970,https://api.github.com/repos/pydata/xarray/issues/1938,382918970,MDEyOklzc3VlQ29tbWVudDM4MjkxODk3MA==,1217238,2018-04-20T00:04:43Z,2018-04-20T01:43:28Z,MEMBER,"I like `duckarray` a little better without the underscore. Should we go ahead and start `pydata/duckarray`? 
Or is it better to incubate in somebody's personal repo?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-382868997,https://api.github.com/repos/pydata/xarray/issues/1938,382868997,MDEyOklzc3VlQ29tbWVudDM4Mjg2ODk5Nw==,1217238,2018-04-19T20:23:39Z,2018-04-19T20:23:39Z,MEMBER,"This library would have hard dependencies only on numpy and multipledispatch, and would expose a multipledispatch namespace so extending it doesn't have to happen in the library itself.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-382867200,https://api.github.com/repos/pydata/xarray/issues/1938,382867200,MDEyOklzc3VlQ29tbWVudDM4Mjg2NzIwMA==,1217238,2018-04-19T20:17:19Z,2018-04-19T20:17:19Z,MEMBER,"By ""muktipledy"" I mean ""multipledispatch""(on my phone)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-382867083,https://api.github.com/repos/pydata/xarray/issues/1938,382867083,MDEyOklzc3VlQ29tbWVudDM4Mjg2NzA4Mw==,1217238,2018-04-19T20:16:49Z,2018-04-19T20:16:49Z,MEMBER,"Basically, the library would define functions like `concatenate` (everything in the linked sparse issue) using muktipledy with implementations for numpy, dask, sparse, etc.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-382859987,https://api.github.com/repos/pydata/xarray/issues/1938,382859987,MDEyOklzc3VlQ29tbWVudDM4Mjg1OTk4Nw==,1217238,2018-04-19T19:51:56Z,2018-04-19T19:51:56Z,MEMBER,"I'm thinking it could make sense 
to build this minimal library for ""duck typed arrays"" with multipledispatch outside of xarray. That would make it easier for library builders to use and extend it. Anyone interested in getting started on that?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368605364,https://api.github.com/repos/pydata/xarray/issues/1938,368605364,MDEyOklzc3VlQ29tbWVudDM2ODYwNTM2NA==,1217238,2018-02-26T18:45:13Z,2018-02-26T18:45:13Z,MEMBER,See https://github.com/mrocklin/multipledispatch/issues/72,"{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 1, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368598394,https://api.github.com/repos/pydata/xarray/issues/1938,368598394,MDEyOklzc3VlQ29tbWVudDM2ODU5ODM5NA==,1217238,2018-02-26T18:22:33Z,2018-02-26T18:22:33Z,MEMBER,"I made a tweaked version of dispatching to list subtypes, which is probably suitable for use in xarray: https://drive.google.com/file/d/18zdyUpWLNFzFaz08GUOC5vs1GxE_jHg-/view?usp=sharing Example behavior:
```python
@dispatch(List[int])
def f(args):
    print('integers:', args)

@dispatch(List[str])
def f(args):
    print('strings:', args)

@dispatch(List[str, int])
def f(args):
    print('mixed str-int:', args)

f([1, 2])  # integers: [1, 2]
f([1, 2, 'foo'])  # mixed str-int: [1, 2, 'foo']
f(['foo', 'bar'])  # strings: ['foo', 'bar']
f([[1, 2]])  # NotImplementedError: Could not find signature for f:
```
Differences from @llllllllll's `VarArgs`:
- I don't actually subclass from `tuple`/`list`. You can't use the `List` constructor directly or do `issubclass` with list objects (this matches `typing.List`).
- I added sugar so that you don't need to write the dispatch function for `list`, and implementations actually receive native Python list objects as arguments, not `VarArgs` instances.
- Type caching is done based on the *set* of element types, not the sequence of element types. I think this is more performant/correct. It would be straightforward to adapt this to use `typing.List`, but since we'll want to define our own `dispatch` functions anyways for our own xarray-specific multipledispatch namespace, I'm just as happy to use an internal `xarray.dispatching.List` type.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368281147,https://api.github.com/repos/pydata/xarray/issues/1938,368281147,MDEyOklzc3VlQ29tbWVudDM2ODI4MTE0Nw==,1217238,2018-02-25T03:56:38Z,2018-02-25T03:56:38Z,MEMBER,"Indeed, typing support for multipledispatch looks like it's a ways off. To be honest, the VarArgs solution looks a little ugly to me, so I'm not sure it's worth enshrining in multipledispatch either. I guess that leaves putting our own ad-hoc solution on top of multipledispatch in xarray for now. Which really is totally fine -- this is all a stopgap measure until NumPy itself supports this sort of duck typing. On Sat, Feb 24, 2018 at 7:46 PM Joe Jevnik wrote: > Given the issues raised on that PR as well as the profiling results shown here I think that PR will need some serious work before it could be merged.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368279019,https://api.github.com/repos/pydata/xarray/issues/1938,368279019,MDEyOklzc3VlQ29tbWVudDM2ODI3OTAxOQ==,1217238,2018-02-25T03:02:59Z,2018-02-25T03:02:59Z,MEMBER,"I spent some time thinking about this today. 
The cleanest answer is probably support for standard typing annotations in multipledispatch, at least for `List`. This is already being pursued for multipledispatch in https://github.com/mrocklin/multipledispatch/pull/69.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368268549,https://api.github.com/repos/pydata/xarray/issues/1938,368268549,MDEyOklzc3VlQ29tbWVudDM2ODI2ODU0OQ==,1217238,2018-02-24T23:25:49Z,2018-02-24T23:25:49Z,MEMBER,"> Is there a way to handle kwargs (not with types, but ignoring them)? Yes, `multipledispatch` already ignores all keyword arguments for purposes of dispatching.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368190478,https://api.github.com/repos/pydata/xarray/issues/1938,368190478,MDEyOklzc3VlQ29tbWVudDM2ODE5MDQ3OA==,1217238,2018-02-24T02:25:25Z,2018-02-24T02:25:25Z,MEMBER,"@mrocklin this is roughly what we would want in multipledispatch: https://github.com/blaze/blaze/blob/master/blaze/compute/varargs.py#L20-L90 This involves metaclasses, which frankly do blow my mind a little bit. 
Probably the magic could be tuned down a little bit, but metaclasses *are* necessary at least for implementing `__getitem__` syntax to create classes (and provide a few other niceties here like custom reprs and subclass checks).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368110090,https://api.github.com/repos/pydata/xarray/issues/1938,368110090,MDEyOklzc3VlQ29tbWVudDM2ODExMDA5MA==,1217238,2018-02-23T19:13:14Z,2018-02-23T19:13:14Z,MEMBER,"> How about something like checking inside a list if something is top priority, then call a, if second priority, call b, etc. Usually, this is not a good idea. The problem is that it's impossible to know a global priority order across unrelated packages. It's usually better to declare valid type matches explicitly. NumPy tried this with `__array_priority__`, but in practice these priority numbers are basically meaningless for all comparisons other than comparisons to the priority of NumPy arrays.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368108543,https://api.github.com/repos/pydata/xarray/issues/1938,368108543,MDEyOklzc3VlQ29tbWVudDM2ODEwODU0Mw==,1217238,2018-02-23T19:07:46Z,2018-02-23T19:07:46Z,MEMBER,"As for my last concern, ""Dispatch for the first argument(s) only"" it looks like the simple answer is that multipledispatch already only dispatches based on positional arguments. So as long as we're strict about using keyword arguments for extra parameters like `axis` (which is good style anyways), we only need a single overload per array type for single dispatch functions like `sum()`. It looks like this resolves almost all of my concerns about using multiple dispatch. 
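A minimal sketch of the keyword-argument point above. It uses the standard library's `functools.singledispatch` as a stand-in for multipledispatch, purely so the example runs without extra dependencies; the `ListArray` class and the `total` function are hypothetical names, not part of xarray or multipledispatch. Only the positional array argument takes part in dispatch, so `axis` needs no overload of its own:

```python
from functools import singledispatch


class ListArray:
    '''Hypothetical array type backed by a list of rows.'''

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]


@singledispatch
def total(array, *, axis=None):
    # Fallback when no overload is registered for this array type.
    raise NotImplementedError(type(array).__name__)


@total.register(ListArray)
def _(array, *, axis=None):
    # axis is keyword-only, so it never affects which overload is chosen.
    if axis is None:
        return sum(sum(r) for r in array.rows)
    if axis == 0:
        return [sum(col) for col in zip(*array.rows)]
    return [sum(r) for r in array.rows]
```

Supporting another array type would then take one more `total.register(...)` overload, however many keyword parameters the function accepts.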
One thing that would be nice is if `VarArgs` is actually distributed as part of multipledispatch rather than needing to be copied separately into xarray. That would make it easier for third parties to extend our operations, by simply importing `VarArgs` from multipledispatch rather than importing it from somewhere internal in xarray.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368107036,https://api.github.com/repos/pydata/xarray/issues/1938,368107036,MDEyOklzc3VlQ29tbWVudDM2ODEwNzAzNg==,1217238,2018-02-23T19:02:34Z,2018-02-23T19:02:34Z,MEMBER,"Yes, I just tested out the wrapping dispatch. It works and is quite clean.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368097912,https://api.github.com/repos/pydata/xarray/issues/1938,368097912,MDEyOklzc3VlQ29tbWVudDM2ODA5NzkxMg==,1217238,2018-02-23T18:32:04Z,2018-02-23T18:32:04Z,MEMBER,"@llllllllll very cool! Is there a special trick I need to use this? 
I tried:
```python
# first: pip install https://github.com/blaze/blaze/archive/master.tar.gz
import blaze.compute
from blaze.compute.varargs import VarArgs
from multipledispatch import dispatch

@dispatch(VarArgs[float])
def f(args):
    print('floats')

@dispatch(VarArgs[str])
def f(args):
    print('strings')

@dispatch(VarArgs[str, float])
def f(args):
    print('mixed')
```
This gives me an error when I try to use it:
```python
>>> f(['foo'])
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/multipledispatch/dispatcher.py in __call__(self, *args, **kwargs)
    154 try:
--> 155 func = self._cache[types]
    156 except KeyError:

KeyError: (,)

During handling of the above exception, another exception occurred:

NotImplementedError Traceback (most recent call last)
in ()
----> 1 f(['foo'])

/usr/local/lib/python3.6/dist-packages/multipledispatch/dispatcher.py in __call__(self, *args, **kwargs)
    159 raise NotImplementedError(
    160 'Could not find signature for %s: <%s>' %
--> 161 (self.name, str_signature(types)))
    162 self._cache[types] = func
    163 try:

NotImplementedError: Could not find signature for f:
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368084600,https://api.github.com/repos/pydata/xarray/issues/1938,368084600,MDEyOklzc3VlQ29tbWVudDM2ODA4NDYwMA==,1217238,2018-02-23T17:44:27Z,2018-02-23T18:17:28Z,MEMBER,"Dispatch for stack/concatenate is definitely on the radar for NumPy development, but I don't know when it's actually going to happen. The likely interface is something like `__array_ufunc__`: a special method like `__array_concatenate__` is called on each element in the list, until one does not return NotImplemented. 
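As a rough illustration of that protocol, here is a sketch using toy classes only. `PlainArray`, `LazyArray` and the free function `concatenate` are hypothetical stand-ins for NumPy-like and dask-like arrays, and `__array_concatenate__` is the speculative method name from the sentence above, not an existing NumPy hook:

```python
class PlainArray:
    '''Toy stand-in for a NumPy-like array; only understands its own type.'''

    def __init__(self, values):
        self.values = list(values)

    def __array_concatenate__(self, arrays):
        if not all(isinstance(a, PlainArray) for a in arrays):
            return NotImplemented
        return PlainArray(v for a in arrays for v in a.values)


class LazyArray:
    '''Toy stand-in for a dask-like array; also accepts PlainArray inputs.'''

    def __init__(self, values):
        self.values = list(values)

    def __array_concatenate__(self, arrays):
        if not all(isinstance(a, (PlainArray, LazyArray)) for a in arrays):
            return NotImplemented
        return LazyArray(v for a in arrays for v in a.values)


def concatenate(arrays):
    '''Ask each element in turn; the first answer that is not NotImplemented wins.'''
    for array in arrays:
        result = array.__array_concatenate__(arrays)
        if result is not NotImplemented:
            return result
    raise TypeError('no element could handle this concatenate call')
```

With a mixed list such as `[PlainArray([1]), LazyArray([2])]`, the first element declines and the `LazyArray` implementation produces the result.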
This is a different style of overloads than multipledispatch, one that is slightly simpler to implement but possibly slower and with fewer guarantees of correctness. We only need this for a couple of operations, so in any case we can probably implement our own ad-hoc dispatch system for `np.stack` and `np.concatenate`, either along the lines of multipledispatch or NumPy/`__array_ufunc__`. On further contemplation, overloading based on union types with a system like multipledispatch does seem tricky. It's not clear to me that there's even a well defined type for inputs to concatenate that should be dispatched to dask vs. numpy, for example. We want to let dask handle any cases where at least one input is a dask array, but a type like `List[Union[np.ndarray, da.Array]]` actually matches a list of all numpy arrays, too -- unless we require an exact match for the type.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148 https://github.com/pydata/xarray/issues/1938#issuecomment-368066239,https://api.github.com/repos/pydata/xarray/issues/1938,368066239,MDEyOklzc3VlQ29tbWVudDM2ODA2NjIzOQ==,1217238,2018-02-23T16:47:53Z,2018-02-23T16:47:53Z,MEMBER,"> Do we need to be capable of supporting other objects for future extension? If so, we may need to start from (heavy) refactoring. For two array backends, it didn't make sense to write an abstraction layer for this, in part because it wasn't clear what we needed. But for three examples, it probably does -- that's the point where shared use cases become clear. Undoubtedly, there will be other cases in the future where users will want to extend xarray to handle new array types (arrays with units come to mind). For implementing these overloads/functions, there are various possible solutions. 
Our current ad-hoc system is similar to what @hameerabbasi suggests -- we check the type of the first argument and use that to dispatch to an appropriate function. This has the advantage of being easy to implement for a known set of types, but a single dispatch order is not very extensible -- it's impossible to anticipate every third-party class. Recently, NumPy has moved away from this (e.g., with `__array_ufunc__`). One appealing option is to make use of @mrocklin's [multipledispatch](https://github.com/mrocklin/multipledispatch) library, which was originally developed for Blaze and is still in active use. Possible concerns: 1. **Performance**. Import times need to be fast, and I know this is something that `multipledispatch` can sometimes struggle with. My *guess* is that this wouldn't be a problem for us, since we can rely on other dispatch mechanisms for most operations (including `__array_ufunc__` and Python's builtin arithmetic overrides). 2. **Dispatch for `stack`/`concatenate`**: How do we handle dispatching for functions that take a list of arrays? e.g., if a list of arrays contains any dask arrays, we need to use dask. Ideally, we would resolve the type of an object like `[np.array(...), np.array(...), ..., da.Array(...)]` to a mixed type like `List[Union[np.ndarray, da.Array]]`, for which an override could be implemented. 3. **Dispatch for the first argument(s) only**: This is a minor point, but some functions don't need to be dispatched on all of their arguments, e.g., `sum()` only really needs to dispatch on the array types but can pass other arguments like `axis` directly on. I suppose we could simply annotate extra positional arguments with `object`, but this will get annoying for multiple optional arguments which would all need separate implementations (if I understand multipledispatch correctly).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,299668148