html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2281#issuecomment-507975592,https://api.github.com/repos/pydata/xarray/issues/2281,507975592,MDEyOklzc3VlQ29tbWVudDUwNzk3NTU5Mg==,5821660,2019-07-03T07:28:58Z,2019-07-03T07:28:58Z,MEMBER,"Thanks for this interesting discussion. I'm currently at the point of moving my interpolation functions to an xarray-based workflow. While trying to wrap my head around this, I found that it involves not only interpolation but also indexing (see #1603, #2195, #2986). Sorry if this exceeds the original intention of the issue, but it is my real use case (curvilinear grids to Cartesian).
Citing @shoyer's comments for convenience:
> In particular, SciPy's griddata makes use of either a `scipy.spatial.KDTree` (for nearest-neighbor lookups) or `scipy.spatial.Delaunay` (for linear interpolation on a triangular mesh). We could build these data structures once (and potentially even cache them in indexes on xarray objects), and likewise calculate the sparse interpolation coefficients once for repeated use.
> Anyways, as I've said above, I think it would be totally appropriate to build routines resembling scipy's griddata into `interp()` (but using the lower-level KDTree/Delaunay interface). This will not be the most efficient strategy, but it should offer reasonable performance in most cases. Let's consider this open for contributions, if anyone is interested in putting together a pull request.
> Yes, if we cache the Delaunay triangulation we could probably do the entire thing in about the time it currently takes to do one time step.
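To make the caching idea concrete, here is a minimal sketch (not xarray's actual implementation) of reusing a single `scipy.spatial.Delaunay` triangulation across time steps; the point arrays and the loop are synthetic, purely for illustration:

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

# Synthetic source/target point clouds, shapes (n, 2) and (m, 2).
src_points = np.random.rand(1000, 2)
tgt_points = np.random.rand(500, 2)

# Build the triangulation once ...
tri = Delaunay(src_points)

# ... and reuse it for every time step; only the values change.
for t in range(10):
    values = np.sin(src_points[:, 0] * (t + 1))   # stand-in for one time slice
    interp = LinearNDInterpolator(tri, values)    # cheap: no re-triangulation
    result = interp(tgt_points)
```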
Our interpolators are built on scipy's cKDTree. They are created once for a given source and target grid configuration and are then simply called with the data to be interpolated. The interpolator is cached in the dataset accessor for repeated use, but this only makes sense if there are multiple variables within that dataset. I'm thinking about how to reuse the cached interpolator for other datasets with the same source and target configuration. The same would be true for tree-based indexers, if they become available in xarray. A rough sketch of such an accessor-level cache follows below.
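Roughly like this (the accessor name `regrid`, the coordinate names `lon`/`lat`, and the assumption that variables share the 2D coordinate shape are all made up for the sketch):

```python
import numpy as np
import xarray as xr
from scipy.spatial import cKDTree

@xr.register_dataset_accessor('regrid')  # hypothetical accessor name
class RegridAccessor:
    def __init__(self, ds):
        self._ds = ds
        self._tree = None  # built lazily, shared by all variables of this dataset

    @property
    def tree(self):
        if self._tree is None:
            # Flatten the 2D source coordinates into (n, 2) points.
            pts = np.column_stack([self._ds['lon'].values.ravel(),
                                   self._ds['lat'].values.ravel()])
            self._tree = cKDTree(pts)
        return self._tree

    def nearest(self, name, tgt_lon, tgt_lat):
        # One query per target grid; dists/idx could be cached as well.
        # Assumes ds[name] has the same 2D shape as the source coordinates.
        tgt = np.column_stack([tgt_lon.ravel(), tgt_lat.ravel()])
        dists, idx = self.tree.query(tgt)
        return self._ds[name].values.ravel()[idx].reshape(tgt_lon.shape)
```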
My current approach would be to create an xarray dataset `dsT` holding the source dimensions/coordinates (and the target dimensions/coordinates) together with the created tree. If the source has a projection attached, one could supply a target projection and the target dimensions/coordinates would be created accordingly (but this could be wrapped by other packages, like geoxarray, ping @djhoese). One could even precalculate the target `dists`/`idx` from the tree for faster access (I do this). Finally, there should be something like `ds_res = ds_src.interp_like(dsT)` so that this dataset can be reused; a sketch follows below.
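A minimal sketch of such a reusable target dataset (the variable names `dist`/`idx` and both helper functions are hypothetical; today's `interp_like` would not accept this, so a plain function stands in, assuming 2D lon/lat arrays):

```python
import numpy as np
import xarray as xr
from scipy.spatial import cKDTree

def make_target_dataset(src_lon, src_lat, tgt_lon, tgt_lat):
    # Build the tree on the source points once and precompute dists/idx
    # for the target grid, so the lookups never have to be repeated.
    tree = cKDTree(np.column_stack([src_lon.ravel(), src_lat.ravel()]))
    dists, idx = tree.query(np.column_stack([tgt_lon.ravel(), tgt_lat.ravel()]))
    return xr.Dataset(
        {'dist': (('y', 'x'), dists.reshape(tgt_lon.shape)),
         'idx': (('y', 'x'), idx.reshape(tgt_lon.shape))},
        coords={'lon': (('y', 'x'), tgt_lon),
                'lat': (('y', 'x'), tgt_lat)},
    )

def remap_nearest(da, dsT):
    # Reusable nearest-neighbour remap driven entirely by the cached dsT;
    # works for any DataArray defined on the same source grid.
    flat = da.values.ravel()
    return xr.DataArray(flat[dsT['idx'].values],
                        coords=dsT.coords, dims=('y', 'x'))
```

Applied as, e.g., `remap_nearest(ds_src['t2m'], dsT)` for any variable on the same source grid (the variable name is hypothetical).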
I'm sure I can get something working within my workflow using accessors, but IMHO it would be a better fit in xarray itself.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,340486433