html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2281#issuecomment-497629988,https://api.github.com/repos/pydata/xarray/issues/2281,497629988,MDEyOklzc3VlQ29tbWVudDQ5NzYyOTk4OA==,1217238,2019-05-31T08:46:38Z,2019-05-31T08:46:38Z,MEMBER,"Yes, if we cache the Delaunay triangulation, we could probably do the entire thing in about the time it currently takes to do one time step.
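
A minimal sketch of that caching idea, with self-contained dummy data (shapes and names are illustrative): `LinearNDInterpolator` accepts a precomputed `Delaunay` triangulation, so the expensive step runs once and only the cheap per-step interpolant is rebuilt.

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.interpolate import LinearNDInterpolator

# Illustrative curvilinear source mesh, target grid, and data cube.
ny, nx, nt = 50, 60, 10
X1, Y1 = np.meshgrid(np.linspace(0, 1, nx), np.linspace(0, 1, ny))
X2, Y2 = np.meshgrid(np.linspace(0, 1, 2 * nx), np.linspace(0, 1, 2 * ny))
cube1 = np.random.rand(nt, ny, nx)

# The expensive part, computed once and reused for every time step.
tri = Delaunay(np.column_stack([X1.ravel(), Y1.ravel()]))

cube3 = np.full((nt,) + X2.shape, np.nan)
for k in range(nt):
    f = LinearNDInterpolator(tri, cube1[k].ravel(), fill_value=np.nan)
    cube3[k] = f(X2, Y2)
```
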
On Thu, May 30, 2019 at 10:50 AM Fernando Paolo wrote:
> @shoyer and @crusaderky That's right, that is how I was actually dealing with this problem prior to trying xarray ... by flattening the grid coordinates and performing either *gridding* (with scipy's griddata) or *interpolation* (with scipy's map_coordinates) ... instead of performing proper *regridding* (from cube to cube, without having to flatten anything).
>
> As a rule of thumb, any fancy algorithm should first exist for numpy-only data, and then it can potentially be wrapped by the xarray library.
>
> This is important information.
>
> For the record, here is what I have found to be the most performant so far:
>
> import numpy as np
> import xarray as xr
> from scipy.interpolate import griddata
>
> # Here x/y are dummy 1D coords that won't be used.
> da1 = xr.DataArray(cube1, [('t', t_cube1), ('y', range(cube1.shape[1])), ('x', range(cube1.shape[2]))])
>
> # Regrid t_cube1 onto t_cube2 first, since time always maps 1-to-1 between cubes.
> # This operation is very fast.
> print('regridding in time ...')
> cube1 = da1.interp(t=t_cube2).values
>
> # Regrid each 2D field (X_cube1/Y_cube1 onto X_cube2/Y_cube2) one at a time.
> print('regridding in space ...')
> cube3 = np.full_like(cube2, np.nan)  # assumes cube2 is (t, y, x), like cube1
> for k in range(t_cube2.shape[0]):  # loop over the regridded time steps
>     print('regridding:', k)
>     cube3[k, :, :] = griddata((X_cube1.ravel(), Y_cube1.ravel()),
>                               cube1[k, :, :].ravel(),
>                               (X_cube2, Y_cube2),
>                               fill_value=np.nan,
>                               method='linear')
>
> Performance is not that bad... for ~150 time steps and ~1500 nodes in x and y, it takes about 10-15 min.
>
> I think this can be sped up by computing the interpolation weights between the grids in the first iteration and caching them (I think xESMF does this).
>
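
For reference, a rough sketch of that weight-caching idea (variable names are illustrative, and this is not necessarily how xESMF does it): `Delaunay.find_simplex` plus the stored barycentric `transform` give the sparse linear-interpolation weights, which can then be applied to every time step as a single sparse matrix product.

```python
import numpy as np
from scipy.spatial import Delaunay
from scipy.sparse import csr_matrix

def delaunay_weights(src_xy, dst_xy):
    # Sparse W such that (W @ values) linearly interpolates src -> dst.
    tri = Delaunay(src_xy)
    simplex = tri.find_simplex(dst_xy)      # containing triangle per target
    verts = tri.simplices[simplex]          # (n_dst, 3) source-vertex indices
    T = tri.transform[simplex]              # barycentric transforms
    b = np.einsum('nij,nj->ni', T[:, :2], dst_xy - T[:, 2])
    bary = np.column_stack([b, 1 - b.sum(axis=1)])
    bary[simplex == -1] = np.nan            # targets outside the hull -> NaN
    rows = np.repeat(np.arange(len(dst_xy)), 3)
    return csr_matrix((bary.ravel(), (rows, verts.ravel())),
                      shape=(len(dst_xy), len(src_xy)))

# Build the weights once, then regrid every time step with a fast matmul:
# W = delaunay_weights(src_xy, dst_xy)
# cube3[k] = (W @ cube1[k].ravel()).reshape(X_cube2.shape)
```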
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,340486433
https://github.com/pydata/xarray/issues/2281#issuecomment-497473031,https://api.github.com/repos/pydata/xarray/issues/2281,497473031,MDEyOklzc3VlQ29tbWVudDQ5NzQ3MzAzMQ==,1217238,2019-05-30T20:24:56Z,2019-05-30T20:24:56Z,MEMBER,"> @fspaolo where does that huge number come from? I thought you said you have 1500 nodes in total. Did you select a single point on the t dimension before you applied bisplrep?
2665872 is roughly 1600^2, i.e., the size of the full flattened 2-D grid (~1600 nodes per side), not a per-dimension count.
> Also (pardon the ignorance, I never dealt with geographical data), what kind of information does having bidimensional lat and lon convey? Does it imply `lat[i, j] < lat[i+1, j] and lon[i, j] < lon[i, j+1]` for any possible (i, j)?
I think this is true sometimes but not always. The details depend on the [geographic projection](https://en.wikipedia.org/wiki/Map_projection), but generally a good mesh has some notion of locality -- nearby locations in real space (i.e., on the globe) should also be nearby in projected space.
------------------------
Anyway, as I've said above, I think it would be totally appropriate to build routines resembling scipy's griddata into `interp()` (but using the lower-level KDTree/Delaunay interface). This will not be the most efficient strategy, but it should offer reasonable performance in most cases. Let's consider this open for contributions if anyone is interested in putting together a pull request.
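
A hedged sketch of the KDTree (nearest-neighbour) half, with self-contained dummy data; the tree is the piece worth caching and reusing across variables/time steps:

```python
import numpy as np
from scipy.spatial import cKDTree

# Illustrative 2-D curvilinear source mesh, values, and target grid.
lon1, lat1 = np.meshgrid(np.linspace(0, 10, 40), np.linspace(0, 5, 30))
val1 = np.sin(lon1) * np.cos(lat1)
lon2, lat2 = np.meshgrid(np.linspace(0, 10, 80), np.linspace(0, 5, 60))

# Build the tree once; reuse it for every variable/time step.
tree = cKDTree(np.column_stack([lon1.ravel(), lat1.ravel()]))
_, idx = tree.query(np.column_stack([lon2.ravel(), lat2.ravel()]))
val2 = val1.ravel()[idx].reshape(lon2.shape)
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,340486433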
https://github.com/pydata/xarray/issues/2281#issuecomment-497458053,https://api.github.com/repos/pydata/xarray/issues/2281,497458053,MDEyOklzc3VlQ29tbWVudDQ5NzQ1ODA1Mw==,1217238,2019-05-30T19:38:43Z,2019-05-30T19:38:43Z,MEMBER,"The naive implementation of splines involves inverting an N x N matrix, where N is the total number of grid points, so it is definitely not a very scalable technique.
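
A back-of-envelope check of that claim, assuming a dense factorization at roughly (2/3)N^3 flops and plugging in the ~1600^2-point grid from the discussion above:

```python
# Dense solve of the spline system scales ~O(N**3) in flops.
N = 1600 ** 2          # total grid points, per the comments above
flops = (2 / 3) * N ** 3
print(f'{flops:.1e}')  # ~1.1e+19 flops -- far beyond practical
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,340486433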
https://github.com/pydata/xarray/issues/2281#issuecomment-497150401,https://api.github.com/repos/pydata/xarray/issues/2281,497150401,MDEyOklzc3VlQ29tbWVudDQ5NzE1MDQwMQ==,1217238,2019-05-29T23:58:42Z,2019-05-29T23:58:42Z,MEMBER,"> So how to perform this operation... or am I missing something?
Sorry, I don't think there's an easy way to do this directly in xarray right now.
> My concern with `scipy.interpolate.griddata` is that the performance might be miserable... `griddata` takes an arbitrary **stream** of data points in a D-dimensional space. It doesn't know if those source data points have a gridded/mesh structure. A curvilinear grid mesh needs to be flattened into a stream of points before being passed to `griddata()`. That might not be too bad for nearest-neighbour search, but it is very inefficient for the linear/bilinear method, where knowing the mesh structure beforehand can save a lot of computation.
Thinking a little more about this, I wonder if the performance could actually be OK as long as the spatial grid is not too big, i.e., if we reuse the same grid many times for different variables/times.
In particular, SciPy's griddata makes use of either a `scipy.spatial.KDTree` (for nearest-neighbor lookups) or `scipy.spatial.Delaunay` (for linear interpolation on a triangular mesh). We could build these data structures once (and potentially even cache them in indexes on xarray objects), and likewise calculate the sparse interpolation coefficients once for repeated use.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,340486433
https://github.com/pydata/xarray/issues/2281#issuecomment-404744836,https://api.github.com/repos/pydata/xarray/issues/2281,404744836,MDEyOklzc3VlQ29tbWVudDQwNDc0NDgzNg==,1217238,2018-07-13T07:00:16Z,2018-07-13T07:00:16Z,MEMBER,"I'd like to figure out interfaces that make it possible for external, grid-aware libraries to extend indexing and interpolation features in xarray. In particular, it would be nice to be able to associate a ""grid index"" used for caching computation that gets passed on in all xarray operations.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,340486433
https://github.com/pydata/xarray/issues/2281#issuecomment-404611922,https://api.github.com/repos/pydata/xarray/issues/2281,404611922,MDEyOklzc3VlQ29tbWVudDQwNDYxMTkyMg==,1217238,2018-07-12T18:45:35Z,2018-07-12T18:45:35Z,MEMBER,"I think we could make `dr.interp(xc=lon, yc=lat)` work for the N-D -> M-D case by wrapping `scipy.interpolate.griddata`.
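
A hypothetical sketch of what such a wrapper could do internally (the helper name `interp_curvilinear` and its assumptions -- a 2-D DataArray with 2-D source coords `xc`/`yc` and 2-D target arrays `lon`/`lat` -- are illustrative, not an existing xarray API):

```python
import numpy as np
import xarray as xr
from scipy.interpolate import griddata

def interp_curvilinear(da, xc, yc, lon, lat, method='linear'):
    # Flatten the 2-D source mesh to a point cloud and let griddata
    # triangulate it; return the result on the 2-D target grid.
    src = np.column_stack([da[xc].values.ravel(), da[yc].values.ravel()])
    out = griddata(src, da.values.ravel(), (lon, lat),
                   method=method, fill_value=np.nan)
    return xr.DataArray(out, dims=('y', 'x'))
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,340486433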