
issue_comments


14 rows where author_association = "MEMBER" and issue = 340486433 sorted by updated_at descending




user 4

  • shoyer 6
  • crusaderky 6
  • kmuehlbauer 1
  • fujiisoup 1

issue 1

  • Does interp() work on curvilinear grids (2D coordinates) ? · 14 ✖

author_association 1

  • MEMBER · 14 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
507975592 https://github.com/pydata/xarray/issues/2281#issuecomment-507975592 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDUwNzk3NTU5Mg== kmuehlbauer 5821660 2019-07-03T07:28:58Z 2019-07-03T07:28:58Z MEMBER

Thanks for this interesting discussion. I'm currently at the point where I'm moving interpolation functions to an xarray-based workflow. While trying to wrap my head around this, I found that it involves not only interpolation but also indexing (see #1603, #2195, #2986). Sorry if this exceeds the original intention of the issue, but it is my real use case (curvilinear grids to cartesian).

citing @shoyer's comments for convenience

In particular, SciPy's griddata makes use of either a scipy.spatial.KDTree (for nearest-neighbor lookups) or scipy.spatial.Delaunay (for linear interpolation on a triangular mesh). We could build these data structures once (and potentially even cache them in indexes on xarray objects), and likewise calculate the sparse interpolation coefficients once for repeated use.

Anyways, as I've said above, I think it would be totally appropriate to build routines resembling scipy's griddata into interp() (but using the lower-level KDTree/Delaunay interface). This will not be the most efficient strategy, but it should offer reasonable performance in most cases. Let's consider this open for contributions, if anyone is interested in putting together a pull request.

Yes, if we cache the Delaunay triangulation we could probably do the entire thing in about the time it currently takes to do one time step.

Our interpolators are built on scipy's cKDTree. They are created once for a given source and target grid configuration and then just called with the desired data. The interpolator is cached in the dataset accessor for multiple uses, but this only makes sense if there are multiple variables within that dataset. I'm thinking about how to reuse the cached interpolator for other datasets with the same source and target configuration. The same would be true for tree-based indexers, if they become available in xarray.

My current approach would be to create an xarray dataset dsT with source dimensions/coordinates (and target dimensions/coordinates) and the created tree. If the source has a projection attached, one could supply another target projection and the target dimensions/coordinates would be created accordingly (but this could be wrapped by other packages, like geoxarray, ping @djhoese). One could even precalculate target dists, idx from the tree for faster access (I do this). Finally, there should be something like ds_res = ds_src.interp_like(dsT) where one can reuse this dataset.

I'm sure I can get something working within my workflow using accessors, but it would be better suited to xarray itself, imho.
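The cached-tree approach described above can be sketched in a few lines. This is illustrative only, assuming 2D source and target coordinate arrays; the class name and signature are made up, not wradlib's or xarray's actual API:

```python
import numpy as np
from scipy.spatial import cKDTree

class NearestInterpolator:
    """Build a cKDTree once for a fixed source/target grid pair,
    then reuse it for any number of variables on those grids."""
    def __init__(self, src_x, src_y, trg_x, trg_y):
        src = np.column_stack([src_x.ravel(), src_y.ravel()])
        trg = np.column_stack([trg_x.ravel(), trg_y.ravel()])
        self.shape = trg_x.shape
        # precompute dists/idx once -- this is the expensive part
        self.dists, self.idx = cKDTree(src).query(trg)

    def __call__(self, values):
        # per-variable cost is just fancy indexing
        return values.ravel()[self.idx].reshape(self.shape)

# source: a 20x20 grid; target: a coarser 10x10 grid
src_x, src_y = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 20))
trg_x, trg_y = np.meshgrid(np.linspace(0, 1, 10), np.linspace(0, 1, 10))

interp = NearestInterpolator(src_x, src_y, trg_x, trg_y)
field = np.sin(src_x * np.pi)
out = interp(field)
```

Because the tree and the query results are cached on the object, every additional variable on the same grid pair costs only an indexing pass, which is the whole point of reusing the interpolator across datasets.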

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497629988 https://github.com/pydata/xarray/issues/2281#issuecomment-497629988 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzYyOTk4OA== shoyer 1217238 2019-05-31T08:46:38Z 2019-05-31T08:46:38Z MEMBER

Yes, if we cache the Delaunay triangulation we could probably do the entire thing in about the time it currently takes to do one time step.

On Thu, May 30, 2019 at 10:50 AM Fernando Paolo notifications@github.com wrote:

@shoyer https://github.com/shoyer and @crusaderky https://github.com/crusaderky That's right, that is how I was actually dealing with this problem prior to trying xarray ... by flattening the grid coordinates and performing either gridding (with scipy's griddata) or interpolation (with scipy's map_coordinates) ... instead of performing proper regridding (from cube to cube without having to flatten anything).

As a rule of thumb, any fancy algorithm should first exist for numpy-only data and then potentially it can be wrapped by the xarray library.

This is important information.

For the record, here is what I have found to be the most performant so far:

```python
import numpy as np
import xarray as xr
from scipy.interpolate import griddata

# Here x/y are dummy 1D coords that won't be used.
da1 = xr.DataArray(cube1, [('t', t_cube1),
                           ('y', range(cube1.shape[1])),
                           ('x', range(cube1.shape[2]))])

# Regrid t_cube1 onto t_cube2 first, since time will always map 1-to-1
# between cubes. This operation is very fast.
print('regridding in time ...')
cube1 = da1.interp(t=t_cube2).values

# Regrid each 2D field (X_cube1/Y_cube1 onto X_cube2/Y_cube2) one at a time.
print('regridding in space ...')
cube3 = np.full_like(cube2, np.nan)
for k in range(t_cube1.shape[0]):
    print('regridding:', k)
    cube3[:, :, k] = griddata((X_cube1.ravel(), Y_cube1.ravel()),
                              cube1[k, :, :].ravel(),
                              (X_cube2, Y_cube2),
                              fill_value=np.nan, method='linear')
```

Performance is not that bad... for ~150 time steps and ~1500 nodes in x and y it takes about 10-15 min.

I think this can be sped up by computing the interpolation weights between the grids in the first iteration and caching them (I think xESMF does this).


{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497473031 https://github.com/pydata/xarray/issues/2281#issuecomment-497473031 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzQ3MzAzMQ== shoyer 1217238 2019-05-30T20:24:56Z 2019-05-30T20:24:56Z MEMBER

@fspaolo where does that huge number come from? I thought you said you have 1500 nodes in total. Did you select a single point on the t dimension before you applied bisplrep?

2665872 is roughly 1600^2.

Also (pardon the ignorance, I have never dealt with geographical data), what kind of information does having your lat and lon be two-dimensional convey? Does it imply lat[i, j] < lat[i+1, j] and lon[i, j] < lon[i, j+1] for any possible (i, j)?

I think this is true sometimes but not always. The details depend on the geographic projection, but generally a good mesh has some notion of locality -- nearby locations in real space (i.e., on the globe) should also be nearby in projected space.


Anyways, as I've said above, I think it would be totally appropriate to build routines resembling scipy's griddata into interp() (but using the lower-level KDTree/Delaunay interface). This will not be the most efficient strategy, but it should offer reasonable performance in most cases. Let's consider this open for contributions, if anyone is interested in putting together a pull request.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497468930 https://github.com/pydata/xarray/issues/2281#issuecomment-497468930 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzQ2ODkzMA== crusaderky 6213168 2019-05-30T20:12:29Z 2019-05-30T20:12:29Z MEMBER

@fspaolo where does that huge number come from? I thought you said you have 1500 nodes in total. Did you select a single point on the t dimension before you applied bisplrep?

Also (pardon the ignorance, I have never dealt with geographical data), what kind of information does having your lat and lon be two-dimensional convey? Does it imply lat[i, j] < lat[i+1, j] and lon[i, j] < lon[i, j+1] for any possible (i, j)?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497458053 https://github.com/pydata/xarray/issues/2281#issuecomment-497458053 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzQ1ODA1Mw== shoyer 1217238 2019-05-30T19:38:43Z 2019-05-30T19:38:43Z MEMBER

The naive implementation of splines involves inverting an N x N matrix where N is the total number of grid points. So it definitely is not a very scalable technique.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497254984 https://github.com/pydata/xarray/issues/2281#issuecomment-497254984 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzI1NDk4NA== crusaderky 6213168 2019-05-30T08:45:16Z 2019-05-30T08:50:13Z MEMBER

I did not test it, but this looks like what you want:

```python
import xarray
from scipy.interpolate import bisplrep, bisplev

x = cube1.x.values.ravel()
y = cube1.y.values.ravel()
z = cube1.values.ravel()
x_new = cube2.x.values.ravel()
y_new = cube2.y.values.ravel()
tck = bisplrep(x, y, z)
z_new = bisplev(x_new, y_new, tck)
z_new = z_new.reshape(cube2.shape)
cube3 = xarray.DataArray(z_new, dims=cube2.dims, coords=cube2.coords)
```

I read above that you have concerns about performance, as the above does not understand the geometry of the input data -- did you run performance tests on it already?

[EDIT] You will probably need to break your problem down into 1-point slices along dimension t before you apply the above.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497251626 https://github.com/pydata/xarray/issues/2281#issuecomment-497251626 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzI1MTYyNg== crusaderky 6213168 2019-05-30T08:33:16Z 2019-05-30T08:33:51Z MEMBER

@fspaolo sorry, I should have taken more time re-reading the initial post. No, xarray_extras.interpolate does not do the kind of interpolation you want. Have you looked into scipy?

https://docs.scipy.org/doc/scipy/reference/interpolate.html#multivariate-interpolation

xarray is just a wrapper, and if scipy does what you need, it's trivial to unwrap your DataArray into a bunch of numpy arrays, feed them into scipy, and then re-wrap the output numpy arrays into a DataArray. On the other hand, if scipy does not do what you want, then I suspect a feature request on the scipy tracker would be a much better place than the xarray board. As a rule of thumb, any fancy algorithm should first exist for numpy-only data; then it can potentially be wrapped by the xarray library.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497150401 https://github.com/pydata/xarray/issues/2281#issuecomment-497150401 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzE1MDQwMQ== shoyer 1217238 2019-05-29T23:58:42Z 2019-05-29T23:58:42Z MEMBER

So how to perform this operation... or am I missing something?

Sorry, I don't think there's an easy way to do this directly in xarray right now.

My concern with scipy.interpolate.griddata is that the performance might be miserable... griddata takes an arbitrary stream of data points in a D-dimensional space. It doesn't know whether those source data points have a gridded/mesh structure. A curvilinear grid mesh needs to be flattened into a stream of points before being passed to griddata(). That might not be too bad for nearest-neighbour search, but it is very inefficient for the linear/bilinear methods, where knowing the mesh structure beforehand can save a lot of computation.

Thinking a little more about this, I wonder if the performance could actually be OK as long as the spatial grid is not too big, i.e., if we reuse the same grid many times for different variables/times.

In particular, SciPy's griddata makes use of either a scipy.spatial.KDTree (for nearest-neighbor lookups) or scipy.spatial.Delaunay (for linear interpolation on a triangular mesh). We could build these data structures once (and potentially even cache them in indexes on xarray objects), and likewise calculate the sparse interpolation coefficients once for repeated use.
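The build-once/reuse idea sketched here is a well-known recipe using scipy's lower-level interface: triangulate the source mesh a single time, precompute barycentric weights for the target points, and then each variable or time step costs only a gather plus a weighted sum. The helper below is an illustrative sketch under those assumptions, not existing xarray functionality:

```python
import numpy as np
from scipy.spatial import Delaunay

def linear_weights(src_pts, trg_pts):
    """Triangulate the source points once and return, for each target
    point, the enclosing triangle's vertex indices and the barycentric
    interpolation weights (assumes targets lie inside the convex hull)."""
    tri = Delaunay(src_pts)                  # expensive: do this once
    simplex = tri.find_simplex(trg_pts)
    verts = tri.simplices[simplex]           # (n_trg, 3) vertex indices
    T = tri.transform[simplex]               # per-simplex affine transforms
    b = np.einsum('nij,nj->ni', T[:, :2], trg_pts - T[:, 2])
    weights = np.c_[b, 1 - b.sum(axis=1)]    # barycentric coordinates
    return verts, weights

def apply_weights(values, verts, weights):
    """Cheap per-variable step: gather plus weighted sum."""
    return (values[verts] * weights).sum(axis=1)

rng = np.random.default_rng(0)
src = rng.random((200, 2))
trg = rng.random((50, 2)) * 0.8 + 0.1        # keep targets inside the hull
verts, w = linear_weights(src, trg)

z = src[:, 0] + 2 * src[:, 1]                # a linear test field
z_interp = apply_weights(z, verts, w)        # reproduces the field exactly
```

Caching verts and w (e.g. in an index on the xarray object, as suggested above) would make repeated interpolation of many variables over the same grid pair nearly free compared with calling griddata each time.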

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
497130177 https://github.com/pydata/xarray/issues/2281#issuecomment-497130177 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NzEzMDE3Nw== crusaderky 6213168 2019-05-29T22:22:01Z 2019-05-29T22:25:45Z MEMBER

@fspaolo 2d mesh interpolation and 1d interpolation with extra "free" dimensions are fundamentally different algorithms. Look up the scipy documentation on the various interpolation functions available.

I don't understand what you are trying to pass for x_new and y_new, and it definitely doesn't sound right. Right now you have a 3D DataArray with dimensions (x, y, t) and 3 coords, each of which is a 1D numpy array (e.g. da.coords.x.values). If you want to rescale, you need to pass a 1D numpy array or array-like for x_new, and another separate 1D array for y_new. You are not doing that: the error message you're receiving says that your x_new is a numpy array with 2 or more dimensions, which the algorithm doesn't know what to do with. It can accept a multi-dimensional DataArray with brand-new dimensions, but that does not sound like your case.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
495871201 https://github.com/pydata/xarray/issues/2281#issuecomment-495871201 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NTg3MTIwMQ== crusaderky 6213168 2019-05-25T06:55:33Z 2019-05-25T06:59:03Z MEMBER

@fspaolo I never tried using my algorithm to perform 2D interpolation, but this should work:

```python
from xarray_extras.interpolate import splrep, splev

da = splev(x_new, splrep(da, 'x'))
da = splev(y_new, splrep(da, 'y'))
da = splev(t_new, splrep(da, 't'))
```

Add k=1 to downgrade from cubic to linear interpolation and get a speed boost.

You can play around with dask to increase performance by using all your CPUs (or more with dask distributed), although you have to remember that an original dim can't be broken into multiple chunks when you apply splrep to it:

```python
from xarray_extras.interpolate import splrep, splev

da = da.chunk(t=TCHUNK)
da = splev(x_new, splrep(da, 'x'))
da = splev(y_new, splrep(da, 'y'))
da = da.chunk(x=SCHUNK, y=SCHUNK).chunk(t=-1)
da = splev(t_new, splrep(da, 't'))
da = da.compute()
```

where TCHUNK and SCHUNK are integers you'll have to play with. The rule of thumb is that you want your chunks to be 5~100 MB each.

If you end up finding that chunking along an interpolation dimension is important for you, it is possible to implement with dask ghosting techniques, just painfully complicated.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
495515463 https://github.com/pydata/xarray/issues/2281#issuecomment-495515463 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQ5NTUxNTQ2Mw== crusaderky 6213168 2019-05-24T08:10:10Z 2019-05-24T08:10:10Z MEMBER

I am not aware of an N-D mesh interpolation algorithm. However, my package xarray_extras [1] offers highly optimized 1D interpolation on an N-D hypercube, on any numerical coord (not just time). You may try applying it 3 times, once per dimension in sequence, and see if you get what you want - although performance won't be optimal.

[1] https://xarray-extras.readthedocs.io/en/latest/

Alternatively, if you do find the exact algorithm you want, but it's for numpy, then applying it to xarray is simple - just get DataArray.values -> apply function -> create new DataArray from the output.
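The unwrap/apply/re-wrap pattern described here takes only a few lines; np.sqrt below is a stand-in for whatever numpy-only algorithm you actually need:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.linspace(0, 1, 12).reshape(3, 4),
    dims=('y', 'x'),
    coords={'y': [10, 20, 30], 'x': [0, 1, 2, 3]},
)

result = np.sqrt(da.values)   # any numpy-only algorithm goes here
# re-wrap, carrying dims and coords over from the input
da_new = xr.DataArray(result, dims=da.dims, coords=da.coords)
```

This only works cleanly when the algorithm preserves (or predictably transforms) the array's shape, so you know which dims and coords to attach to the output.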

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
404744836 https://github.com/pydata/xarray/issues/2281#issuecomment-404744836 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQwNDc0NDgzNg== shoyer 1217238 2018-07-13T07:00:16Z 2018-07-13T07:00:16Z MEMBER

I'd like to figure out interfaces that make it possible for external, grid aware libraries to extend indexing and interpolation features in xarray. In particular, it would be nice to be able to associate a "grid index" used for caching computation that gets passed on in all xarray operations.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
404611922 https://github.com/pydata/xarray/issues/2281#issuecomment-404611922 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQwNDYxMTkyMg== shoyer 1217238 2018-07-12T18:45:35Z 2018-07-12T18:45:35Z MEMBER

I think we could make dr.interp(xc=lon, yc=lat) work for the N-D -> M-D case by wrapping scipy.interpolate.griddata

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433
404407676 https://github.com/pydata/xarray/issues/2281#issuecomment-404407676 https://api.github.com/repos/pydata/xarray/issues/2281 MDEyOklzc3VlQ29tbWVudDQwNDQwNzY3Ng== fujiisoup 6815844 2018-07-12T06:48:18Z 2018-07-12T06:48:18Z MEMBER

Thanks, @JiaweiZhuang

Not yet. interp() only works on N-dimensional regular grids. Under the hood, we are just using scipy.interpolate.interp1d and interpn.

I am happy to see curvilinear interpolation in xarray if we could find a good general API for N-dimensional array. Do you have any proposal?

For curvilinear interpolation, we may have some arbitrariness. E.g. with

```python
dr_out = dr.interp(xc=lon)
```

the resultant dimension is not well determined. Maybe we need some limitation on the arguments.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Does interp() work on curvilinear grids (2D coordinates) ?  340486433

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · About: xarray-datasette