html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/475#issuecomment-633185598,https://api.github.com/repos/pydata/xarray/issues/475,633185598,MDEyOklzc3VlQ29tbWVudDYzMzE4NTU5OA==,1217238,2020-05-24T06:18:00Z,2020-05-24T06:21:03Z,MEMBER,@JimmyGao0204 I moved your comment to a new issue: https://github.com/pydata/xarray/issues/4090,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-355085272,https://api.github.com/repos/pydata/xarray/issues/475,355085272,MDEyOklzc3VlQ29tbWVudDM1NTA4NTI3Mg==,1217238,2018-01-03T18:16:29Z,2018-01-03T18:16:29Z,MEMBER,@jhamman @stefanomattia can you share a link to this blog post? :),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-342577675,https://api.github.com/repos/pydata/xarray/issues/475,342577675,MDEyOklzc3VlQ29tbWVudDM0MjU3NzY3NQ==,1217238,2017-11-07T18:31:30Z,2017-11-07T18:31:30Z,MEMBER,"Yes, a documentation example would be greatly appreciated. We have been making progress in this direction (especially with the new vectorised indexing support) but it has been slow going to do it right. On Tue, Nov 7, 2017 at 10:29 AM Benjamin Root wrote: > Yeah, we need to move something forward, because the main benefit of > xarray is the ability to manage datasets from multiple sources in a > consistent way. And data from different sources will almost always be in > different projections. > > My current problem that I need to solve right now is that I am ingesting > model data that is in a LCC projection and ingesting radar data that is in > a simple regular lat/lon grid. 
Both dataset objects have latitude and > longitude coordinate arrays, I just need to get both datasets to have the > same lat/lon grid. > > I guess I could continue using my old scipy-based solution (using > map_coordinates() or RectBivariateSpline), but at the very least, it would > make sense to have some documentation demonstrating how one might go about > this very common problem, even if it is showing how to use the scipy-based > tools with xarrays. If that is of interest, I can see what I can write up > after I am done my immediate task. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-241821366,https://api.github.com/repos/pydata/xarray/issues/475,241821366,MDEyOklzc3VlQ29tbWVudDI0MTgyMTM2Ng==,1217238,2016-08-23T18:05:09Z,2017-02-09T23:21:14Z,MEMBER,"A few recent developments relevant to this issue: - #974 discusses how we could add multi-dimensional indexing with broadcasting. This would subsume the need for separate methods like `sel_points` and also allow indexing grids with grids. - #947 adds first class support for MultiIndex coordinates into xarray. This is a good model for how a KDTree could work. So I'm now thinking an API more like this: ``` >>> ds = ds.set_kdtree(spatial_index=['latitude', 'longitude']) >>> ds Dimensions: (x: 4, y: 5) Coordinates: * x (x) int64 0 1 2 3 * y (y) int64 0 1 2 3 4 * spatial_index (x, y) KDTree - latitude (x, y) float64 0.49 0.5682 -0.3541 -0.9305 -0.9669 0.01558 ... - longitude (x, y) float64 0.3758 1.429 -1.698 -1.344 0.5237 0.6152 ... Data variables: temperature (x, y) float64 0.5735 -0.4871 0.4708 0.4907 -0.3318 0.2883 ... >>> result = ds.sel(latitude=other.latitude, longitude=other.longitude, ... 
method='nearest') ``` For building a tree with lat/lon remapped to spherical coordinates, we should write a method that converts lat and lon arrays into a tuple of x, y, z arrays (e.g., using `apply_ufunc` from #964). Then this looks like `ds.set_kdtree(spatial_index=latlon_to_xyz(ds.latitude, ds.longitude))`. Conceivably, we could add some sugar for this, e.g., `ds.geo.set_kdtree(spatial_index=['latitude', 'longitude'])`. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-256207074,https://api.github.com/repos/pydata/xarray/issues/475,256207074,MDEyOklzc3VlQ29tbWVudDI1NjIwNzA3NA==,1217238,2016-10-25T23:19:03Z,2016-10-25T23:19:03Z,MEMBER,"@burnpanck Never mind, you are correct! I misread your comment. This cannot be done currently. You certainly could try to put this into `isel_points`, and if you can do it in a clean fashion I am open to accepting it, but keep in mind that the method is going to go away when we finally get around to implementing #974. Work on #974 would probably be more productive, ultimately. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-256201020,https://api.github.com/repos/pydata/xarray/issues/475,256201020,MDEyOklzc3VlQ29tbWVudDI1NjIwMTAyMA==,1217238,2016-10-25T22:49:14Z,2016-10-25T22:49:14Z,MEMBER,"@burnpanck I don't think you need to do the flattening/multi-index bit. I believe `isel_points`/`sel_points` _should_ just work for you already. At this point we're really just talking about design refinements (I'll rename the topic). 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-126851732,https://api.github.com/repos/pydata/xarray/issues/475,126851732,MDEyOklzc3VlQ29tbWVudDEyNjg1MTczMg==,1217238,2015-08-01T02:29:37Z,2015-08-01T02:29:37Z,MEMBER,"PR #507 implements the my suggested 1d version of `sel_points`. Maybe we also want `reindex_points`, i.e., pointwise indexing by label that is gauranteed to succeed even if some labels are missing? ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-125468579,https://api.github.com/repos/pydata/xarray/issues/475,125468579,MDEyOklzc3VlQ29tbWVudDEyNTQ2ODU3OQ==,1217238,2015-07-28T06:43:26Z,2015-07-28T06:43:26Z,MEMBER,"I started playing around with making an array wrapper for KDTree this evening: https://gist.github.com/shoyer/ae30a1200f749c84b4c4 I think it has most of the necessary indexing machinery and you can put it in an xray.Dataset like an array. You could easily imagine hooking in a `transform` argument to `KDTreeIndex` to handle projection. But of course it hasn't been hooked up to any API yet. ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-125349079,https://api.github.com/repos/pydata/xarray/issues/475,125349079,MDEyOklzc3VlQ29tbWVudDEyNTM0OTA3OQ==,1217238,2015-07-27T21:34:42Z,2015-07-27T21:34:42Z,MEMBER,"I would start with the easiest case -- lookups of 1d orthogonal arrays, e.g., `grid.sel(latitude=stations.latitude, longitude=stations.longitude, method='nearest')`. This would very straightforwardly leverage our current machinery. For 2D lookups, we need a KDTree. 
Here are some API ideas, just tossing things around... ``` >>> ds Dimensions: (x: 4, y: 5) Coordinates: latitude (x, y) float64 0.49 0.5682 -0.3541 -0.9305 -0.9669 0.01558 ... longitude (x, y) float64 0.3758 1.429 -1.698 -1.344 0.5237 0.6152 ... * x (x) int64 0 1 2 3 * y (y) int64 0 1 2 3 4 Data variables: temperature (x, y) float64 0.5735 -0.4871 0.4708 0.4907 -0.3318 0.2883 ... # perhaps set_ndindex is a better name? >>> ds = ds.set_kdtree(['latitude', 'longitude'], name='latlon_index', method='spherical') >>> ds Dimensions: (x: 4, y: 5) Coordinates: latitude (x, y) float64 0.49 0.5682 -0.3541 -0.9305 -0.9669 0.01558 ... longitude (x, y) float64 0.3758 1.429 -1.698 -1.344 0.5237 0.6152 ... * latlon_index (x, y) float64 (0.49, 0.3758) (0.5682, 1.429) ... * x (x) int64 0 1 2 3 * y (y) int64 0 1 2 3 4 Data variables: temperature (x, y) float64 0.5735 -0.4871 0.4708 0.4907 -0.3318 0.2883 ... result = ds.sel_points(latitude=other.latitude, longitude=other.longitude, method='nearest') ``` ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-122440826,https://api.github.com/repos/pydata/xarray/issues/475,122440826,MDEyOklzc3VlQ29tbWVudDEyMjQ0MDgyNg==,1217238,2015-07-17T23:05:59Z,2015-07-17T23:05:59Z,MEMBER,"> Any suggestions on how to index the dask array without looping through individual points would be great. For now, I actually think selecting individual points and then concatenating the resulting arrays together would be a reasonable start. Yes, it's kind of slow, but once you have a first draft put together that way with the right API we can optimize later. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-122001943,https://api.github.com/repos/pydata/xarray/issues/475,122001943,MDEyOklzc3VlQ29tbWVudDEyMjAwMTk0Mw==,1217238,2015-07-16T15:59:18Z,2015-07-16T15:59:18Z,MEMBER,"@jhamman it would be great if you could put together a PR for `isel_points`. The main complexity is that you'll want to write a version that also works with dask arrays. Let me know if that part is confusing, I can certainly help with that. As for `sel_points`, we only need a kdtree if the underlying coordinates are 2D. If `latitude` and `longitude` (for example) are 1d, we can just use the existing machinery for remapping label based indexers to integers. This should be pretty straightforward, following the example of `isel`: https://github.com/xray/xray/blob/v0.5.1/xray/core/dataset.py#L1024 https://github.com/xray/xray/blob/v0.5.1/xray/core/indexing.py#L157 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-121808018,https://api.github.com/repos/pydata/xarray/issues/475,121808018,MDEyOklzc3VlQ29tbWVudDEyMTgwODAxOA==,1217238,2015-07-16T02:47:30Z,2015-07-16T02:47:30Z,MEMBER,"I agree that regridding and resample would be very nice, and pyresample looks like a decent option. I have no immediate plans to implement these features but contributions would be very welcome. For n-dimensional indexing, kdtree seems sensible, especially if we can cache it on the coordinates. We probably want an explicit API for methods that add new coordinates -- something like `ds.set_kdtree(['latitude', 'longitude'])`. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-121703276,https://api.github.com/repos/pydata/xarray/issues/475,121703276,MDEyOklzc3VlQ29tbWVudDEyMTcwMzI3Ng==,1217238,2015-07-15T18:22:03Z,2015-07-15T18:22:03Z,MEMBER,"> Seems like if your method is going to be named sel_points then points is a reasonable dimension name. Yes, this is a reasonable choice for the case of 1d indexers. > Maybe support a name kwarg? This is also a good idea, though I would probably call the parameter `dim`, not `name`. > One thing to keep in mind is that for many of us the ""nearest-neighbor"" part isn't really method='nearest', but instead more like, method='ingridcell' where the grid cell might be roughly square or might be something pretty different. Indeed. As a start, we should be able to do nearest neighbor lookups with a tolerance soon -- I have a pandas PR that should add some of that basic functionality (https://github.com/pydata/pandas/pull/10411). In the long term, it would be useful to have some sort of representation of grid cells in the index itself, possibly something similar to `IntervalIndex` (https://github.com/pydata/pandas/pull/8707). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700 https://github.com/pydata/xarray/issues/475#issuecomment-121679580,https://api.github.com/repos/pydata/xarray/issues/475,121679580,MDEyOklzc3VlQ29tbWVudDEyMTY3OTU4MA==,1217238,2015-07-15T16:58:36Z,2015-07-15T16:58:36Z,MEMBER,"So, the good news is that once we figure out the API for pointwise indexing, I think the nearest-neighbor part could be as simple as supplying `method='nearest'`. 
The challenge is that we want to go from a DataArray that looks like this: ``` In [4]: arr = xray.DataArray([[1, 2], [3, 4]], dims=['x', 'y']) In [5]: arr Out[5]: array([[1, 2], [3, 4]]) Coordinates: * x (x) int64 0 1 * y (y) int64 0 1 ``` to one that looks like this: ``` In [6]: xray.DataArray([1, 4], {'x': ('c', [0, 1]), 'y': ('c', [0, 1])}, dims='c') Out[6]: array([1, 4]) Coordinates: y (c) int64 0 1 x (c) int64 0 1 * c (c) int64 0 1 ``` Somehow, we need to figure out the name for the new dimension (`c` in this example). My thought would be to have methods `sel_points` and `isel_points` that work similarly to `sel` and `isel`. This is straightforward if you already have xray 1D objects with a labeled dimension: `arr.sel_points(x=x, y=y)`, where `x` and `y` are along the `c` dimension. If you don't already have 1D xray objects, I suppose we could also allow `arr.sel_points(x=('c', [0, 1]), y=('c', [0, 1]))` or `arr.sel_points('c', x=[0, 1], y=[0, 1])`. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,95114700