html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4995#issuecomment-1057699042,https://api.github.com/repos/pydata/xarray/issues/4995,1057699042,IC_kwDOAMm_X84_CzTi,3604210,2022-03-03T05:47:56Z,2022-10-25T14:35:35Z,NONE,"@observingClouds I think a fill_value arg in sel, as in reindex, is still warranted. Although reindex, as @dcherian suggested, works for cases where the dims match the target dims, in cases where the dims don't match, e.g., in the examples of sel (https://xarray.pydata.org/en/stable/generated/xarray.DataArray.sel.html), it raises an error: `ValueError: Indexer has dimensions ('points',) that are different from that to be indexed along x` ","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-1290446119,https://api.github.com/repos/pydata/xarray/issues/4995,1290446119,IC_kwDOAMm_X85M6qUn,24661500,2022-10-25T12:11:45Z,2022-10-25T12:11:45Z,NONE,"I think the original scope of this issue is still valid. I would also expect that indices that are not within the tolerance would simply be dropped. While it might be nice in some situations, I don't really think that specifying a fill value is needed in order to accomplish this. The issue I'm facing with `reindex` is that it doesn't really scale as well as `sel` does, significantly reducing the amount of data I can handle. I would like to humbly suggest that there still might be interest in seeing this functionality. 
Unfortunately the testing logs from #4996 have expired so it's not clear why the tests failed for this PR before it was closed.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-1110101560,https://api.github.com/repos/pydata/xarray/issues/4995,1110101560,IC_kwDOAMm_X85CKs44,8699967,2022-04-26T18:09:18Z,2022-04-26T18:12:14Z,CONTRIBUTOR,"Example using `nearest` & `tolerance` with `reindex` & `sel` when dims don't match based on the example in `sel`:
```python
import numpy
import xarray

da = xarray.DataArray(
    numpy.arange(25).reshape(5, 5),
    coords={""x"": numpy.arange(5), ""y"": numpy.arange(5)},
    dims=(""x"", ""y""),
)
tgt_x = numpy.linspace(0, 4, num=5) + 0.5
tgt_y = numpy.linspace(0, 4, num=5) + 0.5
da = da.reindex(
    x=tgt_x, y=tgt_y, method=""nearest"", tolerance=0.2, fill_value=numpy.nan
).sel(
    x=xarray.DataArray(tgt_x, dims=""points""),
    y=xarray.DataArray(tgt_y, dims=""points""),
)
```
Output:
```
array([nan, nan, nan, nan, nan])
Coordinates:
    x        (points) float64 0.5 1.5 2.5 3.5 4.5
    y        (points) float64 0.5 1.5 2.5 3.5 4.5
Dimensions without coordinates: points
```
Side note: I don't think it makes sense to add `fill_value` to `sel` as it would require adding new coordinates that didn't exist previously. Calling `reindex` first makes that more clear in my opinion.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-799047819,https://api.github.com/repos/pydata/xarray/issues/4995,799047819,MDEyOklzc3VlQ29tbWVudDc5OTA0NzgxOQ==,43613877,2021-03-15T02:28:51Z,2021-03-15T02:28:51Z,CONTRIBUTOR,"Thanks @dcherian, this is doing the job. 
I'll close this issue as there seems to be no need to implement this in the `sel` method.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-791145448,https://api.github.com/repos/pydata/xarray/issues/4995,791145448,MDEyOklzc3VlQ29tbWVudDc5MTE0NTQ0OA==,2448579,2021-03-05T04:32:29Z,2021-03-05T04:32:29Z,MEMBER,"Actually, does `reindex` do what you want? The returned coordinate labels will be what you provide.
```
>>> ds.reindex(lat=[5,15,40], method=""nearest"", tolerance=5, fill_value=-999)
array([1, 2, -999])
Coordinates:
  * lat      (lat) int64 5 15 40
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-791021835,https://api.github.com/repos/pydata/xarray/issues/4995,791021835,MDEyOklzc3VlQ29tbWVudDc5MTAyMTgzNQ==,2448579,2021-03-04T23:16:00Z,2021-03-04T23:16:00Z,MEMBER,"> in using a fill_value is that the indexing has to modify the data (insert e.g. -999) and also 'invent' a new coordinate point (here 40).

This seems totally doable though.

> One fill_value might not fit to all data arrays

In quite a few functions, fill_value can be a dict mapping variable name to a value so this is workable. Let's see what others think.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-791019238,https://api.github.com/repos/pydata/xarray/issues/4995,791019238,MDEyOklzc3VlQ29tbWVudDc5MTAxOTIzOA==,43613877,2021-03-04T23:10:11Z,2021-03-04T23:10:11Z,CONTRIBUTOR,"Introducing a `fill_value` seems like a good idea, such that the size of the output does not change compared to the intended selection. 
Choosing the original/requested coordinate as a label for the missing datapoint seems to be a valid choice because this position has been checked for valid data nearby without success. I would suggest that the `fill_value` should then be determined automatically from the `_FillValue` or the datatype, and only as a last resort require the `fill_value` to be set explicitly. However, the shortcoming that I see in using a `fill_value` is that the indexing has to modify the data (insert e.g. `-999`) and also 'invent' a new coordinate point (here `40`). This gets fairly complex when applied to a dataset with DataArrays of different types, e.g.
```python
import numpy as np
import xarray as xr
ds = xr.Dataset()
ds['data1'] = xr.DataArray(np.array([1,2,3,4,5], dtype=int), dims=[""lat""], coords={'lat':[10,20,30,50,60]})
ds['data2'] = xr.DataArray(np.array([1,2,3,4,5], dtype=float), dims=[""lat""], coords={'lat':[10,20,30,50,60]})
```
One `fill_value` might not fit all data arrays, be it because of the datatype or the actual data. E.g. `-999` might be a good `fill_value` for one DataArray but a valid datapoint in another one. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-790878651,https://api.github.com/repos/pydata/xarray/issues/4995,790878651,MDEyOklzc3VlQ29tbWVudDc5MDg3ODY1MQ==,2448579,2021-03-04T19:40:29Z,2021-03-04T19:40:29Z,MEMBER,"```
>>> ds.sel(lat=[5,15,40], method=""nearest"", tolerance=5)
array([1, 2])
Coordinates:
  * lat      (lat) int64 10 20
```
This is a very surprising result: you've asked for values at three points but received two back. 
The following (specifying `fill_value`) seems like better behaviour to me, but how do you choose the coordinate label? (Here I picked `40` since that was provided to `sel`.)
```
>>> ds.sel(lat=[5,15,40], method=""nearest"", tolerance=5, fill_value=-999)
array([1, 2, -999])
Coordinates:
  * lat      (lat) int64 10 20 40
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976