html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4995#issuecomment-1057699042,https://api.github.com/repos/pydata/xarray/issues/4995,1057699042,IC_kwDOAMm_X84_CzTi,3604210,2022-03-03T05:47:56Z,2022-10-25T14:35:35Z,NONE,"@observingClouds I think a fill_value arg in sel, as in reindex, is still warranted. The reindex approach that @dcherian suggested works when the dims match the target dims, but when they don't match, e.g., in the examples of sel (https://xarray.pydata.org/en/stable/generated/xarray.DataArray.sel.html), it raises an error:
`ValueError: Indexer has dimensions ('points',) that are different from that to be indexed along x`
","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-1290446119,https://api.github.com/repos/pydata/xarray/issues/4995,1290446119,IC_kwDOAMm_X85M6qUn,24661500,2022-10-25T12:11:45Z,2022-10-25T12:11:45Z,NONE,"I think the original scope of this issue is still valid. I also would expect that indices that are not within the tolerance would simply be dropped. While it might be nice in some situations, I don't really think that specifying a fill value is needed in order to accomplish this.
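
To make the two behaviours under discussion concrete, here is a minimal pure-Python sketch of nearest-neighbour lookup with a tolerance, once with drop semantics and once with fill semantics. It is not how xarray implements either method: the helper names are hypothetical, the tie-breaking rule (prefer the larger label) is a choice made for the sketch, and the data mirrors the `data1` example from earlier in this thread.

```python
def nearest_within_tolerance(labels, target, tolerance):
    # Position of the label closest to target, or None when even the
    # closest label is farther away than tolerance.
    # Ties go to the larger label (a choice made for this sketch).
    best = min(range(len(labels)), key=lambda i: (abs(labels[i] - target), -labels[i]))
    return best if abs(labels[best] - target) <= tolerance else None


def select_drop(labels, values, targets, tolerance):
    # Drop semantics: unmatched targets vanish, and matched ones keep
    # the label that was found in the data.
    hits = (nearest_within_tolerance(labels, t, tolerance) for t in targets)
    return [(labels[i], values[i]) for i in hits if i is not None]


def select_fill(labels, values, targets, tolerance, fill_value):
    # Fill semantics: one output per target, labelled with the target.
    out = []
    for t in targets:
        i = nearest_within_tolerance(labels, t, tolerance)
        out.append((t, fill_value if i is None else values[i]))
    return out


labels = [10, 20, 30, 50, 60]
values = [1, 2, 3, 4, 5]
dropped = select_drop(labels, values, [5, 15, 40], tolerance=5)
filled = select_fill(labels, values, [5, 15, 40], tolerance=5, fill_value=-999)
print(dropped)  # [(10, 1), (20, 2)]
print(filled)   # [(5, 1), (15, 2), (40, -999)]
```

With drop semantics the output is simply shorter than the request, and no value or coordinate has to be invented.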
The issue I'm facing with `reindex` is that it doesn't really scale as well as `sel` does, significantly reducing the amount of data I can handle. I would like to humbly suggest that there still might be interest in seeing this functionality.
Unfortunately the testing logs from #4996 have expired so it's not clear why the tests failed for this PR before it was closed.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-1110101560,https://api.github.com/repos/pydata/xarray/issues/4995,1110101560,IC_kwDOAMm_X85CKs44,8699967,2022-04-26T18:09:18Z,2022-04-26T18:12:14Z,CONTRIBUTOR,"Example using `nearest` & `tolerance` with `reindex` & `sel` when dims don't match based on the example in `sel`:
```python
import numpy
import xarray
da = xarray.DataArray(
    numpy.arange(25).reshape(5, 5),
    coords={""x"": numpy.arange(5), ""y"": numpy.arange(5)},
    dims=(""x"", ""y""),
)
tgt_x = numpy.linspace(0, 4, num=5) + 0.5
tgt_y = numpy.linspace(0, 4, num=5) + 0.5
da = da.reindex(
    x=tgt_x, y=tgt_y, method=""nearest"", tolerance=0.2, fill_value=numpy.nan
).sel(
    x=xarray.DataArray(tgt_x, dims=""points""),
    y=xarray.DataArray(tgt_y, dims=""points""),
)
```
Output:
```
array([nan, nan, nan, nan, nan])
Coordinates:
x (points) float64 0.5 1.5 2.5 3.5 4.5
y (points) float64 0.5 1.5 2.5 3.5 4.5
Dimensions without coordinates: points
```
Side note: I don't think it makes sense to add `fill_value` to `sel` as it would require adding new coordinates that didn't exist previously. Calling `reindex` first makes that more clear in my opinion.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-799047819,https://api.github.com/repos/pydata/xarray/issues/4995,799047819,MDEyOklzc3VlQ29tbWVudDc5OTA0NzgxOQ==,43613877,2021-03-15T02:28:51Z,2021-03-15T02:28:51Z,CONTRIBUTOR,"Thanks @dcherian, this is doing the job. I'll close this issue as there seems to be no need to implement this into the `sel` method.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-791145448,https://api.github.com/repos/pydata/xarray/issues/4995,791145448,MDEyOklzc3VlQ29tbWVudDc5MTE0NTQ0OA==,2448579,2021-03-05T04:32:29Z,2021-03-05T04:32:29Z,MEMBER,"Actually, does `reindex` do what you want? The returned coordinate labels will be what you provide.
```
>>> ds.reindex(lat=[5,15,40], method=""nearest"", tolerance=5, fill_value=-999)
array([1, 2, -999])
Coordinates:
* lat (lat) int64 5 15 40
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-791021835,https://api.github.com/repos/pydata/xarray/issues/4995,791021835,MDEyOklzc3VlQ29tbWVudDc5MTAyMTgzNQ==,2448579,2021-03-04T23:16:00Z,2021-03-04T23:16:00Z,MEMBER,"> in using a fill_value is that the indexing has to modify the data ( insert e.g. -999) and also 'invent' a new coordinate point ( here 40).
This seems totally doable though.
> One fill_value might not fit to all data arrays
In quite a few functions, fill_value can be a dict mapping variable name to a value so this is workable.
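
As a hedged illustration, `Dataset.reindex` is one of the functions that accepts a dict-like `fill_value`. The sketch below rebuilds the two-variable dataset from the quoted comment and picks a sentinel for the integer variable and NaN for the float variable. The specific fill values are just example choices.

```python
import numpy as np
import xarray as xr

lat = [10, 20, 30, 50, 60]
ds = xr.Dataset(
    {
        'data1': ('lat', np.array([1, 2, 3, 4, 5], dtype=int)),
        'data2': ('lat', np.array([1, 2, 3, 4, 5], dtype=float)),
    },
    coords={'lat': lat},
)

# One fill value per variable: a sentinel for the integer data,
# NaN for the float data.
filled = ds.reindex(
    lat=[5, 15, 40],
    method='nearest',
    tolerance=5,
    fill_value={'data1': -999, 'data2': np.nan},
)
print(filled)
```

The label `40` has no neighbour within the tolerance, so each variable receives its own fill value at that position.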
Let's see what others think.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-791019238,https://api.github.com/repos/pydata/xarray/issues/4995,791019238,MDEyOklzc3VlQ29tbWVudDc5MTAxOTIzOA==,43613877,2021-03-04T23:10:11Z,2021-03-04T23:10:11Z,CONTRIBUTOR,"Introducing a `fill_value` seems like a good idea, such that the size of the output does not change compared to the intended selection.
Choosing the original/requested coordinate as a label for the missing datapoint seems to be a valid choice because this position has been checked for valid data nearby without success.
I would suggest that the `fill_value` then be determined automatically, first from the `_FillValue` attribute, then from the datatype, and only as a last resort require `fill_value` to be set explicitly.
However, the shortcoming that I see in using a `fill_value` is that the indexing has to modify the data ( insert e.g. `-999`) and also 'invent' a new coordinate point ( here `40`). This gets reasonably complex when applied to a dataset with DataArrays of different types, e.g.
```python
import numpy as np
import xarray as xr
ds = xr.Dataset()
ds['data1'] = xr.DataArray(np.array([1,2,3,4,5], dtype=int), dims=[""lat""], coords={'lat':[10,20,30,50,60]})
ds['data2'] = xr.DataArray(np.array([1,2,3,4,5], dtype=float), dims=[""lat""], coords={'lat':[10,20,30,50,60]})
```
One `fill_value` might not fit to all data arrays, be it because of the datatype or the actual data. E.g. `-999` might be a good `fill_value` for one DataArray but a valid datapoint in another one.
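
The resolution order suggested above could look roughly like the following pure-Python sketch. `resolve_fill_value` is a hypothetical helper, not part of xarray; the single-character dtype kinds follow NumPy's convention ('i' for integer, 'f' for float), and `data3` with its `_FillValue` of `-32768` is made up for illustration.

```python
def resolve_fill_value(attrs, dtype_kind, fill_value=None):
    # Hypothetical helper, not part of xarray.
    # 1. Prefer the _FillValue attribute stored with the variable.
    if '_FillValue' in attrs:
        return attrs['_FillValue']
    # 2. Fall back to a per-dtype default; NaN only exists for floats.
    if dtype_kind == 'f':
        return float('nan')
    # 3. Only now is an explicit fill_value required.
    if fill_value is None:
        raise ValueError('fill_value is required for dtype kind ' + repr(dtype_kind))
    return fill_value


# name -> (attributes, dtype kind); data3 is an invented example.
variables = {
    'data1': ({}, 'i'),
    'data2': ({}, 'f'),
    'data3': ({'_FillValue': -32768}, 'i'),
}
user_fill = {'data1': -999}
resolved = {
    name: resolve_fill_value(attrs, kind, user_fill.get(name))
    for name, (attrs, kind) in variables.items()
}
print(resolved)
```

Here only `data1` needs a user-supplied value, because it is an integer variable without a `_FillValue` attribute.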
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976
https://github.com/pydata/xarray/issues/4995#issuecomment-790878651,https://api.github.com/repos/pydata/xarray/issues/4995,790878651,MDEyOklzc3VlQ29tbWVudDc5MDg3ODY1MQ==,2448579,2021-03-04T19:40:29Z,2021-03-04T19:40:29Z,MEMBER,"```
>>> ds.sel(lat=[5,15,40], method=""nearest"", tolerance=5)
array([1, 2])
Coordinates:
* lat (lat) int64 10 20
```
This is a very surprising result: you've asked for values at three points but received two back.
The following (specifying `fill_value`) seems like better behaviour to me, but how do you choose the coordinate label? (Here I picked `40` since that was provided to `sel`.)
```
>>> ds.sel(lat=[5,15,40], method=""nearest"", tolerance=5, fill_value=-999)
array([1, 2, -999])
Coordinates:
* lat (lat) int64 10 20 40
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,822320976