html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3751#issuecomment-602221579,https://api.github.com/repos/pydata/xarray/issues/3751,602221579,MDEyOklzc3VlQ29tbWVudDYwMjIyMTU3OQ==,6628425,2020-03-22T15:01:35Z,2020-03-22T15:01:35Z,MEMBER,Thanks @jbrockmendel.  I didn't realize you had a few downstream tests; that's great.  See https://github.com/pandas-dev/pandas/pull/32905.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-602053723,https://api.github.com/repos/pydata/xarray/issues/3751,602053723,MDEyOklzc3VlQ29tbWVudDYwMjA1MzcyMw==,6628425,2020-03-21T14:35:11Z,2020-03-21T14:35:33Z,MEMBER,"Thanks @jbrockmendel; it's great to see that pandas-dev/pandas#32684 was merged.  

Regarding #3764, I gave things a try with pandas master and removing our overrides to `_get_nearest_indexer` and `_filter_indexer_tolerance`.  I got similar failures to what we were getting before.

<details><summary>Example failure</summary>
<p>

```
_______________________________ test_sel_date_scalar_nearest[proleptic_gregorian-sel_kwargs2] _______________________________

da = <xarray.DataArray (time: 4)>
array([1, 2, 3, 4])
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-02-01 00:00:00
date_type = <class 'cftime._cftime.DatetimeProlepticGregorian'>
index = CFTimeIndex([0001-01-01 00:00:00, 0001-02-01 00:00:00, 0002-01-01 00:00:00,
             0002-02-01 00:00:00],
            dtype='object')
sel_kwargs = {'method': 'nearest', 'tolerance': datetime.timedelta(days=1800000)}

    @requires_cftime
    @pytest.mark.parametrize(
        ""sel_kwargs"",
        [
            {""method"": ""nearest""},
            {""method"": ""nearest"", ""tolerance"": timedelta(days=70)},
            {""method"": ""nearest"", ""tolerance"": timedelta(days=1800000)},
        ],
    )
    def test_sel_date_scalar_nearest(da, date_type, index, sel_kwargs):
        expected = xr.DataArray(2).assign_coords(time=index[1])
>       result = da.sel(time=date_type(1, 4, 1), **sel_kwargs)

test_cftimeindex.py:471:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../core/dataarray.py:1061: in sel
    **indexers_kwargs,
../core/dataset.py:2066: in sel
    self, indexers=indexers, method=method, tolerance=tolerance
../core/coordinates.py:392: in remap_label_indexers
    obj, v_indexers, method=method, tolerance=tolerance
../core/indexing.py:270: in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
../core/indexing.py:190: in convert_label_indexer
    label.item(), method=method, tolerance=tolerance
../coding/cftimeindex.py:365: in get_loc
    return pd.Index.get_loc(self, key, method=method, tolerance=tolerance)
../../../pandas/pandas/core/indexes/base.py:2874: in get_loc
    indexer = self.get_indexer([key], method=method, tolerance=tolerance)
../../../pandas/pandas/core/indexes/base.py:2967: in get_indexer
    indexer = self._get_nearest_indexer(target, limit, tolerance)
../../../pandas/pandas/core/indexes/base.py:3062: in _get_nearest_indexer
    indexer = self._filter_indexer_tolerance(target, indexer, tolerance)
../../../pandas/pandas/core/indexes/base.py:3069: in _filter_indexer_tolerance
    indexer = np.where(distance <= tolerance, indexer, -1)
../../../pandas/pandas/core/indexes/extension.py:129: in wrapper
    return op(other)
../../../pandas/pandas/core/ops/common.py:63: in new_method
    return method(self, other)
../../../pandas/pandas/core/arrays/datetimelike.py:75: in wrapper
    other = self._scalar_type(other)
pandas/_libs/tslibs/timedeltas.pyx:1233: in pandas._libs.tslibs.timedeltas.Timedelta.__new__
    ???
pandas/_libs/tslibs/timedeltas.pyx:209: in pandas._libs.tslibs.timedeltas.convert_to_timedelta64
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   OverflowError: Python int too large to convert to C long

pandas/_libs/tslibs/timedeltas.pyx:154: OverflowError
```

</p>

</details>

In my testing, I can only get things to work if the argument to `abs` (or `np.abs`) is a NumPy array (instead of an Index).  An upstream change [like this](https://github.com/pandas-dev/pandas/compare/master...spencerkclark:cftime-nearest-fix) would work (it still maintains the behavior desired from pandas-dev/pandas#31511), but I'm not sure how palatable it would be.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-599231986,https://api.github.com/repos/pydata/xarray/issues/3751,599231986,MDEyOklzc3VlQ29tbWVudDU5OTIzMTk4Ng==,6628425,2020-03-15T16:24:07Z,2020-03-15T16:24:07Z,MEMBER,"Thanks so much @jbrockmendel for looking into the `__getitem__` issue upstream.  That should be the last of the issues that remains from this thread.  As you probably noticed, we ended up merging #3764, which fixed indexing with the ""nearest"" method.

Once pydata/pandas#32684 is merged, we should be able to un-xfail the Series `__getitem__` tests on our end.","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-597888734,https://api.github.com/repos/pydata/xarray/issues/3751,597888734,MDEyOklzc3VlQ29tbWVudDU5Nzg4ODczNA==,6628425,2020-03-11T21:32:53Z,2020-03-11T21:32:53Z,MEMBER,Thanks @jbrockmendel -- I'll try to do that this weekend.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-592603121,https://api.github.com/repos/pydata/xarray/issues/3751,592603121,MDEyOklzc3VlQ29tbWVudDU5MjYwMzEyMQ==,6628425,2020-02-28T16:56:25Z,2020-02-28T16:56:25Z,MEMBER,"Thanks @jbrockmendel -- I think there are two separate issues:
- Indexing with `__getitem__` in a Series would be solved by the `is_scalar` fix.
- Indexing with the ""nearest"" method would require reverting pandas-dev/pandas#31511 for non-Datetime-or-Period-Indexes.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-592487396,https://api.github.com/repos/pydata/xarray/issues/3751,592487396,MDEyOklzc3VlQ29tbWVudDU5MjQ4NzM5Ng==,6628425,2020-02-28T12:13:27Z,2020-02-28T12:13:27Z,MEMBER,"@jbrockmendel @TomAugspurger it turns out that fixing indexing with the ""nearest"" method without overriding private methods of `pandas.Index` is harder than I expected within xarray alone (see https://github.com/pydata/xarray/pull/3764#issuecomment-586597512 for more details).  Do you think an upstream fix would be acceptable here?  My understanding is that the issue that prompted the change in pandas only pertained to DatetimeIndexes (or perhaps maybe also PeriodIndexes in the future); would it be possible to limit the scope of the updates in pandas-dev/pandas#31511 to just those?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-587438020,https://api.github.com/repos/pydata/xarray/issues/3751,587438020,MDEyOklzc3VlQ29tbWVudDU4NzQzODAyMA==,6628425,2020-02-18T12:32:40Z,2020-02-18T12:32:40Z,MEMBER,"Another kind of failure came up in the context of indexing a Series with a `cftime.datetime` object:  

<details><summary>Example failure</summary>
<p>

```
____________________________ test_indexing_in_series_getitem[365_day] _____________________________

series = 0001-01-01 00:00:00    1
0001-02-01 00:00:00    2
0002-01-01 00:00:00    3
0002-02-01 00:00:00    4
dtype: int64
index = CFTimeIndex([0001-01-01 00:00:00, 0001-02-01 00:00:00, 0002-01-01 00:00:00,
             0002-02-01 00:00:00],
            dtype='object')
scalar_args = [cftime.DatetimeNoLeap(0001-01-01 00:00:00)]
range_args = ['0001', slice('0001-01-01', '0001-12-30', None), slice(None, '0001-12-30', None), slice(cftime.DatetimeNoLeap(0001-01...:00), cftime.DatetimeNoLeap(0001-12-30 00:00:00), None), slice(None, cftime.DatetimeNoLeap(0001-12-30 00:00:00), None)]

    @requires_cftime
    def test_indexing_in_series_getitem(series, index, scalar_args, range_args):
        for arg in scalar_args:
>           assert series[arg] == 1

test_cftimeindex.py:597:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../pandas/pandas/core/series.py:884: in __getitem__
    return self._get_with(key)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = 0001-01-01 00:00:00    1
0001-02-01 00:00:00    2
0002-01-01 00:00:00    3
0002-02-01 00:00:00    4
dtype: int64
key = cftime.DatetimeNoLeap(0001-01-01 00:00:00)

    def _get_with(self, key):
        # other: fancy integer or otherwise
        if isinstance(key, slice):
            # _convert_slice_indexer to determing if this slice is positional
            #  or label based, and if the latter, convert to positional
            slobj = self.index._convert_slice_indexer(key, kind=""getitem"")
            return self._slice(slobj)
        elif isinstance(key, ABCDataFrame):
            raise TypeError(
                ""Indexing a Series with DataFrame is not ""
                ""supported, use the appropriate DataFrame column""
            )
        elif isinstance(key, tuple):
            try:
                return self._get_values_tuple(key)
            except ValueError:
                # if we don't have a MultiIndex, we may still be able to handle
                #  a 1-tuple.  see test_1tuple_without_multiindex
                if len(key) == 1:
                    key = key[0]
                    if isinstance(key, slice):
                        return self._get_values(key)
                raise

        if not isinstance(key, (list, np.ndarray, ExtensionArray, Series, Index)):
>           key = list(key)
E           TypeError: 'cftime._cftime.DatetimeNoLeap' object is not iterable

../../../pandas/pandas/core/series.py:911: TypeError
```

</p>
</details>

Admittedly I think most people probably use a CFTimeIndex within xarray data structures, but it would be nice to maintain some ability to use it in pandas data structures too.  This issue stems from the changes made in https://github.com/pandas-dev/pandas/pull/31399.  I think the problem is that `pandas.core.dtypes.common.is_scalar` returns `False` for a `cftime.datetime` object:

```
In [1]: import cftime

In [2]: import pandas

In [3]: pandas.core.dtypes.common.is_scalar(cftime.DatetimeNoLeap(2000, 1, 1))
Out[3]: False
```

Could there be a simple upstream fix for this?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-583366068,https://api.github.com/repos/pydata/xarray/issues/3751,583366068,MDEyOklzc3VlQ29tbWVudDU4MzM2NjA2OA==,6628425,2020-02-07T12:16:13Z,2020-02-07T12:16:13Z,MEMBER,"> I think the issue here is that other is a pandas.Index instead of a CFTimeIndex.

Yes, I noted that in my original post (sorry if that wasn't clear).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-583355144,https://api.github.com/repos/pydata/xarray/issues/3751,583355144,MDEyOklzc3VlQ29tbWVudDU4MzM1NTE0NA==,6628425,2020-02-07T11:40:44Z,2020-02-07T11:40:44Z,MEMBER,"> FWIW, I think @jbrockmendel is still progressing on an ""extension index"" interface where you could have a custom dtype / Index subclass that would be properly supported. Long-term, that's the best solution.

Nice -- I look forward to being able to try that out!

> any idea what `other` is here? looks like it might be a DatetimeIndex

@jbrockmendel agreed that it's unclear -- we probably should have written the code for that method in a clearer way.  I think it's mainly used for subtracting a single `datetime.timedelta` or NumPy array of `datetime.timedelta` objects from a `CFTimeIndex`.  In both of those cases we would expect the result to remain a `CFTimeIndex`.  

`cftime.datetime` objects often represent dates from non-standard calendars (e.g. calendars with no leap year, or calendars where all months have 30 days), so in general they are not compatible with the dates used in a DatetimeIndex.  Subtracting like-calendar `cftime.datetime` objects is fair game though, and we'd like that to produce timedeltas.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-582904457,https://api.github.com/repos/pydata/xarray/issues/3751,582904457,MDEyOklzc3VlQ29tbWVudDU4MjkwNDQ1Nw==,6628425,2020-02-06T13:25:47Z,2020-02-06T13:34:51Z,MEMBER,"Part of the hazard of using a `pd.Index` subclass I suppose...

It looks like https://github.com/pandas-dev/pandas/pull/31511 was the cause of the issue:
```
$ git bisect good
64336ff8414f8977ff94adb9a5bc000a3a4ef454 is the first bad commit
commit 64336ff8414f8977ff94adb9a5bc000a3a4ef454
Author: Kevin Anderson <57452607+kanderso-nrel@users.noreply.github.com>
Date:   Sun Feb 2 20:48:28 2020 -0700

    BUG: fix reindexing with a tz-aware index and method='nearest' (#31511)

 doc/source/whatsnew/v1.1.0.rst               |  2 +-
 pandas/core/indexes/base.py                  |  5 ++---
 pandas/tests/frame/indexing/test_indexing.py | 10 ++++++++++
 3 files changed, 13 insertions(+), 4 deletions(-)
```
A way to fix this upstream would be to make sure that `target` has the same type as the index (here it is a generic `pd.Index` instead of a `CFTimeIndex`), but I'm not sure how hard that would be (or if it makes sense in all cases):
https://github.com/pandas-dev/pandas/blob/a2a35a86c4064d297c8b48ecfea80e9f05e27712/pandas/core/indexes/base.py#L3080

I think it's possible we could work around this in xarray.  It comes down to properly recognizing what to do when you subtract a generic `pd.Index` of `cftime.datetime` objects from a `CFTimeIndex`.  Previously this code in pandas operated strictly using NumPy arrays, so there was no casting issue when doing the subtraction.

","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728