html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3751#issuecomment-602221579,https://api.github.com/repos/pydata/xarray/issues/3751,602221579,MDEyOklzc3VlQ29tbWVudDYwMjIyMTU3OQ==,6628425,2020-03-22T15:01:35Z,2020-03-22T15:01:35Z,MEMBER,Thanks @jbrockmendel. I didn't realize you had a few downstream tests; that's great. See https://github.com/pandas-dev/pandas/pull/32905.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-602053723,https://api.github.com/repos/pydata/xarray/issues/3751,602053723,MDEyOklzc3VlQ29tbWVudDYwMjA1MzcyMw==,6628425,2020-03-21T14:35:11Z,2020-03-21T14:35:33Z,MEMBER,"Thanks @jbrockmendel; it's great to see that pandas-dev/pandas#32684 was merged. Regarding #3764, I gave things a try with pandas master, removing our overrides of `_get_nearest_indexer` and `_filter_indexer_tolerance`. I got failures similar to what we were getting before.
Example failure

```
_______________________________ test_sel_date_scalar_nearest[proleptic_gregorian-sel_kwargs2] _______________________________

da = array([1, 2, 3, 4])
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-02-01 00:00:00
date_type = 
index = CFTimeIndex([0001-01-01 00:00:00, 0001-02-01 00:00:00, 0002-01-01 00:00:00, 0002-02-01 00:00:00], dtype='object')
sel_kwargs = {'method': 'nearest', 'tolerance': datetime.timedelta(days=1800000)}

    @requires_cftime
    @pytest.mark.parametrize(
        ""sel_kwargs"",
        [
            {""method"": ""nearest""},
            {""method"": ""nearest"", ""tolerance"": timedelta(days=70)},
            {""method"": ""nearest"", ""tolerance"": timedelta(days=1800000)},
        ],
    )
    def test_sel_date_scalar_nearest(da, date_type, index, sel_kwargs):
        expected = xr.DataArray(2).assign_coords(time=index[1])
>       result = da.sel(time=date_type(1, 4, 1), **sel_kwargs)

test_cftimeindex.py:471:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../core/dataarray.py:1061: in sel
    **indexers_kwargs,
../core/dataset.py:2066: in sel
    self, indexers=indexers, method=method, tolerance=tolerance
../core/coordinates.py:392: in remap_label_indexers
    obj, v_indexers, method=method, tolerance=tolerance
../core/indexing.py:270: in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
../core/indexing.py:190: in convert_label_indexer
    label.item(), method=method, tolerance=tolerance
../coding/cftimeindex.py:365: in get_loc
    return pd.Index.get_loc(self, key, method=method, tolerance=tolerance)
../../../pandas/pandas/core/indexes/base.py:2874: in get_loc
    indexer = self.get_indexer([key], method=method, tolerance=tolerance)
../../../pandas/pandas/core/indexes/base.py:2967: in get_indexer
    indexer = self._get_nearest_indexer(target, limit, tolerance)
../../../pandas/pandas/core/indexes/base.py:3062: in _get_nearest_indexer
    indexer = self._filter_indexer_tolerance(target, indexer, tolerance)
../../../pandas/pandas/core/indexes/base.py:3069: in _filter_indexer_tolerance
    indexer = np.where(distance <= tolerance, indexer, -1)
../../../pandas/pandas/core/indexes/extension.py:129: in wrapper
    return op(other)
../../../pandas/pandas/core/ops/common.py:63: in new_method
    return method(self, other)
../../../pandas/pandas/core/arrays/datetimelike.py:75: in wrapper
    other = self._scalar_type(other)
pandas/_libs/tslibs/timedeltas.pyx:1233: in pandas._libs.tslibs.timedeltas.Timedelta.__new__
    ???
pandas/_libs/tslibs/timedeltas.pyx:209: in pandas._libs.tslibs.timedeltas.convert_to_timedelta64
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>   ???
E   OverflowError: Python int too large to convert to C long

pandas/_libs/tslibs/timedeltas.pyx:154: OverflowError
```

In my testing, I can only get things to work if the argument to `abs` (or `np.abs`) is a NumPy array (instead of an Index). An upstream change [like this](https://github.com/pandas-dev/pandas/compare/master...spencerkclark:cftime-nearest-fix) would work (it still maintains the behavior desired from pandas-dev/pandas#31511), but I'm not sure how palatable it would be.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-599231986,https://api.github.com/repos/pydata/xarray/issues/3751,599231986,MDEyOklzc3VlQ29tbWVudDU5OTIzMTk4Ng==,6628425,2020-03-15T16:24:07Z,2020-03-15T16:24:07Z,MEMBER,"Thanks so much @jbrockmendel for looking into the `__getitem__` issue upstream. That should be the last of the issues that remain from this thread. As you probably noticed, we ended up merging #3764, which fixed indexing with the ""nearest"" method. Once pandas-dev/pandas#32684 is merged, we should be able to un-xfail the Series `__getitem__` tests on our end.","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 1, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-597888734,https://api.github.com/repos/pydata/xarray/issues/3751,597888734,MDEyOklzc3VlQ29tbWVudDU5Nzg4ODczNA==,6628425,2020-03-11T21:32:53Z,2020-03-11T21:32:53Z,MEMBER,Thanks @jbrockmendel -- I'll try to do that this weekend.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-592603121,https://api.github.com/repos/pydata/xarray/issues/3751,592603121,MDEyOklzc3VlQ29tbWVudDU5MjYwMzEyMQ==,6628425,2020-02-28T16:56:25Z,2020-02-28T16:56:25Z,MEMBER,"Thanks @jbrockmendel -- I think there are two separate issues:

- Indexing with `__getitem__` in a Series would be solved by the `is_scalar` fix.
- Indexing with the ""nearest"" method would require reverting pandas-dev/pandas#31511 for non-Datetime-or-Period-Indexes.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-592487396,https://api.github.com/repos/pydata/xarray/issues/3751,592487396,MDEyOklzc3VlQ29tbWVudDU5MjQ4NzM5Ng==,6628425,2020-02-28T12:13:27Z,2020-02-28T12:13:27Z,MEMBER,"@jbrockmendel @TomAugspurger it turns out that fixing indexing with the ""nearest"" method without overriding private methods of `pandas.Index` is harder than I expected within xarray alone (see https://github.com/pydata/xarray/pull/3764#issuecomment-586597512 for more details). Do you think an upstream fix would be acceptable here?
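
For concreteness, the fix on either side essentially amounts to doing the tolerance comparison on plain NumPy arrays rather than on a `pd.Index`. A rough sketch of the idea (a hypothetical helper for illustration, not the actual xarray or pandas code):

```
import numpy as np

def filter_indexer_tolerance(index, target, indexer, tolerance):
    # Coerce both sides to plain object ndarrays so that subtracting cftime
    # dates yields datetime.timedelta objects directly, instead of being
    # routed through pandas' Timedelta machinery (which overflows for very
    # large tolerances).
    distance = abs(np.asarray(index) - np.asarray(target))
    return np.where(distance <= tolerance, indexer, -1)
```
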
My understanding is that the issue that prompted the change in pandas only pertained to DatetimeIndexes (or perhaps also PeriodIndexes in the future); would it be possible to limit the scope of the updates in pandas-dev/pandas#31511 to just those?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-587438020,https://api.github.com/repos/pydata/xarray/issues/3751,587438020,MDEyOklzc3VlQ29tbWVudDU4NzQzODAyMA==,6628425,2020-02-18T12:32:40Z,2020-02-18T12:32:40Z,MEMBER,"Another kind of failure came up in the context of indexing a Series with a `cftime.datetime` object:
Example failure

```
____________________________ test_indexing_in_series_getitem[365_day] _____________________________

series = 0001-01-01 00:00:00    1
0001-02-01 00:00:00    2
0002-01-01 00:00:00    3
0002-02-01 00:00:00    4
dtype: int64
index = CFTimeIndex([0001-01-01 00:00:00, 0001-02-01 00:00:00, 0002-01-01 00:00:00, 0002-02-01 00:00:00], dtype='object')
scalar_args = [cftime.DatetimeNoLeap(0001-01-01 00:00:00)]
range_args = ['0001', slice('0001-01-01', '0001-12-30', None), slice(None, '0001-12-30', None), slice(cftime.DatetimeNoLeap(0001-01...:00), cftime.DatetimeNoLeap(0001-12-30 00:00:00), None), slice(None, cftime.DatetimeNoLeap(0001-12-30 00:00:00), None)]

    @requires_cftime
    def test_indexing_in_series_getitem(series, index, scalar_args, range_args):
        for arg in scalar_args:
>           assert series[arg] == 1

test_cftimeindex.py:597:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../pandas/pandas/core/series.py:884: in __getitem__
    return self._get_with(key)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = 0001-01-01 00:00:00    1
0001-02-01 00:00:00    2
0002-01-01 00:00:00    3
0002-02-01 00:00:00    4
dtype: int64
key = cftime.DatetimeNoLeap(0001-01-01 00:00:00)

    def _get_with(self, key):
        # other: fancy integer or otherwise
        if isinstance(key, slice):
            # _convert_slice_indexer to determing if this slice is positional
            # or label based, and if the latter, convert to positional
            slobj = self.index._convert_slice_indexer(key, kind=""getitem"")
            return self._slice(slobj)
        elif isinstance(key, ABCDataFrame):
            raise TypeError(
                ""Indexing a Series with DataFrame is not ""
                ""supported, use the appropriate DataFrame column""
            )
        elif isinstance(key, tuple):
            try:
                return self._get_values_tuple(key)
            except ValueError:
                # if we don't have a MultiIndex, we may still be able to handle
                # a 1-tuple. see test_1tuple_without_multiindex
                if len(key) == 1:
                    key = key[0]
                    if isinstance(key, slice):
                        return self._get_values(key)
                raise

        if not isinstance(key, (list, np.ndarray, ExtensionArray, Series, Index)):
>           key = list(key)
E           TypeError: 'cftime._cftime.DatetimeNoLeap' object is not iterable

../../../pandas/pandas/core/series.py:911: TypeError
```

Admittedly I think most people probably use a CFTimeIndex within xarray data structures, but it would be nice to maintain some ability to use it in pandas data structures too. This issue stems from the changes made in https://github.com/pandas-dev/pandas/pull/31399. I think the problem is that `pandas.core.dtypes.common.is_scalar` returns `False` for a `cftime.datetime` object:

```
In [1]: import cftime

In [2]: import pandas

In [3]: pandas.core.dtypes.common.is_scalar(cftime.DatetimeNoLeap(2000, 1, 1))
Out[3]: False
```

Could there be a simple upstream fix for this?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-583367035,https://api.github.com/repos/pydata/xarray/issues/3751,583367035,MDEyOklzc3VlQ29tbWVudDU4MzM2NzAzNQ==,14808389,2020-02-07T12:19:25Z,2020-02-07T12:19:25Z,MEMBER,"no, that's my bad, it is pretty clear but I seem to have skipped over it","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-583366068,https://api.github.com/repos/pydata/xarray/issues/3751,583366068,MDEyOklzc3VlQ29tbWVudDU4MzM2NjA2OA==,6628425,2020-02-07T12:16:13Z,2020-02-07T12:16:13Z,MEMBER,"> I think the issue here is that other is a pandas.Index instead of a CFTimeIndex.

Yes, I noted that in my original post (sorry if that wasn't clear).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-583364958,https://api.github.com/repos/pydata/xarray/issues/3751,583364958,MDEyOklzc3VlQ29tbWVudDU4MzM2NDk1OA==,14808389,2020-02-07T12:12:33Z,2020-02-07T12:12:33Z,MEMBER,"when the tests fail, `other` is a `pandas.Index` containing whatever has been used to index. The subtraction results in a normal `TimedeltaIndex` which is then passed to the `CFTimeIndex`:

```
np.array(self) = array([cftime.DatetimeNoLeap(0001-02-01 00:00:00)], dtype=object)
other = Index([0001-05-01 00:00:00], dtype='object')
other[0] = cftime.DatetimeNoLeap(0001-05-01 00:00:00)
np.array(self) - other = TimedeltaIndex(['-89 days'], dtype='timedelta64[ns]', freq=None)
```

I think the issue here is that `other` is a `pandas.Index` instead of a `CFTimeIndex`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-583355144,https://api.github.com/repos/pydata/xarray/issues/3751,583355144,MDEyOklzc3VlQ29tbWVudDU4MzM1NTE0NA==,6628425,2020-02-07T11:40:44Z,2020-02-07T11:40:44Z,MEMBER,"> FWIW, I think @jbrockmendel is still progressing on an ""extension index"" interface where you could have a custom dtype / Index subclass that would be properly supported. Long-term, that's the best solution.

Nice -- I look forward to being able to try that out!

> any idea what `other` is here? looks like it might be a DatetimeIndex

@jbrockmendel agreed that it's unclear -- we probably should have written the code for that method in a clearer way. I think it's mainly used for subtracting a single `datetime.timedelta` or NumPy array of `datetime.timedelta` objects from a `CFTimeIndex`. In both of those cases we would expect the result to remain a `CFTimeIndex`.
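
To make that concrete, here is roughly the behaviour we want to keep (an illustrative sketch, assuming a recent xarray with cftime installed):

```
import datetime
import xarray as xr

# A small CFTimeIndex on the 'noleap' calendar.
index = xr.cftime_range('0001-01-01', periods=3, freq='MS', calendar='noleap')

# Subtracting a timedelta should give back a CFTimeIndex ...
shifted = index - datetime.timedelta(days=1)   # expected: CFTimeIndex

# ... while subtracting a like-calendar cftime date should give timedeltas.
deltas = index - index[0]                      # expected: TimedeltaIndex
```
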
`cftime.datetime` objects often represent dates from non-standard calendars (e.g. calendars with no leap year, or calendars where all months have 30 days), so in general they are not compatible with the dates used in a DatetimeIndex. Subtracting like-calendar `cftime.datetime` objects is fair game though, and we'd like that to produce timedeltas.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-582972083,https://api.github.com/repos/pydata/xarray/issues/3751,582972083,MDEyOklzc3VlQ29tbWVudDU4Mjk3MjA4Mw==,1312546,2020-02-06T15:55:30Z,2020-02-06T15:55:30Z,MEMBER,"FWIW, I think @jbrockmendel is still progressing on an ""extension index"" interface where you could have a custom dtype / Index subclass that would be properly supported. Long-term, that's the best solution. Short-term, I'm less sure what's best.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-582957727,https://api.github.com/repos/pydata/xarray/issues/3751,582957727,MDEyOklzc3VlQ29tbWVudDU4Mjk1NzcyNw==,2448579,2020-02-06T15:26:27Z,2020-02-06T15:26:27Z,MEMBER,Thanks for narrowing it down @spencerkclark . Let's see what @TomAugspurger thinks about an xarray workaround vs a pandas fix.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
https://github.com/pydata/xarray/issues/3751#issuecomment-582904457,https://api.github.com/repos/pydata/xarray/issues/3751,582904457,MDEyOklzc3VlQ29tbWVudDU4MjkwNDQ1Nw==,6628425,2020-02-06T13:25:47Z,2020-02-06T13:34:51Z,MEMBER,"Part of the hazard of using a `pd.Index` subclass I suppose... It looks like https://github.com/pandas-dev/pandas/pull/31511 was the cause of the issue:

```
$ git bisect good
64336ff8414f8977ff94adb9a5bc000a3a4ef454 is the first bad commit
commit 64336ff8414f8977ff94adb9a5bc000a3a4ef454
Author: Kevin Anderson <57452607+kanderso-nrel@users.noreply.github.com>
Date:   Sun Feb 2 20:48:28 2020 -0700

    BUG: fix reindexing with a tz-aware index and method='nearest' (#31511)

 doc/source/whatsnew/v1.1.0.rst               |  2 +-
 pandas/core/indexes/base.py                  |  5 ++---
 pandas/tests/frame/indexing/test_indexing.py | 10 ++++++++++
 3 files changed, 13 insertions(+), 4 deletions(-)
```

A way to fix this upstream would be to make sure that `target` has the same type as the index (here it is a generic `pd.Index` instead of a `CFTimeIndex`), but I'm not sure how hard that would be (or if it makes sense in all cases): https://github.com/pandas-dev/pandas/blob/a2a35a86c4064d297c8b48ecfea80e9f05e27712/pandas/core/indexes/base.py#L3080

I think it's possible we could work around this in xarray. It comes down to properly recognizing what to do when you subtract a generic `pd.Index` of `cftime.datetime` objects from a `CFTimeIndex`. Previously this code in pandas operated strictly using NumPy arrays, so there was no casting issue when doing the subtraction.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,559873728
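
The casting difference described in the last comment can be illustrated roughly as follows (a hypothetical sketch mirroring the debug output quoted earlier in the thread; it assumes cftime, xarray, and a pandas of that era are installed):

```
import numpy as np
import pandas as pd
import xarray as xr

# The same cftime dates as a CFTimeIndex and as a generic object-dtype
# pd.Index -- the latter is what pandas now passes to _get_nearest_indexer.
cf_index = xr.cftime_range('0001-01-01', periods=3, calendar='noleap')
plain_index = pd.Index(np.asarray(cf_index))

# Subtracting the generic Index goes through pandas' Index arithmetic, so the
# result is coerced to a nanosecond-precision TimedeltaIndex; comparing that
# against a very large datetime.timedelta tolerance is what overflows ...
as_index = np.asarray(cf_index) - plain_index

# ... whereas subtracting plain NumPy arrays keeps datetime.timedelta objects.
as_array = np.asarray(cf_index) - np.asarray(plain_index)
```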