html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3535#issuecomment-554648042,https://api.github.com/repos/pydata/xarray/issues/3535,554648042,MDEyOklzc3VlQ29tbWVudDU1NDY0ODA0Mg==,6628425,2019-11-16T15:39:23Z,2019-11-16T15:39:23Z,MEMBER,"Thanks for raising this issue @mathause. In hindsight this does not surprise me. Pandas's strict use of nanosecond-resolution datetimes and timedeltas was part of the motivation for the `CFTimeIndex`. While convenient, because it allows us to re-use code already written in pandas, holding the result of the difference between two CFTimeIndexes in a `TimedeltaIndex` clearly prevents us from taking the difference between distant dates. Perhaps a more robust (yet more complex) solution for https://github.com/pydata/xarray/issues/2484 would be to write a version of a `TimedeltaIndex` that does not internally cast the timedeltas to type `np.timedelta64[ns]`, and rather leaves them as `datetime.timedelta` objects, which are the actual result of subtracting two sequences of `cftime.datetime` objects.

Regarding the `combine_by_coords` issue, though, there might be an easier fix. Is there a reason that `first_items` is an `Index` of length-one Indexes? It's not clear to me why that needs to be the case.

https://github.com/pydata/xarray/blob/56c16e4bf45a3771fd9acba76d802c0199c14519/xarray/core/combine.py#L91

It appears if we just select the first value of each index (i.e. a `cftime.datetime` object in this example), e.g.
```python
first_items = pd.Index([index[0] for index in indexes])
```

pandas's `rank` method works properly and `combine_by_coords` produces the correct result:

```
>>> xr.combine_by_coords([d1, d2, d3]).time
array([cftime.DatetimeGregorian(4500, 12, 31, 0, 0, 0, 0, 4, 365),
       cftime.DatetimeGregorian(4600, 12, 31, 0, 0, 0, 0, 2, 365),
       cftime.DatetimeGregorian(5100, 12, 31, 0, 0, 0, 0, 0, 365)], dtype=object)
Coordinates:
  * time     (time) object 4500-12-31 00:00:00 ... 5100-12-31 00:00:00
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,523037716
https://github.com/pydata/xarray/issues/3535#issuecomment-554317768,https://api.github.com/repos/pydata/xarray/issues/3535,554317768,MDEyOklzc3VlQ29tbWVudDU1NDMxNzc2OA==,10194086,2019-11-15T11:05:44Z,2019-11-15T11:08:29Z,MEMBER,"This happens in `xr.combine_by_coords`. Note that the `OverflowError` is ""ignored in: 'pandas._libs.algos.are_diff'"". So `xr.combine_by_coords` can return a wrong dataset (although this does not happen silently):

``` python
import xarray as xr

i1 = xr.cftime_range(""4500-12-31"", periods=1)
i2 = xr.cftime_range(""4600-12-31"", periods=1)
i3 = xr.cftime_range(""5100-12-31"", periods=1)

d1 = xr.DataArray([0], dims=(""time"", ), coords={""time"": (""time"", i1)}).to_dataset(name=""a"")
d2 = xr.DataArray([1], dims=(""time"", ), coords={""time"": (""time"", i2)}).to_dataset(name=""a"")
d3 = xr.DataArray([2], dims=(""time"", ), coords={""time"": (""time"", i3)}).to_dataset(name=""a"")

xr.combine_by_coords([d1, d2, d3]).time
```

returns:

``` python
array([cftime.DatetimeGregorian(4500-12-31 00:00:00),
       cftime.DatetimeGregorian(5100-12-31 00:00:00)], dtype=object)
Coordinates:
  * time     (time) object 4500-12-31 00:00:00 5100-12-31 00:00:00
```

Note how `d2` is missing.
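For context on the overflow: a time difference stored as a signed 64-bit count of nanoseconds can span only about 292 years, and the dates in this example are up to 600 years apart. The following is a minimal sketch of that arithmetic, not a reproduction of the pandas internals. It assumes standard-library `datetime` objects as stand-ins for the `cftime` dates, and `fits_in_ns` is a hypothetical helper written for this illustration.

``` python
from datetime import datetime

# Largest span a signed 64-bit count of nanoseconds can hold, in whole days.
INT64_MAX = 2**63 - 1
NS_PER_DAY = 24 * 60 * 60 * 10**9
max_days = INT64_MAX // NS_PER_DAY  # roughly 292 years

# Assumption: plain datetimes stand in for the cftime dates used above.
t1 = datetime(4500, 12, 31)
t2 = datetime(4600, 12, 31)
t3 = datetime(5100, 12, 31)


def fits_in_ns(delta):
    # Hypothetical helper: True if the timedelta fits in int64 nanoseconds.
    return abs(delta.days) < max_days


print(max_days)
print(fits_in_ns(t2 - t1))  # 100-year gap
print(fits_in_ns(t3 - t1))  # 600-year gap
```

The 100-year gap fits within the nanosecond range while the 600-year gap does not, which is consistent with an overflow appearing only once the most distant dates are compared.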
--------

Within `xr.combine_by_coords` the error happens here:

https://github.com/pydata/xarray/blob/7b4a286f59bc7d60d4e4d03be65562ff63f9b111/xarray/core/combine.py#L98

``` python
import pandas as pd

indexes = [i1, i2, i3]
ascending = True  # assumed here: the example indexes are in increasing order

# the code from _infer_concat_order_from_coords
first_items = pd.Index([index.take([0]) for index in indexes])
series = first_items.to_series()
rank = series.rank(method=""dense"", ascending=ascending)
order = rank.astype(int).values - 1
order
>>> array([0, 1, 1])
```

This causes the second item to be dropped. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,523037716