html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/2593#issuecomment-459998025,https://api.github.com/repos/pydata/xarray/issues/2593,459998025,MDEyOklzc3VlQ29tbWVudDQ1OTk5ODAyNQ==,6628425,2019-02-02T20:49:30Z,2019-02-02T20:49:30Z,MEMBER,"Sounds good @shoyer, thanks for bringing this to the finish line.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-456513616,https://api.github.com/repos/pydata/xarray/issues/2593,456513616,MDEyOklzc3VlQ29tbWVudDQ1NjUxMzYxNg==,6628425,2019-01-22T18:39:37Z,2019-01-22T18:39:37Z,MEMBER,"One more optional thing -- support for the `loffset` keyword was recently added for DatetimeIndex resampling (https://github.com/pydata/xarray/pull/2608). This just involves a simple adjustment to the bin labels (the index of the `first_items` Series). You could implement that here, or for now just raise a `NotImplementedError` for resampling with a CFTimeIndex if `loffset` is not `None`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-455790039,https://api.github.com/repos/pydata/xarray/issues/2593,455790039,MDEyOklzc3VlQ29tbWVudDQ1NTc5MDAzOQ==,6628425,2019-01-19T15:33:43Z,2019-01-19T15:38:08Z,MEMBER,"> 5808 of 5920 tests now pass and the remaining 112 are ignored due to ValueError: ""value falls before first bin"".
That's great news!
> I think writing targeted unit tests is the last thing on the agenda, so I'll get right on that.
Before spending too much time on that just yet, see if you can resolve the merge conflicts, and think about a way to reduce the length of the existing tests. It would be helpful to see a coverage report generated by [coveralls](https://coveralls.io/github/pydata/xarray) for the new logic you've added (if you resolve the merge conflicts, our CI will run here and we'll be able to see that automatically). Maybe start by commenting out a bunch of the really long tests and see where things stand?
Then we can think about how to add coverage back in as needed.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-455662721,https://api.github.com/repos/pydata/xarray/issues/2593,455662721,MDEyOklzc3VlQ29tbWVudDQ1NTY2MjcyMQ==,6628425,2019-01-18T19:35:14Z,2019-01-18T19:35:14Z,MEMBER,@jwenfai I created an issue in cftime regarding this question: Unidata/cftime#109.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-455619034,https://api.github.com/repos/pydata/xarray/issues/2593,455619034,MDEyOklzc3VlQ29tbWVudDQ1NTYxOTAzNA==,6628425,2019-01-18T17:09:26Z,2019-01-18T18:23:19Z,MEMBER,"> The first and last values are returned by _adjust_bin_anchored when isinstance(offset, CFTIME_TICKS). Since date subtraction happens within _adjust_bin_anchored, some test cases have imprecise first and last values.
Ah indeed; this makes sense now. Maybe we should bring this up in cftime to see what their recommendation might be? I could imagine writing a function like this that would correct for this imprecision when taking the difference between two dates:
```python
from datetime import timedelta
def exact_cftime_datetime_difference(a, b):
""""""Exact computation of b - a""""""
seconds = b.replace(microsecond=0) - a.replace(microsecond=0)
seconds = int(round(seconds.total_seconds()))
microseconds = b.microsecond - a.microsecond
return timedelta(seconds=seconds, microseconds=microseconds)
```
Here are a couple test cases:
```python
import cftime
from datetime import datetime
# Testing with cftime version 1.0.0, where I can
# reproduce the precision issues.
test_cases = [
    [(2000, 1, 1, 0, 4, 0, 956321), (1892, 1, 3, 12, 0, 0, 112123)],
    [(2000, 1, 1, 0, 4, 0, 1), (1892, 1, 3, 12, 0, 0, 503432)],
    [(2000, 1, 1, 0, 4, 0, 999999), (1892, 1, 3, 12, 0, 0, 112123)],
    [(2000, 1, 1, 0, 4, 0, 11213), (1892, 1, 3, 12, 0, 0, 77777)],
]
for a_args, b_args in test_cases:
    a_cftime = cftime.DatetimeGregorian(*a_args)
    b_cftime = cftime.DatetimeGregorian(*b_args)
    a = datetime(*a_args)
    b = datetime(*b_args)
    expected = b - a
    result = exact_cftime_datetime_difference(a_cftime, b_cftime)
    assert result == expected
    inexact = b_cftime - a_cftime
    assert inexact != expected
    # Test the other direction
    expected = a - b
    result = exact_cftime_datetime_difference(b_cftime, a_cftime)
    assert result == expected
```
But maybe I'm missing something important.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-455540080,https://api.github.com/repos/pydata/xarray/issues/2593,455540080,MDEyOklzc3VlQ29tbWVudDQ1NTU0MDA4MA==,6628425,2019-01-18T13:07:59Z,2019-01-18T13:07:59Z,MEMBER,"@jwenfai to provide some more detail for my confusion here -- my impression is that adding or subtracting a `datetime.timedelta` object from a `cftime.datetime` object to produce another `cftime.datetime` object should always be microsecond-exact. [See comments in cftime here](https://github.com/Unidata/cftime/blob/ca6ddce6fbfddfb769de6e79947a4af4b471fc93/cftime/_cftime.pyx#L1644-L1657). This is how a CFTimeIndex is constructed through `cftime_range`, so at least naively I would not anticipate any precision issues in constructing the bins.
Taking the difference between two dates to produce a timedelta object takes [a different code path](https://github.com/Unidata/cftime/blob/master/cftime/_cftime.pyx#L1350) in cftime, which is not microsecond-precise. This, as we've seen, can induce some small errors in interpolation (because in the process we need to determine the amount of time between each date in the time coordinate and a reference date).
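To make the distinction concrete, here's a small illustration (my own sketch, assuming the cftime version discussed in this thread, ~1.0.0):
```python
import cftime
from datetime import timedelta

# Adding a timedelta to a cftime.datetime takes the microsecond-exact path:
start = cftime.DatetimeGregorian(2000, 1, 1, 0, 0, 0, 1)
shifted = start + timedelta(days=400, microseconds=5)
assert shifted == cftime.DatetimeGregorian(2001, 2, 4, 0, 0, 0, 6)

# Subtracting two dates takes the float-based path, so the microseconds of
# the result may be slightly off:
inexact = shifted - start  # may differ from timedelta(days=400, microseconds=5)
```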
> Of the ones I've inspected, the resampled cftime array always has 1 more bin than pandas
Could you provide more details about this example? What were the resample parameters used (e.g. `freq`, `closed`, `label`, `base`)? Is the extra bin added to the beginning or end of the time range? If you convert to a DatetimeIndex, do all the other bins match exactly, or is there some error?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-455390725,https://api.github.com/repos/pydata/xarray/issues/2593,455390725,MDEyOklzc3VlQ29tbWVudDQ1NTM5MDcyNQ==,6628425,2019-01-18T01:09:59Z,2019-01-18T01:34:59Z,MEMBER,"> So I keep pandas' logic (the first bin has 1 day and 1 microsecond added to it) and raise a ValueError when either index[0] < datetime_bins[0] or index[lenidx - 1] > datetime_bins[lenbin - 1].
Exactly 👍
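For the record, that check might look something like this (a minimal sketch with hypothetical variable names, following your description):
```python
# index: the CFTimeIndex being resampled; datetime_bins: the bin edges.
if index[0] < datetime_bins[0] or index[-1] > datetime_bins[-1]:
    raise ValueError('Value falls before first bin or after last bin')
```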
> I'm testing against the dev version, 11 commits behind. Could the errors for XT that I get but you don't be due to a cftime/linear algebra library issue? There may be enough error accumulated for hourly frequencies over 140 years that cftime_range generates an extra bin compared to pandas date_range (I haven't checked all of them manually, but I believe the majority of the non-""values falls before first bin"" errors are due to extra bin(s)). 6AS-JUN only has 8 failed tests, all due to ""x and y nan location mismatch"".
~~I think the ""values falls before first bin"" errors are all from pandas, where datetime arithmetic is exact, so they could not be due to cftime, right?~~ I'll take a look at the 6AS-JUN tests.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-455393260,https://api.github.com/repos/pydata/xarray/issues/2593,455393260,MDEyOklzc3VlQ29tbWVudDQ1NTM5MzI2MA==,6628425,2019-01-18T01:23:20Z,2019-01-18T01:23:20Z,MEMBER,"> Oh no, I meant that except for all the ""values falls before first bin"" errors, most (if not all) of the errors are due to shape mismatch between the resampled cftime and pandas arrays. Of the ones I've inspected, the resampled cftime array always has 1 more bin than pandas
Oops, my bad, I should have read more carefully! I'll think about this more.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-455391923,https://api.github.com/repos/pydata/xarray/issues/2593,455391923,MDEyOklzc3VlQ29tbWVudDQ1NTM5MTkyMw==,6628425,2019-01-18T01:16:21Z,2019-01-18T01:17:10Z,MEMBER,"> We probably don't. I forget the reason, but early on in development, resampling tests failed for some time ranges when using purely odd frequencies while others failed with purely even ones. Resampling tests for 12H/24H frequencies might not be needed now that `_adjust_bin_edges` is being used.
Yeah this has for sure been helpful for development -- I did not expect to encounter the ""values falls before first bin"" error, but clearly it seems we do need to worry about it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-450740815,https://api.github.com/repos/pydata/xarray/issues/2593,450740815,MDEyOklzc3VlQ29tbWVudDQ1MDc0MDgxNQ==,6628425,2019-01-01T16:18:40Z,2019-01-01T16:18:40Z,MEMBER,"> I'm on a break right now and I'll look more closely at the alternative solution when I'm back, but from what I've read in your comment the solution makes sense.
No worries! I just wanted to get these thoughts down when I had some time. No rush to make any updates.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-450533035,https://api.github.com/repos/pydata/xarray/issues/2593,450533035,MDEyOklzc3VlQ29tbWVudDQ1MDUzMzAzNQ==,6628425,2018-12-30T01:31:12Z,2018-12-30T01:31:12Z,MEMBER,"So the crux of the problem now seems to be in generating `first_items`; this is the Series that is used for both upsampling and downsampling a DataArray in xarray. For data indexed by a DatetimeIndex, it is straightforward to generate this Series (it just takes the construction of a simple Series with `np.arange` and the reference index, the construction of a `pandas.Grouper` object, and a call to `groupby` with the method `first`). In xarray, downsampling uses both the values (to define the groups) and index (to define the labels) of this Series, while upsampling only uses the index.
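For concreteness, here's a minimal sketch (my own, not taken from the PR) of how that Series can be built for a DatetimeIndex:
```python
import numpy as np
import pandas as pd

# Reference index and a Series of positional values over it.
index = pd.date_range('2000-01-01', periods=10, freq='D')
series = pd.Series(np.arange(index.size), index=index)

# Group by the resample frequency and take the first entry of each group.
first_items = series.groupby(pd.Grouper(freq='3D')).first()
# first_items.values locate the first element of each group in the original
# index; first_items.index carries the resample bin labels.
```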
For data indexed by a CFTimeIndex, we do not have the luxury of a formal `Grouper` object; however, if we can create this `first_items` Series accurately, I think all other results of resample in xarray should follow.
I've put together a [gist](https://gist.github.com/spencerkclark/27ef5254434fc5c90e2ebbdd975930b9#file-test_first_items_current-py) which compares the `first_items` Series generated by pandas with the one generated by the cftime logic (the [output](https://gist.github.com/spencerkclark/27ef5254434fc5c90e2ebbdd975930b9#file-test_first_items_current-out) of running the tests is also included). I've tried to use a fairly challenging set of initial time indexes as well as resample frequencies (different from what is currently used in the tests); there appear to be many mismatches under the ""upsampling"" case, but also a few errors show up in the ""downsampling"" case (to some extent I think these are related to the omission of the `_adjust_bin_edges` method, which it turns out we may need after all). In theory though, because of how this `first_items` Series is created in the DatetimeIndex case, I don't think the way we create it in the CFTimeIndex case should depend on whether the length of the reference index is greater than or less than the length of the resample labels (upsampling or downsampling is determined instead by the resampling method used).
This inspired the alternative solution proposed in the [second part of the gist](https://gist.github.com/spencerkclark/27ef5254434fc5c90e2ebbdd975930b9#file-test_first_items_simplified-py) (I've also added back in a call to a cftime version of the `_adjust_bin_edges` method); in this case there is no dependence on the relative lengths of the reference index and resample labels, and all of the test cases I've tried so far pass.
Let me know if this alternative solution makes sense. Digging into the guts of the resample code in pandas/xarray is still fairly new for me too, so I could be missing something. In the gist I'm using this branch of xarray, the development version of pandas, and the latest version of cftime. Thanks again for your hard work on this!","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-449635856,https://api.github.com/repos/pydata/xarray/issues/2593,449635856,MDEyOklzc3VlQ29tbWVudDQ0OTYzNTg1Ng==,6628425,2018-12-23T13:16:41Z,2018-12-23T13:16:41Z,MEMBER,"> If pandas master with the altered resampling logic will be the definitive version going forward, should development of CFTimeIndex resampling be suspended until this version of pandas master is released and xarray uses it as a dependency?
I think it is OK to continue working here, with the aim of implementing resample for cftime in a way that matches the most up-to-date pandas behavior. We can add code to skip tests that depend on this behavior if the version of pandas is not recent enough (see an example of doing this [here](https://github.com/pydata/xarray/blob/2223445905705162053756323dfd6e6f4527b270/xarray/tests/test_dask.py#L848-L849)). Since we're working before version 0.24 of pandas is released, we can keep an eye on our ""py36-pandas-dev"" CI build listed under the ""Allowed Failures"" section in Travis.
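A version guard along those lines might look like this (a sketch modeled on the linked test, not the exact code):
```python
import pytest
import pandas as pd
from distutils.version import LooseVersion

# Skip tests that rely on the post-0.24 resampling behavior when running
# against an older pandas release.
requires_recent_pandas = pytest.mark.skipif(
    LooseVersion(pd.__version__) < LooseVersion('0.24'),
    reason='requires pandas 0.24 or newer',
)
```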
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-449534277,https://api.github.com/repos/pydata/xarray/issues/2593,449534277,MDEyOklzc3VlQ29tbWVudDQ0OTUzNDI3Nw==,6628425,2018-12-22T01:19:42Z,2018-12-22T01:19:42Z,MEMBER,"So I think the way you have things written now in `_get_time_bins` makes sense, though this subtle change in behavior on the pandas side makes testing against different releases of pandas a little trickier.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-449533135,https://api.github.com/repos/pydata/xarray/issues/2593,449533135,MDEyOklzc3VlQ29tbWVudDQ0OTUzMzEzNQ==,6628425,2018-12-22T01:04:32Z,2018-12-22T01:04:32Z,MEMBER,"Yes, I noticed this too. I think it's related to changes made here: https://github.com/pandas-dev/pandas/pull/24347. At least in the test cases that I've run, I've only seen it make a difference in the NaN placement at the end of the time series.
For example with pandas 0.23.4:
```
In [1]: import xarray as xr; import pandas as pd
In [2]: nptimes = pd.date_range('2000', periods=2000)
In [3]: nptime_da = xr.DataArray(range(2000), [('time', nptimes)])
In [4]: nptime_da.resample(time='4AS').mean('time')
Out[4]:
<xarray.DataArray (time: 3)>
array([ 730., 1730.,   nan])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01 2004-01-01 2008-01-01
```
and with pandas master:
```
In [4]: nptime_da.resample(time='4AS').mean('time')
Out[4]:
<xarray.DataArray (time: 2)>
array([ 730., 1730.])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01 2004-01-01
```
I feel like the result under pandas master makes more sense, given the input array, whose final date is 2005-06-22. Adding an additional bin labeled 2008-01-01 to the resampled time series seems superfluous given that with the default `label='left'` it's clear no data from the original time series would fall in that bin.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616
https://github.com/pydata/xarray/pull/2593#issuecomment-448443826,https://api.github.com/repos/pydata/xarray/issues/2593,448443826,MDEyOklzc3VlQ29tbWVudDQ0ODQ0MzgyNg==,6628425,2018-12-19T02:11:03Z,2018-12-19T02:11:03Z,MEMBER,"@jwenfai thanks for the updates. It looks like there are some merge conflicts that are preventing our CI from running. Could you please resolve those when you get a chance, so we can see those results?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,387924616