issue_comments: 454940009

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/pull/2593#issuecomment-454940009	https://api.github.com/repos/pydata/xarray/issues/2593	454940009	MDEyOklzc3VlQ29tbWVudDQ1NDk0MDAwOQ==	8708062	2019-01-16T21:02:18Z	2019-01-16T21:02:18Z	CONTRIBUTOR	Hi @spencerkclark, sorry it took so long to get back to you. I've implemented your simplified resampling logic. Some of it had to be altered because pandas has changed since then. It's great not having to distinguish between the upsampling and downsampling cases! I ran into some issues, though, and I thought an extra pair of eyes could help me diagnose them:
cftime : Not really important, but I cannot reproduce the results you obtained for cftime 1.0.3.4. I've tried Python 2.7 and 3.6, conda packages and building from source, a Windows machine and the Windows Ubuntu shell; the datetime arithmetic precision problem persists in every case. To work around this, I'm using `assert_allclose` with default tolerances in the tests, as suggested.
pandas : The pandas library refuses to resample certain indices and throws a "values falls before first bin" error. The error comes from `bins=lib.generate_bins_dt64(...)` around line 1400 of `pandas/core/resample.py` and is a direct consequence of the `_adjust_bin_edges` operation adding 1 extra day minus 1 nanosecond, which makes the first value of the sorted `bin_edges` larger than the first value of the sorted `ax_values`. My current workaround is to use `pytest.mark.xfail(raises=ValueError)`. CFTimeIndex resampling does not encounter the same error. Nevertheless, I've changed the CFTimeIndex resampling logic so that the first bin value does not have 1 day minus 1 microsecond added to it, to (hopefully) rectify the error. Testing against pandas resampling results does not show any difference between the corrected and uncorrected CFTimeIndex resampling code. 
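To make the bin-edge arithmetic described above concrete, here is a minimal sketch using only the standard library. It is not the pandas `_adjust_bin_edges` code; the helper names `shift_edges` and `first_value_precedes_first_bin`, the sample values, and the two-day bin spacing are all made up for illustration. It uses microseconds (the resolution of `datetime`) rather than nanoseconds.

```python
from datetime import datetime, timedelta

# Hypothetical illustration only; this is NOT pandas' _adjust_bin_edges.
# datetime has microsecond resolution, so "1 day minus 1 microsecond" is used.
ONE_DAY_MINUS_EPSILON = timedelta(days=1) - timedelta(microseconds=1)


def shift_edges(edges):
    """Move every bin edge to the last representable instant of its day."""
    return [edge + ONE_DAY_MINUS_EPSILON for edge in edges]


def first_value_precedes_first_bin(values, edges):
    """Return True when the earliest value sorts before the earliest bin edge."""
    return min(values) < min(edges)


# Daily values starting at noon, and made-up bin edges at midnight every two days.
values = [datetime(1892, 1, 1, 12) + timedelta(days=i) for i in range(3)]
edges = [datetime(1892, 1, 1) + timedelta(days=2 * i) for i in range(3)]
shifted = shift_edges(edges)

# Before the shift the first value lies inside the first bin;
# after the shift it falls before the first bin edge.
print(first_value_precedes_first_bin(values, edges))
print(first_value_precedes_first_bin(values, shifted))
```

With the edges shifted, the noon value on the first day sorts before the first edge, which is the kind of ordering that would make a binning routine reject the input.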
xarray : Ignoring the aforementioned issue with pandas, xarray resampling results for certain time ranges do not match pandas', specifically these two: `dict(start='1892-01-01T12:00:00', periods=15, freq='5256113T')`, labeled `XT`, and `dict(start='1892', periods=10, freq='6AS-JUN')`, labeled `6AS_JUN`. `XT` seems to cause the most trouble, which might be due to its rather challenging `freq` specification. Since I've rewritten `test_cftimeindex_resample.py` based on your gists, many more test cases are being generated. Without `XT` and `6AS_JUN`, the tests take about 40 minutes to run on my machine; including them bumps that up to 3 hours. The number of tests should be pared down before merging, but I think they're helpful right now for identifying problems. I've included test results in XML for you and other collaborators to compare against. One file contains the results with the 1 day minus 1 microsecond fix applied and the other the results without it. They can be imported into PyCharm, but I'm not sure whether they can be read any other way. Test Results - pytest_in_test_cftimeindex_resample_py.zip	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		387924616