issue_comments

10 rows where issue = 559873728 and user = 6628425 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
602221579 https://github.com/pydata/xarray/issues/3751#issuecomment-602221579 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDYwMjIyMTU3OQ== spencerkclark 6628425 2020-03-22T15:01:35Z 2020-03-22T15:01:35Z MEMBER

Thanks @jbrockmendel. I didn't realize you had a few downstream tests; that's great. See https://github.com/pandas-dev/pandas/pull/32905.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  more upstream-dev cftime failures 559873728
602053723 https://github.com/pydata/xarray/issues/3751#issuecomment-602053723 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDYwMjA1MzcyMw== spencerkclark 6628425 2020-03-21T14:35:11Z 2020-03-21T14:35:33Z MEMBER

Thanks @jbrockmendel; it's great to see that pandas-dev/pandas#32684 was merged.

Regarding #3764, I gave things a try with pandas master, removing our overrides of `_get_nearest_indexer` and `_filter_indexer_tolerance`. I got failures similar to those we were seeing before.

Example failure

```
_______________________________ test_sel_date_scalar_nearest[proleptic_gregorian-sel_kwargs2] _______________________________

da = <xarray.DataArray (time: 4)>
array([1, 2, 3, 4])
Coordinates:
  * time     (time) object 0001-01-01 00:00:00 ... 0002-02-01 00:00:00
date_type = <class 'cftime._cftime.DatetimeProlepticGregorian'>
index = CFTimeIndex([0001-01-01 00:00:00, 0001-02-01 00:00:00, 0002-01-01 00:00:00, 0002-02-01 00:00:00], dtype='object')
sel_kwargs = {'method': 'nearest', 'tolerance': datetime.timedelta(days=1800000)}

    @requires_cftime
    @pytest.mark.parametrize(
        "sel_kwargs",
        [
            {"method": "nearest"},
            {"method": "nearest", "tolerance": timedelta(days=70)},
            {"method": "nearest", "tolerance": timedelta(days=1800000)},
        ],
    )
    def test_sel_date_scalar_nearest(da, date_type, index, sel_kwargs):
        expected = xr.DataArray(2).assign_coords(time=index[1])
>       result = da.sel(time=date_type(1, 4, 1), **sel_kwargs)

test_cftimeindex.py:471:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../core/dataarray.py:1061: in sel
    **indexers_kwargs,
../core/dataset.py:2066: in sel
    self, indexers=indexers, method=method, tolerance=tolerance
../core/coordinates.py:392: in remap_label_indexers
    obj, v_indexers, method=method, tolerance=tolerance
../core/indexing.py:270: in remap_label_indexers
    idxr, new_idx = convert_label_indexer(index, label, dim, method, tolerance)
../core/indexing.py:190: in convert_label_indexer
    label.item(), method=method, tolerance=tolerance
../coding/cftimeindex.py:365: in get_loc
    return pd.Index.get_loc(self, key, method=method, tolerance=tolerance)
../../../pandas/pandas/core/indexes/base.py:2874: in get_loc
    indexer = self.get_indexer([key], method=method, tolerance=tolerance)
../../../pandas/pandas/core/indexes/base.py:2967: in get_indexer
    indexer = self._get_nearest_indexer(target, limit, tolerance)
../../../pandas/pandas/core/indexes/base.py:3062: in _get_nearest_indexer
    indexer = self._filter_indexer_tolerance(target, indexer, tolerance)
../../../pandas/pandas/core/indexes/base.py:3069: in _filter_indexer_tolerance
    indexer = np.where(distance <= tolerance, indexer, -1)
../../../pandas/pandas/core/indexes/extension.py:129: in wrapper
    return op(other)
../../../pandas/pandas/core/ops/common.py:63: in new_method
    return method(self, other)
../../../pandas/pandas/core/arrays/datetimelike.py:75: in wrapper
    other = self._scalar_type(other)
pandas/_libs/tslibs/timedeltas.pyx:1233: in pandas._libs.tslibs.timedeltas.Timedelta.__new__
    ???
pandas/_libs/tslibs/timedeltas.pyx:209: in pandas._libs.tslibs.timedeltas.convert_to_timedelta64
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   OverflowError: Python int too large to convert to C long

pandas/_libs/tslibs/timedeltas.pyx:154: OverflowError
```

In my testing, I can only get things to work if the argument to `abs` (or `np.abs`) is a NumPy array (instead of an `Index`). An upstream change like this would work (it still maintains the behavior desired in pandas-dev/pandas#31511), but I'm not sure how palatable it would be.

599231986 https://github.com/pydata/xarray/issues/3751#issuecomment-599231986 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDU5OTIzMTk4Ng== spencerkclark 6628425 2020-03-15T16:24:07Z 2020-03-15T16:24:07Z MEMBER

Thanks so much @jbrockmendel for looking into the `__getitem__` issue upstream. That should be the last of the issues remaining from this thread. As you probably noticed, we ended up merging #3764, which fixed indexing with the "nearest" method.

Once pandas-dev/pandas#32684 is merged, we should be able to un-xfail the Series `__getitem__` tests on our end.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
597888734 https://github.com/pydata/xarray/issues/3751#issuecomment-597888734 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDU5Nzg4ODczNA== spencerkclark 6628425 2020-03-11T21:32:53Z 2020-03-11T21:32:53Z MEMBER

Thanks @jbrockmendel -- I'll try to do that this weekend.

592603121 https://github.com/pydata/xarray/issues/3751#issuecomment-592603121 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDU5MjYwMzEyMQ== spencerkclark 6628425 2020-02-28T16:56:25Z 2020-02-28T16:56:25Z MEMBER

Thanks @jbrockmendel -- I think there are two separate issues:

  • Indexing with `__getitem__` in a Series would be solved by the `is_scalar` fix.
  • Indexing with the "nearest" method would require reverting pandas-dev/pandas#31511 for non-Datetime-or-Period-Indexes.

592487396 https://github.com/pydata/xarray/issues/3751#issuecomment-592487396 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDU5MjQ4NzM5Ng== spencerkclark 6628425 2020-02-28T12:13:27Z 2020-02-28T12:13:27Z MEMBER

@jbrockmendel @TomAugspurger it turns out that fixing indexing with the "nearest" method without overriding private methods of `pandas.Index` is harder than I expected within xarray alone (see https://github.com/pydata/xarray/pull/3764#issuecomment-586597512 for more details). Do you think an upstream fix would be acceptable here? My understanding is that the issue that prompted the change in pandas only pertained to `DatetimeIndex`es (or perhaps also `PeriodIndex`es in the future); would it be possible to limit the scope of the changes in pandas-dev/pandas#31511 to just those?

587438020 https://github.com/pydata/xarray/issues/3751#issuecomment-587438020 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDU4NzQzODAyMA== spencerkclark 6628425 2020-02-18T12:32:40Z 2020-02-18T12:32:40Z MEMBER

Another kind of failure came up in the context of indexing a `Series` with a `cftime.datetime` object:

Example failure

```
____________________________ test_indexing_in_series_getitem[365_day] _____________________________

series = 0001-01-01 00:00:00    1
0001-02-01 00:00:00    2
0002-01-01 00:00:00    3
0002-02-01 00:00:00    4
dtype: int64
index = CFTimeIndex([0001-01-01 00:00:00, 0001-02-01 00:00:00, 0002-01-01 00:00:00, 0002-02-01 00:00:00], dtype='object')
scalar_args = [cftime.DatetimeNoLeap(0001-01-01 00:00:00)]
range_args = ['0001', slice('0001-01-01', '0001-12-30', None), slice(None, '0001-12-30', None), slice(cftime.DatetimeNoLeap(0001-01...:00), cftime.DatetimeNoLeap(0001-12-30 00:00:00), None), slice(None, cftime.DatetimeNoLeap(0001-12-30 00:00:00), None)]

    @requires_cftime
    def test_indexing_in_series_getitem(series, index, scalar_args, range_args):
        for arg in scalar_args:
>           assert series[arg] == 1

test_cftimeindex.py:597:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
../../../pandas/pandas/core/series.py:884: in __getitem__
    return self._get_with(key)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = 0001-01-01 00:00:00    1
0001-02-01 00:00:00    2
0002-01-01 00:00:00    3
0002-02-01 00:00:00    4
dtype: int64
key = cftime.DatetimeNoLeap(0001-01-01 00:00:00)

    def _get_with(self, key):
        # other: fancy integer or otherwise
        if isinstance(key, slice):
            # _convert_slice_indexer to determing if this slice is positional
            #  or label based, and if the latter, convert to positional
            slobj = self.index._convert_slice_indexer(key, kind="getitem")
            return self._slice(slobj)
        elif isinstance(key, ABCDataFrame):
            raise TypeError(
                "Indexing a Series with DataFrame is not "
                "supported, use the appropriate DataFrame column"
            )
        elif isinstance(key, tuple):
            try:
                return self._get_values_tuple(key)
            except ValueError:
                # if we don't have a MultiIndex, we may still be able to handle
                #  a 1-tuple.  see test_1tuple_without_multiindex
                if len(key) == 1:
                    key = key[0]
                    if isinstance(key, slice):
                        return self._get_values(key)
                raise

        if not isinstance(key, (list, np.ndarray, ExtensionArray, Series, Index)):
>           key = list(key)
E           TypeError: 'cftime._cftime.DatetimeNoLeap' object is not iterable

../../../pandas/pandas/core/series.py:911: TypeError
```

Admittedly, most people probably use a `CFTimeIndex` within xarray data structures, but it would be nice to maintain some ability to use it in pandas data structures too. This issue stems from the changes made in https://github.com/pandas-dev/pandas/pull/31399. I think the problem is that `pandas.core.dtypes.common.is_scalar` returns `False` for a `cftime.datetime` object:

```
In [1]: import cftime

In [2]: import pandas

In [3]: pandas.core.dtypes.common.is_scalar(cftime.DatetimeNoLeap(2000, 1, 1))
Out[3]: False
```

Could there be a simple upstream fix for this?
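The `is_scalar` behavior is easy to illustrate without cftime installed: pandas recognizes only a fixed set of scalar types, so any third-party datetime-like object falls through. The class below is a hypothetical stand-in for `cftime.DatetimeNoLeap`, not the real thing:

```python
import datetime

import pandas as pd


class FakeNoLeapDatetime:
    """Hypothetical stand-in for cftime.DatetimeNoLeap: datetime-like and
    hashable, but not one of the types pandas' is_scalar knows about."""

    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day


# A standard-library datetime is recognized as a scalar...
print(pd.api.types.is_scalar(datetime.datetime(2000, 1, 1)))  # True

# ...but a third-party datetime-like object is not, so Series.__getitem__
# falls through to the list(key) branch and raises TypeError.
print(pd.api.types.is_scalar(FakeNoLeapDatetime(2000, 1, 1)))  # False
```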

583366068 https://github.com/pydata/xarray/issues/3751#issuecomment-583366068 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDU4MzM2NjA2OA== spencerkclark 6628425 2020-02-07T12:16:13Z 2020-02-07T12:16:13Z MEMBER

> I think the issue here is that `other` is a `pandas.Index` instead of a `CFTimeIndex`.

Yes, I noted that in my original post (sorry if that wasn't clear).

583355144 https://github.com/pydata/xarray/issues/3751#issuecomment-583355144 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDU4MzM1NTE0NA== spencerkclark 6628425 2020-02-07T11:40:44Z 2020-02-07T11:40:44Z MEMBER

> FWIW, I think @jbrockmendel is still progressing on an "extension index" interface where you could have a custom dtype / Index subclass that would be properly supported. Long-term, that's the best solution.

Nice -- I look forward to being able to try that out!

> any idea what `other` is here? looks like it might be a DatetimeIndex

@jbrockmendel agreed that it's unclear -- we probably should have written the code for that method in a clearer way. I think it's mainly used for subtracting a single `datetime.timedelta` or a NumPy array of `datetime.timedelta` objects from a `CFTimeIndex`. In both of those cases we would expect the result to remain a `CFTimeIndex`.

`cftime.datetime` objects often represent dates from non-standard calendars (e.g. calendars with no leap year, or calendars where every month has 30 days), so in general they are not compatible with the dates used in a `DatetimeIndex`. Subtracting like-calendar `cftime.datetime` objects is fair game, though, and we'd like that to produce timedeltas.
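The two supported operations can be sketched with standard-library datetimes standing in for cftime objects (this illustrates the semantics described above, not xarray's actual implementation):

```python
import datetime

import numpy as np

# Object-dtype array of datetimes, standing in for a CFTimeIndex's values.
times = np.array(
    [datetime.datetime(2000, 1, 1), datetime.datetime(2000, 2, 1)], dtype=object
)

# Subtracting a timedelta shifts every date; the result should stay date-like
# (for a real CFTimeIndex, it should remain a CFTimeIndex).
shifted = times - datetime.timedelta(days=1)
print(shifted[0])  # 1999-12-31 00:00:00

# Subtracting like-calendar datetimes yields timedeltas.
diffs = times - times[0]
print(diffs[1])  # 31 days, 0:00:00
```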

582904457 https://github.com/pydata/xarray/issues/3751#issuecomment-582904457 https://api.github.com/repos/pydata/xarray/issues/3751 MDEyOklzc3VlQ29tbWVudDU4MjkwNDQ1Nw== spencerkclark 6628425 2020-02-06T13:25:47Z 2020-02-06T13:34:51Z MEMBER

Part of the hazard of using a pd.Index subclass I suppose...

It looks like https://github.com/pandas-dev/pandas/pull/31511 was the cause of the issue:

```
$ git bisect good
64336ff8414f8977ff94adb9a5bc000a3a4ef454 is the first bad commit
commit 64336ff8414f8977ff94adb9a5bc000a3a4ef454
Author: Kevin Anderson <57452607+kanderso-nrel@users.noreply.github.com>
Date:   Sun Feb 2 20:48:28 2020 -0700

    BUG: fix reindexing with a tz-aware index and method='nearest' (#31511)

 doc/source/whatsnew/v1.1.0.rst               |  2 +-
 pandas/core/indexes/base.py                  |  5 ++---
 pandas/tests/frame/indexing/test_indexing.py | 10 ++++++++++
 3 files changed, 13 insertions(+), 4 deletions(-)
```

A way to fix this upstream would be to make sure that `target` has the same type as the index (here it is a generic `pd.Index` instead of a `CFTimeIndex`), but I'm not sure how hard that would be (or if it makes sense in all cases): https://github.com/pandas-dev/pandas/blob/a2a35a86c4064d297c8b48ecfea80e9f05e27712/pandas/core/indexes/base.py#L3080

I think it's possible we could work around this in xarray. It comes down to properly recognizing what to do when you subtract a generic `pd.Index` of `cftime.datetime` objects from a `CFTimeIndex`. Previously this code in pandas operated strictly on NumPy arrays, so there was no casting issue in the subtraction.
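A sketch of the kind of workaround described: when the right-hand operand is a plain `pd.Index` (as pandas now passes internally), drop down to its underlying NumPy array so the subtraction stays element-wise over the datetime objects. The helper name is hypothetical, and stdlib datetimes stand in for cftime objects; it is not xarray's actual `__sub__`:

```python
import datetime

import numpy as np
import pandas as pd


def subtract_index(values, other):
    """Hypothetical helper: subtract `other` from an object-dtype array of
    datetime-like values, tolerating a generic pd.Index on the right."""
    if isinstance(other, pd.Index):
        # Use the raw object array rather than letting pandas try (and fail)
        # to cast the operands to its own datetime types.
        other = np.asarray(other)
    return np.asarray(values, dtype=object) - other


times = np.array(
    [datetime.datetime(2000, 1, 1), datetime.datetime(2000, 2, 1)], dtype=object
)
# dtype=object keeps pandas from converting the datetimes to a DatetimeIndex,
# mimicking an Index of cftime objects (which pandas cannot convert).
target = pd.Index([datetime.datetime(2000, 1, 1)] * 2, dtype=object)

print(subtract_index(times, target))  # object array of timedeltas
```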


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);