
issue_comments


1 row where author_association = "NONE", issue = 245649333 and user = 4992424 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
318079611 https://github.com/pydata/xarray/issues/1490#issuecomment-318079611 https://api.github.com/repos/pydata/xarray/issues/1490 MDEyOklzc3VlQ29tbWVudDMxODA3OTYxMQ== darothen 4992424 2017-07-26T14:57:58Z 2017-07-26T14:57:58Z NONE

Did some digging.

Note here that the dtypes of time1 and time2 are different; the first is a datetime64[ns] but the second is a datetime64[ns, UTC]. For the sake of illustration, I'm going to change the timezone to EST. If we print time2, we get something that looks like this:

``` python
>>> time2
DatetimeIndex(['2000-01-01 00:00:00-05:00', '2000-01-01 01:00:00-05:00',
               '2000-01-01 02:00:00-05:00', '2000-01-01 03:00:00-05:00',
               '2000-01-01 04:00:00-05:00', '2000-01-01 05:00:00-05:00',
               '2000-01-01 06:00:00-05:00', '2000-01-01 07:00:00-05:00',
               '2000-01-01 08:00:00-05:00', '2000-01-01 09:00:00-05:00',
               ...
               '2000-12-30 14:00:00-05:00', '2000-12-30 15:00:00-05:00',
               '2000-12-30 16:00:00-05:00', '2000-12-30 17:00:00-05:00',
               '2000-12-30 18:00:00-05:00', '2000-12-30 19:00:00-05:00',
               '2000-12-30 20:00:00-05:00', '2000-12-30 21:00:00-05:00',
               '2000-12-30 22:00:00-05:00', '2000-12-30 23:00:00-05:00'],
              dtype='datetime64[ns, EST]', length=8760, freq='H')
```

But, if we directly print its values, we get something slightly different:

``` python
>>> time2.values
array(['2000-01-01T05:00:00.000000000', '2000-01-01T06:00:00.000000000',
       '2000-01-01T07:00:00.000000000', ...,
       '2000-12-31T02:00:00.000000000', '2000-12-31T03:00:00.000000000',
       '2000-12-31T04:00:00.000000000'], dtype='datetime64[ns]')
```

The difference is that time2.values holds timezone-naive UTC timestamps: the timezone offset (5 hours here) has been added to each value in time2 and the timezone information dropped. This brings up something to note: if you construct your Dataset using time1.values and time2.values, there is no problem:

``` python
import numpy as np
import pandas as pd
import xarray as xr

time1 = pd.date_range('2000-01-01', freq='H', periods=365 * 24)  # timezone-naive
time2 = pd.date_range('2000-01-01', freq='H', periods=365 * 24, tz='UTC')  # timezone-aware

# Passing .values hands xarray plain datetime64[ns] arrays in both cases
ds1 = xr.Dataset({'foo': ('time', np.arange(365 * 24)), 'time': time1.values})
ds2 = xr.Dataset({'foo': ('time', np.arange(365 * 24)), 'time': time2.values})

ds1.resample('3H', 'time', how='mean')  # works fine
ds2.resample('3H', 'time', how='mean')  # works fine
```
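To see the offset behaviour on a small scale, here is a minimal sketch. It assumes a reasonably recent pandas, uses three hourly steps instead of a full year, and spells the frequency as '60min' only to sidestep the 'H'/'h' alias change across pandas versions.

```python
import numpy as np
import pandas as pd

# Small timezone-aware index; 'EST' is a fixed UTC-05:00 offset.
aware = pd.date_range('2000-01-01', periods=3, freq='60min', tz='EST')

# .values drops the timezone and returns the same instants expressed in UTC.
naive_utc = aware.values

print(aware[0])      # local wall-clock time, with the -05:00 offset attached
print(naive_utc[0])  # same instant as a plain NumPy datetime64, five hours later on the clock
```

The two printed values refer to the same moment; only the representation differs.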

Both time1 and time2 are instances of pd.DatetimeIndex, which is a subclass of pd.Index. When xarray tries to turn them into Variables, it ultimately uses a PandasIndexAdapter to decode the contents of time1 and time2, and this is where the trouble happens. The PandasIndexAdapter tries to safely cast the dtype of the array it is passed, which works just fine for time1. But NumPy's datetime64 has no notion of timezones, so it can't parse a datetime dtype string that carries timezone information. That is, this will work:

``` python
>>> np.dtype('datetime64[ns]')
dtype('<M8[ns]')
```

But this won't:

``` python
>>> np.dtype('datetime64[ns, UTC]')
TypeError: Invalid datetime unit in metadata string "[ns, UTC]"
```

But also, the type of time2.dtype is a pandas.types.dtypes.DatetimeTZDtype, which NumPy doesn't know what to do with (it doesn't know how to map that type to its own datetime64).
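A quick way to check both halves of this is sketched below. It assumes a newer pandas than the one used in this thread, where the dtype class is exposed publicly as pd.DatetimeTZDtype rather than under pandas.types.dtypes; the variable names are just for illustration.

```python
import numpy as np
import pandas as pd

time1 = pd.date_range('2000-01-01', periods=3, freq='60min')            # timezone-naive
time2 = pd.date_range('2000-01-01', periods=3, freq='60min', tz='UTC')  # timezone-aware

# The naive index carries a plain NumPy dtype; the aware one carries a pandas dtype.
naive_is_numpy = isinstance(time1.dtype, np.dtype)
aware_is_numpy = isinstance(time2.dtype, np.dtype)
aware_is_tz_dtype = isinstance(time2.dtype, pd.DatetimeTZDtype)

# Asking NumPy to interpret the timezone-aware dtype string raises TypeError.
try:
    np.dtype(str(time2.dtype))
    numpy_accepts = True
except TypeError:
    numpy_accepts = False

print(naive_is_numpy, aware_is_numpy, aware_is_tz_dtype, numpy_accepts)
```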

So what happens is that the resulting Variable which defines the time coordinate on your ds2 has an array with the correct values, but is explicitly told to have the dtype object. When the array is decoded, then, bad things happen.

One solution would be to catch this potential glitch in either is_valid_numpy_dtype() or the PandasIndexAdapter constructor. Alternatively, we could eagerly coerce arrays with type pandas.types.dtypes.DatetimeTZDtype into numpy-compliant types at some earlier point.
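As a rough sketch of the second option (eager coercion), something like the following could run before the index ever reaches the adapter. The helper name coerce_tz_aware_index is hypothetical and is not part of xarray; it only relies on pandas' DatetimeIndex.tz_convert(None), which converts to UTC and drops the timezone.

```python
import numpy as np
import pandas as pd


def coerce_tz_aware_index(index):
    """Hypothetical helper: return a timezone-naive UTC copy of a tz-aware DatetimeIndex.

    Any other index is returned unchanged.
    """
    if isinstance(index, pd.DatetimeIndex) and index.tz is not None:
        # tz_convert(None) converts to UTC and removes the timezone information.
        return index.tz_convert(None)
    return index


time2 = pd.date_range('2000-01-01', periods=3, freq='60min', tz='EST')
coerced = coerce_tz_aware_index(time2)

print(coerced.dtype)  # a plain NumPy datetime64 dtype that np.dtype understands
```

The trade-off is that the original timezone is lost, so anything downstream sees UTC timestamps only.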

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Resample not working when time coordinate is timezone aware 245649333


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);