issue_comments


3 comments on issue 92762200, "min/max errors if data variables have string or unicode type", sorted by updated_at descending

shoyer (MEMBER) commented at 2015-07-04T01:09:10Z
https://github.com/pydata/xarray/issues/453#issuecomment-118447451

The reason for not using numeric_only for max/min is that they should be well defined even for strings and dates -- unlike aggregations like mean, sum and variance (actually, in principle most should work OK for dates, but the numpy code has some bugs we would need to work around).
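
For illustration (not from the original thread), a minimal numpy demo of why ordering-based reductions stay meaningful for strings and dates while mean does not:

import numpy as np

# Lexicographic order makes min/max well defined for strings
words = np.array(['pear', 'apple', 'fig'])
print(words.min())   # 'apple'

# Chronological order makes them well defined for dates too
dates = np.array(['2015-07-03', '2015-07-04'], dtype='datetime64[D]')
print(dates.max())   # 2015-07-04

# words.mean() would raise TypeError -- there is no meaningful mean of strings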


The bytes handling in to_datetime is arguably a pandas bug. Alternatively we could decode character arrays from netcdf as unicode instead of bytes, but I'm not sure that's unambiguously the right thing to do. This is a place where the legacy Python 2 distinction of strings/unicode is a closer match for netcdf (and scientific file formats more generally) than the Python 3 behavior.
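
For illustration (not from the original thread), the bytes/str mismatch in miniature: character data read from netCDF under Python 3 arrives as bytes, and decoding it first lets pandas parse the timestamps (the sample values here are invented):

import pandas as pd

raw = [b'2015-07-03_01:00:00', b'2015-07-03_02:00:00']  # bytes, as read from netCDF
decoded = [s.decode('utf-8') for s in raw]               # plain str
times = pd.to_datetime(decoded, format='%Y-%m-%d_%H:%M:%S')
print(times[0])  # Timestamp('2015-07-03 01:00:00')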

On Fri, Jul 3, 2015 at 4:40 PM, Will Holmgren notifications@github.com wrote:

Thanks for the tips. This may be Python 3 specific, but I needed to convert to strings first:

times_strings = list(map(lambda x: x.decode('utf-8'), ds['Times'].values))
ds['Times'] = ('Time', pd.to_datetime(times_strings, format='%Y-%m-%d_%H:%M:%S'))

Is there a reason why you don't use numeric_only=True for the min and max functions? I was just recommending more consistency across the min/max/mean/std/etc. functions. Might also be good to be explicit about that in the docstrings.

Reply to this email directly or view it on GitHub:
https://github.com/xray/xray/issues/453#issuecomment-118437972

wholmgren (NONE) commented at 2015-07-03T23:40:50Z
https://github.com/pydata/xarray/issues/453#issuecomment-118437972

Thanks for the tips. This may be Python 3 specific, but I needed to convert to strings first:

times_strings = list(map(lambda x: x.decode('utf-8'), ds['Times'].values))
ds['Times'] = ('Time', pd.to_datetime(times_strings, format='%Y-%m-%d_%H:%M:%S'))

Is there a reason why you don't use numeric_only=True for the min and max functions? I was just recommending more consistency across the min/max/mean/std/etc. functions. Might also be good to be explicit about that in the docstrings.

shoyer (MEMBER) commented at 2015-07-03T02:13:01Z
https://github.com/pydata/xarray/issues/453#issuecomment-118211474

I agree, it's not friendly to give an error message here.

Something you could do about this -- you probably want to convert your times into the numpy datetime64 type. That makes your operations much more efficient, and would make .min() work:

ds['Times'] = ('Time', pd.to_datetime(ds['Times'], format='%Y-%m-%d_%H:%M:%S'))

You also probably want to make this Times variable the dimension variable -- that will let you select times with datetime objects or strings instead of integers: ds.swap_dims({'Time': 'Times'}).

Or in one line:

ds = ds.assign(Time=pd.to_datetime(ds['Times'], format='%Y-%m-%d_%H:%M:%S')).drop('Times')
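
For illustration (not from the original thread), here is the recipe above as a self-contained sketch against a toy dataset; the temp variable and its values are invented, and it uses the current xarray API (drop_vars rather than the older drop):

import numpy as np
import pandas as pd
import xarray as xr

# Toy stand-in for the file in question: byte-string timestamps along dim 'Time'
ds = xr.Dataset({
    'Times': ('Time', np.array([b'2015-07-03_01:00:00', b'2015-07-03_02:00:00'])),
    'temp': ('Time', [20.5, 21.1]),
})

# Decode, parse to datetime64, and promote to the dimension coordinate
times = pd.to_datetime([s.decode('utf-8') for s in ds['Times'].values],
                       format='%Y-%m-%d_%H:%M:%S')
ds = ds.assign(Time=('Time', times)).drop_vars('Times')

print(ds['temp'].sel(Time='2015-07-03 01:00:00').values)  # label-based selection
print(ds['Time'].min().values)                            # .min() now works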

Something xray could do about this -- we could convert string/unicode arrays into the numpy object dtype prior to attempting operations like min/argmin. That way, the min operation would still work.
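
For illustration (not from the original thread), the proposed cast in plain numpy -- on an object array, min falls back to Python's own comparisons, so it remains well defined for strings:

import numpy as np

s = np.array(['2015-07-03_01:00:00', '2015-07-03_02:00:00'])  # fixed-width unicode dtype
obj = s.astype(object)  # elements become plain Python str
print(obj.min())        # '2015-07-03_01:00:00'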


Table schema:
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);