issue_comments

5 rows where author_association = "NONE" and user = 5629061 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
265462343 https://github.com/pydata/xarray/issues/324#issuecomment-265462343 https://api.github.com/repos/pydata/xarray/issues/324 MDEyOklzc3VlQ29tbWVudDI2NTQ2MjM0Mw== hottwaj 5629061 2016-12-07T14:35:01Z 2016-12-07T14:35:01Z NONE

In case it is of interest to anyone, the snippet below is a temporary and quite dirty solution I've used to do a multi-dimensional groupby...

It runs nested groupby-apply operations over each given dimension in turn until no further grouping needs to be done, then applies the given function "apply_fn".

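    def nested_groupby_apply(dataarray, groupby, apply_fn):
        # Base case: one dimension left, so group over it and apply the function.
        if len(groupby) == 1:
            return dataarray.groupby(groupby[0]).apply(apply_fn)
        # Otherwise group over the first dimension and recurse on the rest.
        else:
            return dataarray.groupby(groupby[0]).apply(
                nested_groupby_apply, groupby=groupby[1:], apply_fn=apply_fn
            )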

Obviously performance can potentially be quite poor. Passing the dimensions to group over in order of increasing length will reduce your cost a little.
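
To make the ordering remark concrete, here is a rough sketch that only counts how many groupby() calls the nested approach would make. It assumes each dimension splits into one group per coordinate value, and it treats the number of groupby() calls as a proxy for cost. `count_groupby_calls` is a hypothetical helper written for illustration and is not part of xarray.

```python
def count_groupby_calls(lengths):
    """Count groupby() invocations made by a nested groupby.

    `lengths` holds the number of groups per dimension, in the order
    the dimensions are passed (an illustrative assumption, not measured data).
    """
    calls = 0
    arrays_at_level = 1  # sub-arrays that reach the current nesting level
    for n_groups in lengths:
        calls += arrays_at_level     # one groupby() per sub-array at this level
        arrays_at_level *= n_groups  # each groupby() splits into n_groups pieces
    return calls


short_first = count_groupby_calls([2, 100])  # short dimension first
long_first = count_groupby_calls([100, 2])   # long dimension first
print(short_first, long_first)
```

In both orders apply_fn still runs once per combination of groups (200 times here), so only the groupby overhead changes, which fits the "a little" in the comment above.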

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Support multi-dimensional grouped operations and group_over 58117200
264133419 https://github.com/pydata/xarray/issues/1143#issuecomment-264133419 https://api.github.com/repos/pydata/xarray/issues/1143 MDEyOklzc3VlQ29tbWVudDI2NDEzMzQxOQ== hottwaj 5629061 2016-12-01T10:15:21Z 2016-12-01T10:15:21Z NONE

The pandas docs do seem to say that conversion to timedelta64[D] (or other frequencies) is possible - see: http://pandas.pydata.org/pandas-docs/stable/timedeltas.html#frequency-conversion

Also here's a more realistic example of why this is problematic for me - I have a sequence of dates and I want to calculate the difference between them in days: possible in pandas, but not possible in xarray without first reverting to pandas/numpy types

```
import datetime
import pandas
import xarray

dates = pandas.Series([datetime.date(2016, 1, 10), datetime.date(2016, 1, 20), datetime.date(2016, 1, 25)]).astype('datetime64[ns]')

dates.diff().astype('timedelta64[D]').astype(float)

# returns
# 0     NaN
# 1    10.0
# 2     5.0
# dtype: float64

xarray.DataArray(dates).diff(dim = 'dim_0').astype('timedelta64[D]').astype(float)

# returns
# <xarray.DataArray (dim_0: 2)>
# array([ 8.64000000e+14, 4.32000000e+14])
# Coordinates:
# * dim_0 (dim_0) int64 1 2
```

Again the xarray result is in ns rather than days.
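
One possible workaround, sketched here with plain numpy rather than xarray (whether it carries over unchanged to a DataArray is an assumption I have not verified here): leave the differences as timedelta64[ns] and divide by a one-day timedelta to get float days. The dates are the same three as in the example above.

```python
import numpy as np

# Same three dates as above, stored at nanosecond resolution.
dates = np.array(["2016-01-10", "2016-01-20", "2016-01-25"], dtype="datetime64[ns]")

# Differences stay in nanoseconds (timedelta64[ns]).
deltas = np.diff(dates)

# Dividing one timedelta by another gives a plain float ratio, here in days.
days = deltas / np.timedelta64(1, "D")
print(days)
```

This sidesteps the astype('timedelta64[D]') step entirely, so the nanosecond coercion no longer matters for the final numbers.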

Thanks

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  timedelta64[D] is always coerced to timedelta64[ns] 192325490
263626446 https://github.com/pydata/xarray/issues/1143#issuecomment-263626446 https://api.github.com/repos/pydata/xarray/issues/1143 MDEyOklzc3VlQ29tbWVudDI2MzYyNjQ0Ng== hottwaj 5629061 2016-11-29T16:47:25Z 2016-11-29T16:47:25Z NONE

The conversion to timedelta64[ns] is done on this line of code: https://github.com/pydata/xarray/blob/d66f673ab25fe0fc0483bd5d67479fc94a14e46d/xarray/core/variable.py#L169

Is there a reason behind the conversion, or could it be removed?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  timedelta64[D] is always coerced to timedelta64[ns] 192325490
211359862 https://github.com/pydata/xarray/issues/830#issuecomment-211359862 https://api.github.com/repos/pydata/xarray/issues/830 MDEyOklzc3VlQ29tbWVudDIxMTM1OTg2Mg== hottwaj 5629061 2016-04-18T12:35:59Z 2016-04-18T12:35:59Z NONE

Wooah, I'm so sorry, I didn't realise that groupby() cannot be applied to multiple dimensions yet!

So none of this works. Please ignore and I'll revisit when #818 is resolved

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  "Reverse" groupby method for split/apply/combine 149130368
211357528 https://github.com/pydata/xarray/issues/830#issuecomment-211357528 https://api.github.com/repos/pydata/xarray/issues/830 MDEyOklzc3VlQ29tbWVudDIxMTM1NzUyOA== hottwaj 5629061 2016-04-18T12:25:43Z 2016-04-18T12:25:43Z NONE

Note that this new function cannot support passing of coordinates.

In fact I feel that the current groupby() implementation should not accept coordinates either - that should be up to the user to do in a separate step using .sel() or equivalent methods.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  "Reverse" groupby method for split/apply/combine 149130368

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
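
As a rough illustration of the filter described at the top of this page (author_association = "NONE", user = 5629061, sorted by updated_at descending), the sketch below runs the equivalent query with Python's built-in sqlite3 module. It is an assumption-laden toy: the table is in memory and trimmed to four of the columns from the schema above, the five ids and timestamps are copied from the rows on this page, and the sixth row (id 1, user 42) is made up purely to show the filter excluding something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed-down version of the issue_comments schema (only the columns queried).
conn.execute(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [author_association] TEXT
    )
    """
)

rows = [
    (211357528, 5629061, "2016-04-18T12:25:43Z", "NONE"),
    (211359862, 5629061, "2016-04-18T12:35:59Z", "NONE"),
    (263626446, 5629061, "2016-11-29T16:47:25Z", "NONE"),
    (264133419, 5629061, "2016-12-01T10:15:21Z", "NONE"),
    (265462343, 5629061, "2016-12-07T14:35:01Z", "NONE"),
    (1, 42, "2017-01-01T00:00:00Z", "MEMBER"),  # hypothetical row, not from the page
]
conn.executemany(
    "INSERT INTO [issue_comments] ([id], [user], [updated_at], [author_association]) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps sort correctly as text, so ORDER BY works on the TEXT column.
query = (
    "SELECT [id] FROM [issue_comments] "
    "WHERE [author_association] = ? AND [user] = ? "
    "ORDER BY [updated_at] DESC"
)
ids = [row[0] for row in conn.execute(query, ("NONE", 5629061))]
print(ids)
```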