issue_comments

3 rows where issue = 1236174701 and user = 2448579 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1523666774 https://github.com/pydata/xarray/issues/6610#issuecomment-1523666774 https://api.github.com/repos/pydata/xarray/issues/6610 IC_kwDOAMm_X85a0U9W dcherian 2448579 2023-04-26T15:59:06Z 2023-04-26T16:06:17Z MEMBER

We voted to move forward with this API:

    data.groupby(
        {
            "x0": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords["x_vertices"])),  # binning
            "y": xr.UniqueGrouper(labels=["a", "b", "c"]),  # categorical, data.y is dask-backed
            "time": xr.TimeResampleGrouper(freq="MS"),
        },
    )

We won't break backwards compatibility for da.groupby(other_data_array), but for any complicated use cases with Grouper, the user must add the by variable to the xarray object and refer to it by name in the dictionary, as above.

{
    "total_count": 4,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 1,
    "eyes": 1
}
  Update GroupBy constructor for grouping by multiple variables, dask arrays 1236174701
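
The rule in the comment above (each grouping variable lives on the object and is looked up by name from a dictionary of groupers) can be sketched in plain Python. This is a toy model, not xarray code: the `BinGrouper` and `UniqueGrouper` classes below are hypothetical stand-ins that only borrow the names, the `groupby` function is invented for illustration, and the right-closed bin convention is an assumption of the sketch.

```python
from bisect import bisect_left
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class BinGrouper:
    """Hypothetical stand-in: assigns values to right-closed bins (lo, hi]."""
    edges: list

    def label(self, value):
        i = bisect_left(self.edges, value) - 1
        if i < 0 or i >= len(self.edges) - 1:
            return None  # value falls outside every bin
        return (self.edges[i], self.edges[i + 1])


@dataclass
class UniqueGrouper:
    """Hypothetical stand-in: groups by value, restricted to known labels."""
    labels: list

    def label(self, value):
        return value if value in self.labels else None


def groupby(variables, groupers):
    """Group row indices by the combined labels of several named variables."""
    missing = [name for name in groupers if name not in variables]
    if missing:
        # The grouping variable must already be part of the object.
        raise KeyError(f"variables not found on the object: {missing}")
    n = len(next(iter(variables.values())))
    groups = defaultdict(list)
    for i in range(n):
        key = tuple(g.label(variables[name][i]) for name, g in groupers.items())
        if None not in key:  # drop rows that fall outside any bin or label set
            groups[key].append(i)
    return dict(groups)


# Made-up sample data, keyed by variable name.
data = {
    "x0": [0.5, 1.5, 2.5, 0.2],
    "y": ["a", "b", "a", "a"],
}
groups = groupby(data, {
    "x0": BinGrouper(edges=[0, 1, 2, 3]),
    "y": UniqueGrouper(labels=["a", "b", "c"]),
})
print(groups)
```

Each group key is a tuple with one label per grouper, in dictionary order. Asking for a name that is not in `data` raises `KeyError`, which mirrors the requirement that the by variable be added to the object first.
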
1498463195 https://github.com/pydata/xarray/issues/6610#issuecomment-1498463195 https://api.github.com/repos/pydata/xarray/issues/6610 IC_kwDOAMm_X85ZULvb dcherian 2448579 2023-04-06T04:07:05Z 2023-04-26T15:52:21Z MEMBER

Here's a question.

In #7561, I implement Grouper objects that carry no information about the variable we're grouping by. So the future API would be:

    data.groupby(
        {
            "x0": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords["x_vertices"])),  # binning
            "y": xr.UniqueGrouper(labels=["a", "b", "c"]),  # categorical, data.y is dask-backed
            "time": xr.TimeResampleGrouper(freq="MS"),
        },
    )

Does this look OK, or do we want to support passing the DataArray or variable name as a by kwarg:

    xr.BinGrouper(by="x0", bins=pd.IntervalIndex.from_breaks(coords["x_vertices"]))

This syntax would support passing a DataArray in by, so xr.UniqueGrouper(by=data.y) for example. Is that an important use case to support? In #7561, I create new ResolvedGrouper objects that always contain by as a DataArray, so it's really a question of exposing that to the user.

PS: Pandas has a key kwarg for a column name. So following that would mean

    data.groupby(
        [
            xr.BinGrouper("x0", bins=pd.IntervalIndex.from_breaks(coords["x_vertices"])),  # binning
            xr.UniqueGrouper("y", labels=["a", "b", "c"]),  # categorical, data.y is dask-backed
            xr.TimeResampleGrouper("time", freq="MS"),
        ],
    )

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Update GroupBy constructor for grouping by multiple variables, dask arrays 1236174701
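
The comment above contrasts two spellings (a dictionary keyed by variable name, and a list of groupers that each carry a key) and mentions an internal object that always holds the by values. A minimal sketch of how both spellings could be normalized into such resolved objects follows. The `Grouper`, `ResolvedGrouper`, and `resolve` definitions are hypothetical and written for this illustration; they are not taken from #7561.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Grouper:
    """Hypothetical user-facing spec: how to group, optionally what to group."""
    kind: str                  # e.g. "bin", "unique", "resample"
    key: Optional[str] = None  # variable name; only used by the list spelling


@dataclass
class ResolvedGrouper:
    """Hypothetical internal object: a spec bound to the actual `by` values."""
    name: str
    by: Any
    grouper: Grouper


def resolve(obj, groupers):
    """Accept {name: Grouper} or [Grouper(key=name)] and bind each to obj[name]."""
    if isinstance(groupers, dict):
        pairs = list(groupers.items())
    else:
        pairs = [(g.key, g) for g in groupers]
    resolved = []
    for name, g in pairs:
        if name is None:
            raise ValueError("the list spelling requires a key on every grouper")
        resolved.append(ResolvedGrouper(name=name, by=obj[name], grouper=g))
    return resolved


# Made-up object: a plain dict standing in for an xarray object.
obj = {"x0": [0.5, 1.5], "time": ["2000-01-01", "2000-02-01"]}

from_dict = resolve(obj, {"x0": Grouper("bin"), "time": Grouper("resample")})
from_list = resolve(obj, [Grouper("bin", key="x0"), Grouper("resample", key="time")])

print([(r.name, r.by) for r in from_dict])
print([(r.name, r.by) for r in from_list])
```

In this sketch both spellings end up bound to the same variables, so the choice between them is purely about what the user types. The dictionary form keeps the grouper itself free of any reference to the data.
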
1329680642 https://github.com/pydata/xarray/issues/6610#issuecomment-1329680642 https://api.github.com/repos/pydata/xarray/issues/6610 IC_kwDOAMm_X85PQVEC dcherian 2448579 2022-11-28T19:58:29Z 2022-11-28T23:23:42Z MEMBER

In https://github.com/xarray-contrib/flox/issues/191 @keewis proposes a much nicer API for multiple variables:

    data.groupby(
        xr.Grouper(by="x", bins=pd.IntervalIndex.from_breaks(coords["x_vertices"])),  # binning
        xr.Grouper(by=data.y, labels=["a", "b", "c"]),  # categorical, data.y is dask-backed
        xr.Grouper(by="time", freq="MS"),  # resample
    )

Note pd.Grouper uses key instead of by so that's a possibility too.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Update GroupBy constructor for grouping by multiple variables, dask arrays 1236174701
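
The `freq="MS"` argument in the proposals above is the pandas alias for month-start frequency. As a rough illustration of what a resample-style grouping on that frequency computes, here is a standard-library sketch; `month_start` and `resample_groups` are hypothetical helper names and the dates are made up.

```python
from collections import defaultdict
from datetime import date


def month_start(d):
    """Floor a date to the first day of its month (the "MS" bucket label)."""
    return d.replace(day=1)


def resample_groups(times):
    """Map each month-start label to the indices of the dates that fall in it."""
    groups = defaultdict(list)
    for i, t in enumerate(times):
        groups[month_start(t)].append(i)
    return dict(sorted(groups.items()))


times = [date(2000, 1, 3), date(2000, 1, 28), date(2000, 2, 1), date(2000, 3, 15)]
groups = resample_groups(times)
for label, idx in groups.items():
    print(label.isoformat(), idx)
```

This sketch only emits buckets that contain at least one date. A real resampler would typically also produce the empty months between the first and last timestamp.
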

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
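
The filter described at the top of the page (issue = 1236174701 and user = 2448579, sorted by updated_at descending) can be reproduced against this schema with Python's built-in `sqlite3` module. The sketch below is an assumption-laden miniature: it keeps only a subset of the columns, drops the foreign-key references because the `users` and `issues` tables are not shown here, and stores only a trimmed reactions object for each of the three rows listed above.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [id] INTEGER PRIMARY KEY,
   [user] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [reactions] TEXT,
   [issue] INTEGER
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
""")

# ids, timestamps, and reaction totals copied from the three rows on this page.
rows = [
    (1329680642, 2448579, "2022-11-28T19:58:29Z", "2022-11-28T23:23:42Z",
     {"total_count": 2, "+1": 2}, 1236174701),
    (1498463195, 2448579, "2023-04-06T04:07:05Z", "2023-04-26T15:52:21Z",
     {"total_count": 0, "+1": 0}, 1236174701),
    (1523666774, 2448579, "2023-04-26T15:59:06Z", "2023-04-26T16:06:17Z",
     {"total_count": 4, "+1": 2}, 1236174701),
]
conn.executemany(
    "INSERT INTO [issue_comments] VALUES (?, ?, ?, ?, ?, ?)",
    [(i, u, c, upd, json.dumps(r), iss) for i, u, c, upd, r, iss in rows],
)

query = """
SELECT [id], [updated_at], [reactions]
FROM [issue_comments]
WHERE [issue] = ? AND [user] = ?
ORDER BY [updated_at] DESC
"""
result = conn.execute(query, (1236174701, 2448579)).fetchall()
for comment_id, updated_at, reactions in result:
    # reactions is stored as TEXT, so it has to be decoded from JSON.
    print(comment_id, updated_at, json.loads(reactions)["total_count"])
```

Because the timestamps are ISO 8601 strings in a single format and time zone, sorting them as text gives chronological order, which is why the TEXT columns work for `ORDER BY`.
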