
issue_comments: 1498463195


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/6610#issuecomment-1498463195 https://api.github.com/repos/pydata/xarray/issues/6610 1498463195 IC_kwDOAMm_X85ZULvb 2448579 2023-04-06T04:07:05Z 2023-04-26T15:52:21Z MEMBER

Here's a question.

In #7561, I implement Grouper objects that don't have any information about the variable we're grouping by. So the future API would be:

```python
data.groupby(
    {
        "x0": xr.BinGrouper(bins=pd.IntervalIndex.from_breaks(coords["x_vertices"])),  # binning
        "y": xr.UniqueGrouper(labels=["a", "b", "c"]),  # categorical, data.y is dask-backed
        "time": xr.TimeResampleGrouper(freq="MS"),
    },
)
```

Does this look OK, or do we want to support passing the DataArray or variable name as a `by` kwarg:

```python
xr.BinGrouper(by="x0", bins=pd.IntervalIndex.from_breaks(coords["x_vertices"]))
```

This syntax would also allow passing a DataArray as `by`, for example `xr.UniqueGrouper(by=data.y)`. Is that an important use case to support? In #7561, I create new ResolvedGrouper objects that always contain `by` as a DataArray, so it's really a question of whether to expose that to the user.
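To make the split concrete, here is a rough plain-Python sketch of a user-facing grouper that carries only the rule, and a resolved object that pairs it with the actual values. This is not the #7561 code: `UniqueGrouperSketch`, `ResolvedGrouperSketch`, and `resolve` are hypothetical names, and a dict of lists stands in for a Dataset.

```python
from dataclasses import dataclass
from typing import Dict, List, Mapping, Sequence, Union


@dataclass(frozen=True)
class UniqueGrouperSketch:
    """Hypothetical user-facing grouper: holds the rule, not the variable."""

    labels: Sequence[str]


@dataclass(frozen=True)
class ResolvedGrouperSketch:
    """Hypothetical internal object: the rule plus the concrete `by` values."""

    grouper: UniqueGrouperSketch
    by: Sequence[str]

    def groups(self) -> Dict[str, List[int]]:
        # Map each requested label to the positions where `by` equals it.
        return {
            label: [i for i, value in enumerate(self.by) if value == label]
            for label in self.grouper.labels
        }


def resolve(
    data: Mapping[str, Sequence[str]],
    by: Union[str, Sequence[str]],
    grouper: UniqueGrouperSketch,
) -> ResolvedGrouperSketch:
    # Accept either a variable name (looked up in `data`) or the values themselves.
    values = data[by] if isinstance(by, str) else by
    return ResolvedGrouperSketch(grouper=grouper, by=list(values))


data = {"y": ["a", "b", "a", "c"]}  # stand-in for a Dataset with a "y" variable
grouper = UniqueGrouperSketch(labels=["a", "b", "c"])

by_name = resolve(data, "y", grouper)          # name-based, like the dict API
by_values = resolve(data, data["y"], grouper)  # value-based, like by=data.y
```

Both calls resolve to the same object, which is why accepting a name or a DataArray is mostly a question about the public signature rather than the internals.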

PS: Pandas has a `key` kwarg for a column name. So following that would mean:

```python
data.groupby(
    [
        xr.BinGrouper("x0", bins=pd.IntervalIndex.from_breaks(coords["x_vertices"])),  # binning
        xr.UniqueGrouper("y", labels=["a", "b", "c"]),  # categorical, data.y is dask-backed
        xr.TimeResampleGrouper("time", freq="MS"),
    ],
)
```
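For comparison, this is roughly what the pandas `key` convention looks like with `pd.Grouper`. The DataFrame and its column names are made up for illustration only.

```python
import pandas as pd

# Toy data: two observations in January, one in February.
df = pd.DataFrame(
    {
        "time": pd.to_datetime(["2023-01-05", "2023-01-20", "2023-02-10"]),
        "value": [1.0, 2.0, 3.0],
    }
)

# pd.Grouper carries both the column name (`key`) and the grouping rule (`freq`).
monthly = df.groupby(pd.Grouper(key="time", freq="MS"))["value"].sum()
```

Here the grouper object itself names the column, which is the pattern the list-based `groupby` call above would mirror.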

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1236174701