issue_comments: 65114186

html_url: https://github.com/pydata/xarray/issues/268#issuecomment-65114186
issue_url: https://api.github.com/repos/pydata/xarray/issues/268
id: 65114186
node_id: MDEyOklzc3VlQ29tbWVudDY1MTE0MTg2
user: 1217238
created_at: 2014-12-01T18:49:37Z
updated_at: 2014-12-01T18:49:37Z
author_association: MEMBER
body:

I finally got around to investigating this issue, and it turns out to be more subtle than I thought.

The collapsing of scalars occurs for two reasons:

1. By default, concat collapses constant variables (unless they are explicitly called out in concat_over).
2. Groupby's concatenate is intentionally agnostic about the input data, only looking at the transformed data.

The combination of these features means that the concat step of groupby has no way (currently) to tell the difference between a new variable that is only constant across the concatenated dimension by chance (e.g., because of the nature of the input data in this case) and a variable that is intentionally constant (e.g., because x was set to the scalar zero in the original dataset).
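To make the ambiguity concrete, here is a minimal pure-Python sketch of the collapsing behavior. It is a toy model, not the library's implementation: `toy_concat` is a hypothetical helper, datasets are modeled as plain dicts, and only the `concat_over` name is taken from the discussion above.

```python
def toy_concat(datasets, concat_over=()):
    """Concatenate a list of {name: value} dicts along a new leading axis.

    A variable is stacked into a list if it is named in `concat_over` or if
    its value differs between inputs; otherwise it is collapsed to one value.
    """
    result = {}
    for name in datasets[0]:
        values = [ds[name] for ds in datasets]
        is_constant = all(v == values[0] for v in values)
        if name in concat_over or not is_constant:
            result[name] = values
        else:
            result[name] = values[0]
    return result


# Per-group results, as the apply step of a groupby might produce them.
# "x" happens to be 0 in every group; nothing marks that as intentional.
groups = [{"x": 0, "y": 1.5}, {"x": 0, "y": 2.5}, {"x": 0, "y": 3.5}]

collapsed = toy_concat(groups)
explicit = toy_concat(groups, concat_over=("x",))
print(collapsed)  # {'x': 0, 'y': [1.5, 2.5, 3.5]}
print(explicit)   # {'x': [0, 0, 0], 'y': [1.5, 2.5, 3.5]}
```

Looking only at `groups`, the concat step cannot tell whether `x` should stay a scalar or gain the new dimension; the caller has to say so explicitly.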

The scalar collapsing feature of concat is convenient in some cases (maybe not for groupby), but it really should be controllable and predictable. A few options:

1. Alleviate the consequences, e.g.,
    1. by fixing the issue with concatenating scalar with non-scalar variables (#243)
    2. by adding a function for manually broadcasting to a given set of dimensions (I already have most of this in the xray.broadcast_arrays function, see #261)
2. Come up with some set of rules or heuristics for which variables are always concatenated by a groupby, e.g.,
    1. all variables that weren't constant across groups in the original objects, and/or
    2. all non-coordinate variables
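As a rough illustration of the manual-broadcasting option, the sketch below expands a collapsed scalar back to a requested shape using nested lists. `broadcast_scalar` is a hypothetical helper written for this example; it is not the xray.broadcast_arrays function and makes no claim about that function's signature.

```python
def broadcast_scalar(value, shape):
    """Return nested lists of the given shape, every element set to `value`."""
    if not shape:
        return value
    return [broadcast_scalar(value, shape[1:]) for _ in range(shape[0])]


# A scalar that was collapsed by concat, restored along a group dimension
# of length 3, and along two dimensions of lengths 2 and 3.
along_groups = broadcast_scalar(0, (3,))
two_dims = broadcast_scalar(0, (2, 3))
print(along_groups)  # [0, 0, 0]
print(two_dims)      # [[0, 0, 0], [0, 0, 0]]
```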

It would probably be good to do both (1) and (2); the former will be useful regardless. I think using "all non-coordinate variables" (2 ii) might be a reasonable choice (it would be consistent with the current behavior for data arrays).
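A minimal sketch of what the "all non-coordinate variables" rule could look like, again as a toy model with plain dicts. `concat_groups` and `coord_names` are hypothetical names chosen for this example, and the handling of coordinates (collapse only when constant) is an assumption rather than something settled above.

```python
def concat_groups(groups, coord_names=()):
    """Concatenate per-group {name: value} dicts under the proposed rule.

    Non-coordinate variables are always stacked along the new dimension.
    Coordinates are stacked only if they differ between groups.
    """
    result = {}
    for name in groups[0]:
        values = [g[name] for g in groups]
        is_constant = all(v == values[0] for v in values)
        if name in coord_names and is_constant:
            result[name] = values[0]  # constant coordinate stays collapsed
        else:
            result[name] = values  # data variable, or varying coordinate
    return result


groups = [{"x": 0, "height": 10}, {"x": 0, "height": 10}]
out = concat_groups(groups, coord_names=("height",))
print(out)  # {'x': [0, 0], 'height': 10}
```

With this rule the result no longer depends on whether the data variable `x` happens to be constant across groups, which is the predictability the comment asks for.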

reactions: {
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app:
issue: 46768521