
issue_comments: 511611430

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/2064#issuecomment-511611430 https://api.github.com/repos/pydata/xarray/issues/2064 511611430 MDEyOklzc3VlQ29tbWVudDUxMTYxMTQzMA== 1217238 2019-07-15T23:54:47Z 2019-07-15T23:54:47Z MEMBER

The logic for determining which variables to concatenate is in the _calc_concat_over helper function: https://github.com/pydata/xarray/blob/539fb4a98d0961c281daa5474a8e492a0ae1d8a2/xarray/core/concat.py#L146

Only "different" is supposed to load variables into memory to determine which ones to concatenate.

Right now we also have "all" and "minimal" options:

- "all" attempts to concatenate every variable that can be broadcast to a matching shape: https://github.com/pydata/xarray/blob/539fb4a98d0961c281daa5474a8e492a0ae1d8a2/xarray/core/concat.py#L188-L190
- "minimal" only concatenates variables that already have the matching dimension.
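To make the three options concrete, here is a rough pure-Python sketch of the selection step. It is not the code in _calc_concat_over: the `ToyDataset` type and the `choose_concat_vars` function are hypothetical names made up for illustration, and "all" is simplified to "every variable" rather than checking whether each one can be broadcast.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical stand-in for a dataset: variable name -> (dimension names, values).
ToyDataset = Dict[str, Tuple[Tuple[str, ...], List[float]]]


def choose_concat_vars(datasets: List[ToyDataset], dim: str, mode: str) -> Set[str]:
    """Return the names of variables that would be concatenated along `dim`."""
    first = datasets[0]
    # Variables that already carry the dimension are selected in every mode.
    selected = {name for name, (dims, _) in first.items() if dim in dims}
    if mode == "minimal":
        return selected
    if mode == "all":
        # Simplification: take every variable, without a broadcast check.
        return set(first)
    if mode == "different":
        # The only mode that has to look at the values themselves.
        for name, (_, values) in first.items():
            if any(ds[name][1] != values for ds in datasets[1:]):
                selected.add(name)
        return selected
    raise ValueError(f"unknown mode: {mode!r}")


ds1: ToyDataset = {
    "temperature": (("time",), [1.0, 2.0]),
    "elevation": ((), [100.0]),
    "offset": ((), [0.0]),
}
ds2: ToyDataset = {
    "temperature": (("time",), [3.0, 4.0]),
    "elevation": ((), [100.0]),
    "offset": ((), [5.0]),
}

for mode in ("minimal", "different", "all"):
    print(mode, sorted(choose_concat_vars([ds1, ds2], "time", mode)))
```

In this toy version only the "different" branch ever reads the values, which mirrors why it is the one option expected to load data into memory.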

Recall that concat handles two types of concatenation: along existing dimensions (corresponding to np.concatenate) and along new dimensions (corresponding to np.stack). Currently, this is all done together in one messy codebase, but logically it would be cleaner to separate these modes into two separate functions:

- In "existing dimensions" mode:
  - "all" is currently broken, because it will also concatenate variables that don't have the dimension.
  - "minimal" does the right thing, concatenating only variables with the dimension.
- In "new dimensions" mode:
  - "all" will add the dimension to all variables.
  - "minimal" raises an error if any data variables in your datasets have different values at all. This is pretty much useless.
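For reference, the NumPy analogy for the two modes looks roughly like this. It is a minimal sketch on plain arrays, not xarray code, and the array names are made up for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Existing dimension: arrays are joined end to end, the number of axes is unchanged.
joined = np.concatenate([a, b], axis=0)
print(joined.shape)

# New dimension: each array becomes one slice along a new leading axis.
stacked = np.stack([a, b], axis=0)
print(stacked.shape)

# A variable without the dimension (a scalar here) can only be stacked,
# which is what adding the new dimension to every variable amounts to.
elevation = np.stack([np.asarray(100.0), np.asarray(100.0)], axis=0)
print(elevation.shape)
```

Concatenating the two scalars with np.concatenate would fail, since zero-dimensional arrays have no axis to join along. That is the NumPy-level version of why concatenating variables that lack the dimension is a problem in "existing dimensions" mode.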

Here's my notebook testing this out: https://gist.github.com/shoyer/f44300eddda4f7c476c61f76d1df938b

So I'm thinking that we probably want to combine "all" and "minimal" into a single mode to use as the default, and remove the other behavior, which is either useless or broken. Maybe it would make sense to come up with a new name for this mode, and to make both "all" and "minimal" deprecated aliases for it? In the long term, this would leave only two "automatic" modes for xarray.concat (the combined mode and "different"), which should make things simpler for users trying to figure this out.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  314764258