issue_comments: 511903346

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/2064#issuecomment-511903346 https://api.github.com/repos/pydata/xarray/issues/2064 511903346 MDEyOklzc3VlQ29tbWVudDUxMTkwMzM0Ng== 10638475 2019-07-16T17:06:46Z 2019-07-16T17:47:45Z NONE

@shoyer Your explanation makes sense, but there are unit tests that expect the default concat() behavior to match the default behavior of pandas concat(), which performs an "outer" join between DataFrames.

Therefore, from my limited understanding, the default behavior for xarray concat() should be to preserve all variables. If this default changes, it may break code that relies on it.
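
For reference, here is a minimal sketch of the pandas behavior I mean. The frames and column names are made up for illustration and are not taken from the xarray test suite:

```python
import pandas as pd

# Two hypothetical frames that share column "a" but each have one extra column.
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"a": [5, 6], "c": [7, 8]})

# Default is join="outer": the union of columns is kept, gaps become NaN.
outer = pd.concat([df1, df2])

# join="inner" keeps only the columns present in every frame.
inner = pd.concat([df1, df2], join="inner")

print(list(outer.columns))
print(list(inner.columns))
```

With the default, no column is dropped, which is the "preserve all variables" expectation described above.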

Can we get a perspective from the author of concat.py, @TomNicholas ? Thanks.

Specifically, what should the default behavior of concat() be when both datasets include a variable that does not have the concatenation dimension? Currently, the concat dimension is added to that variable, and the result is a "stacked" version of it. Others have argued that such a variable should not be included in the concat() result by default, but this appears to break compatibility with pandas concat(). Another possibility would be to include only the first instance of the variable in the result and discard the other instances, so that no "stacking" dimension is needed. However, this would lose information whenever the instances are not identical.
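
To make the options concrete, here is a rough sketch using explicit keyword arguments rather than relying on whatever the defaults are. The datasets and the variable names `temp` and `scale` are hypothetical, and the `compat="override"` case assumes an xarray version that supports that option:

```python
import xarray as xr

# Hypothetical datasets: "temp" has the concat dimension, "scale" does not.
ds1 = xr.Dataset(
    {"temp": ("time", [1.0, 2.0]), "scale": ((), 10.0)},
    coords={"time": [0, 1]},
)
ds2 = xr.Dataset(
    {"temp": ("time", [3.0, 4.0]), "scale": ((), 10.0)},
    coords={"time": [2, 3]},
)

# Option 1: concatenate every data variable, so "scale" gains the "time" dimension.
stacked = xr.concat([ds1, ds2], dim="time", data_vars="all")

# Option 2: only concatenate variables that already have "time".
# "scale" stays scalar, but it must be equal in both datasets.
minimal = xr.concat(
    [ds1, ds2], dim="time", data_vars="minimal", coords="minimal", compat="equals"
)

# Option 3: keep the first instance without comparing, even if the values differ.
ds3 = ds2.assign(scale=20.0)
first = xr.concat(
    [ds1, ds3], dim="time", data_vars="minimal", coords="minimal", compat="override"
)

print(stacked["scale"].dims, minimal["scale"].dims, float(first["scale"]))
```

Option 3 silently drops the value from the second dataset, which is the information loss mentioned above.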

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  314764258