issue_comments: 278390511

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/1255#issuecomment-278390511 https://api.github.com/repos/pydata/xarray/issues/1255 278390511 MDEyOklzc3VlQ29tbWVudDI3ODM5MDUxMQ== 1217238 2017-02-08T17:02:38Z 2017-02-08T17:02:38Z MEMBER

The usual place to start is with profiling. Here's what %prun xr.concat(das) gets me: https://gist.github.com/shoyer/97dff50d5892e3437d4864d93e85ead2
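
Outside IPython, roughly the same information as %prun can be collected with the standard-library cProfile module. This is only a minimal sketch: slow_workload is a hypothetical stand-in for the real call (xr.concat(das) in this thread), so the numbers it prints say nothing about xarray itself.

```python
import cProfile
import io
import pstats


def slow_workload(n: int = 200) -> int:
    """Hypothetical stand-in for the call being profiled."""
    total = 0
    for i in range(n):
        total += sum(range(i))
    return total


# Record every function call made while the workload runs.
profiler = cProfile.Profile()
profiler.enable()
result = slow_workload()
profiler.disable()

# Sort by cumulative time so the most expensive call chains come first,
# and keep only the top 10 entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats(10)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time is what makes a culprit like Variable.equals stand out: it shows how much of the total runtime is spent inside a function and everything it calls.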

So indeed, each index is getting converted into a NumPy array. Profiling suggests Variable.equals is a likely culprit (0.863/1.159 seconds) and indeed we call .data there: https://github.com/pydata/xarray/blob/d49014d0357d08289cf99aeb54bcefe48ed6e7a0/xarray/core/variable.py#L1000

The good news is that pandas already has its own vectorized .equals method for indexes. So we should either add a special case to the generic Variable.equals method for when ._data is a PandasIndexWrapper, or, perhaps better yet, add a subclass method IndexVariable.equals that always calls .equals on the indexes instead of calling ops.array_equiv on .data.
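
To illustrate the difference between the two comparison paths, here is a rough sketch using plain pandas and NumPy rather than xarray internals. The helper names index_equals_via_array and index_equals_fast are hypothetical, and a PeriodIndex is used only as an example of an index type whose conversion to a NumPy array is expensive (it becomes an object array). It is not the actual IndexVariable.equals implementation.

```python
import numpy as np
import pandas as pd


def index_equals_via_array(a: pd.Index, b: pd.Index) -> bool:
    """Slow path: convert both indexes to NumPy arrays, then compare."""
    a_values = np.asarray(a)
    b_values = np.asarray(b)
    if a_values.shape != b_values.shape:
        return False
    return bool((a_values == b_values).all())


def index_equals_fast(a: pd.Index, b: pd.Index) -> bool:
    """Fast path: let pandas compare the indexes without conversion."""
    return bool(a.equals(b))


# Example indexes (assumed for illustration only).
left = pd.period_range("2000-01-01", periods=1000, freq="D")
right = pd.period_range("2000-01-01", periods=1000, freq="D")
shifted = pd.period_range("2000-01-02", periods=1000, freq="D")

# Both paths agree on the answer; only the cost differs.
same_slow = index_equals_via_array(left, right)
same_fast = index_equals_fast(left, right)
diff_slow = index_equals_via_array(left, shifted)
diff_fast = index_equals_fast(left, shifted)
print(same_slow, same_fast, diff_slow, diff_fast)
```

Timing the two helpers (for example with %timeit) on a large index is one way to check whether the array conversion is really the dominant cost for a given index type.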

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  206241490