html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2452#issuecomment-426335414,https://api.github.com/repos/pydata/xarray/issues/2452,426335414,MDEyOklzc3VlQ29tbWVudDQyNjMzNTQxNA==,5635139,2018-10-02T16:15:00Z,2018-10-02T16:15:00Z,MEMBER,"Thanks @mschrimpf. Hopefully we can get multi-dimensional groupbys, too. ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,365678022
https://github.com/pydata/xarray/issues/2452#issuecomment-426106046,https://api.github.com/repos/pydata/xarray/issues/2452,426106046,MDEyOklzc3VlQ29tbWVudDQyNjEwNjA0Ng==,5635139,2018-10-02T00:21:17Z,2018-10-02T00:21:17Z,MEMBER,"> Is there a way of vectorizing these calls with that in mind? I.e. apply a method for each group. I can't think of anything immediately, and doubt there's an easy way given it doesn't exist yet (though that logic can be a trap!). There's some hacky pandas reshaping you may be able to do to solve this as a one-off. Otherwise it does likely require a concerted effort with numbagg. I occasionally hit this issue too, so as keen as you are to find a solution. Thanks for giving it a try.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,365678022
https://github.com/pydata/xarray/issues/2452#issuecomment-426096521,https://api.github.com/repos/pydata/xarray/issues/2452,426096521,MDEyOklzc3VlQ29tbWVudDQyNjA5NjUyMQ==,5635139,2018-10-01T23:25:01Z,2018-10-01T23:25:01Z,MEMBER,"Thanks for the issue @mschrimpf `.sel` is slow per operation, mainly because it's a python function call (although not the only reason - it's also doing a set of checks / potential alignments / etc).
When I say 'slow', I mean about 0.5ms:

```
In [6]: %timeit d.sel(a='a', b='a')
1000 loops, best of 3: 522 µs per loop
```

While there's an overhead, the time is fairly consistent regardless of the number of items it's selecting. For example:

```
In [11]: %timeit d.sel(a=d['a'], b=d['b'])
1000 loops, best of 3: 1 ms per loop
```

So, as is often the case in the pandas / python ecosystem, if you can write code in a vectorized way, without using python in the tight loops, it's fast. If you need to run python in each loop, it's much slower. Does that resonate?

---

While I think not the main point here, there might be some optimizations on `sel`. It runs `isinstance` 144 times! And initializes a collection 13 times? Here's the `%prun` of the 0.5ms command:
```
         1077 function calls (1066 primitive calls) in 0.002 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        6    0.000    0.000    0.000    0.000 coordinates.py:169()
       13    0.000    0.000    0.000    0.000 collections.py:50(__init__)
       14    0.000    0.000    0.000    0.000 _abcoll.py:548(update)
       33    0.000    0.000    0.000    0.000 _weakrefset.py:70(__contains__)
        2    0.000    0.000    0.001    0.000 dataset.py:881(_construct_dataarray)
      144    0.000    0.000    0.000    0.000 {isinstance}
        1    0.000    0.000    0.001    0.001 dataset.py:1496(isel)
       18    0.000    0.000    0.000    0.000 {numpy.core.multiarray.array}
        3    0.000    0.000    0.000    0.000 dataset.py:92(calculate_dimensions)
       13    0.000    0.000    0.000    0.000 abc.py:128(__instancecheck__)
       36    0.000    0.000    0.000    0.000 common.py:183(__setattr__)
        2    0.000    0.000    0.000    0.000 coordinates.py:167(variables)
        2    0.000    0.000    0.000    0.000 {method 'get_loc' of 'pandas._libs.index.IndexEngine' objects}
       26    0.000    0.000    0.000    0.000 variable.py:271(shape)
       65    0.000    0.000    0.000    0.000 collections.py:90(__iter__)
        5    0.000    0.000    0.000    0.000 variable.py:136(as_compatible_data)
        3    0.000    0.000    0.000    0.000 dataarray.py:165(__init__)
        2    0.000    0.000    0.000    0.000 indexing.py:1255(__getitem__)
        3    0.000    0.000    0.000    0.000 variable.py:880(isel)
       14    0.000    0.000    0.000    0.000 collections.py:71(__setitem__)
        1    0.000    0.000    0.000    0.000 dataset.py:1414(_validate_indexers)
        6    0.000    0.000    0.000    0.000 coordinates.py:38(__iter__)
        3    0.000    0.000    0.000    0.000 variable.py:433(_broadcast_indexes)
        2    0.000    0.000    0.000    0.000 variable.py:1826(to_index)
        3    0.000    0.000    0.000    0.000 dataset.py:636(_construct_direct)
        2    0.000    0.000    0.000    0.000 indexing.py:122(convert_label_indexer)
       15    0.000    0.000    0.000    0.000 utils.py:306(__init__)
        3    0.000    0.000    0.000    0.000 indexing.py:17(expanded_indexer)
       28    0.000    0.000    0.000    0.000 collections.py:138(iteritems)
        1    0.000    0.000    0.001    0.001 indexing.py:226(remap_label_indexers)
       15    0.000    0.000    0.000    0.000 numeric.py:424(asarray)
        1    0.000    0.000    0.001    0.001 indexing.py:193(get_dim_indexers)
    80/70    0.000    0.000    0.000    0.000 {len}
```
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,365678022