id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
365678022,MDU6SXNzdWUzNjU2NzgwMjI=,2452,DataArray.sel extremely slow,5308236,closed,0,,,5,2018-10-01T23:09:47Z,2018-10-02T16:15:00Z,2018-10-02T15:58:21Z,NONE,,,,"#### Problem description
`.sel` is an xarray method I use a lot, and I would have expected it to be fairly efficient.
However, even on tiny DataArrays it takes seconds.
#### Code Sample, a copy-pastable example if possible
```python
import timeit
setup = """"""
import itertools
import numpy as np
import xarray as xr
import string
a = list(string.printable)
b = list(string.ascii_lowercase)
d = xr.DataArray(np.random.rand(len(a), len(b)), coords={'a': a, 'b': b}, dims=['a', 'b'])
d.load()
""""""
run = """"""
for _a, _b in itertools.product(a, b):
    d.sel(a=_a, b=_b)
""""""
running_times = timeit.repeat(run, setup, repeat=3, number=10)
print(""xarray"", running_times) # e.g. [14.792144000064582, 15.19372400001157, 15.345327000017278]
```
#### Expected Output
I would have expected the above code to run in milliseconds.
However, it takes over 10 seconds!
Adding `d = d.stack(aa=['a'], bb=['b'])` beforehand makes it about twice as slow again.
For reference, a naive dict-indexing implementation in Python takes 0.01 seconds:
```python
setup = """"""
import itertools
import numpy as np
import string
a = list(string.printable)
b = list(string.ascii_lowercase)
d = np.random.rand(len(a), len(b))
indexers = {'a': {coord: index for (index, coord) in enumerate(a)},
            'b': {coord: index for (index, coord) in enumerate(b)}}
""""""
run = """"""
for _a, _b in itertools.product(a, b):
    index_a, index_b = indexers['a'][_a], indexers['b'][_b]
    item = d[index_a][index_b]
""""""
running_times = timeit.repeat(run, setup, repeat=3, number=10)
print(""dicts"", running_times) # e.g. [0.015355999930761755, 0.01466800004709512, 0.014295000000856817]
```
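For comparison, the per-call overhead can also be sidestepped inside xarray itself. The sketch below (one approach among several, not necessarily the fastest) selects whole label lists in a single `.sel` call, or precomputes integer positions once via the underlying pandas indexes (`pandas.Index.get_indexer`) and then indexes the raw numpy array:
```python
import numpy as np
import string
import xarray as xr

a = list(string.printable)
b = list(string.ascii_lowercase)
d = xr.DataArray(np.random.rand(len(a), len(b)),
                 coords={'a': a, 'b': b}, dims=['a', 'b'])

# One vectorized .sel replaces len(a) * len(b) scalar calls.
sub = d.sel(a=a[:10], b=b[:5])

# Alternatively, resolve labels to integer positions once,
# then do cheap positional lookups in the raw numpy array.
ia = d.indexes['a'].get_indexer(a)
ib = d.indexes['b'].get_indexer(b)
val = d.values[ia[3], ib[7]]
```
Both variants amortize the label-to-position lookup instead of paying it once per scalar `.sel` call.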
#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.0.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-17134-Microsoft
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_US.UTF-8
xarray: 0.10.8
pandas: 0.23.4
numpy: 1.15.1
scipy: 1.1.0
netCDF4: 1.4.1
h5netcdf: None
h5py: None
Nio: None
zarr: None
bottleneck: None
cyordereddict: None
dask: None
distributed: None
matplotlib: 2.2.3
cartopy: None
seaborn: None
setuptools: 40.2.0
pip: 10.0.1
conda: None
pytest: 3.7.4
IPython: 6.5.0
sphinx: None
This is a follow-up to #2438.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2452/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
363629186,MDU6SXNzdWUzNjM2MjkxODY=,2438,Efficient workaround to group by multiple dimensions,5308236,closed,0,,,3,2018-09-25T15:11:38Z,2018-10-02T15:56:53Z,2018-10-02T15:56:53Z,NONE,,,,"Grouping by multiple dimensions is not yet supported (#324):
```python
d = DataAssembly([[1, 2, 3], [4, 5, 6]],
                 coords={'a': ('multi_dim', ['a', 'b']), 'c': ('multi_dim', ['c', 'c']), 'b': ['x', 'y', 'z']},
                 dims=['multi_dim', 'b'])
d.groupby(['a', 'b'])  # TypeError: `group` must be an xarray.DataArray or the name of an xarray variable or dimension
```
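Since `b` is already a dimension here, one efficient workaround sketch (assuming a plain `xr.DataArray` in place of `DataAssembly`) is to group over the non-dimension coordinate `a` alone and reduce over `multi_dim`:
```python
import xarray as xr

d = xr.DataArray([[1, 2, 3], [4, 5, 6]],
                 coords={'a': ('multi_dim', ['a', 'b']),
                         'c': ('multi_dim', ['c', 'c']),
                         'b': ['x', 'y', 'z']},
                 dims=['multi_dim', 'b'])

# 'b' is already a dimension, so grouping over the non-dimension
# coordinate 'a' and reducing over 'multi_dim' yields an (a, b)
# result without any explicit Python loop.
result = d.groupby('a').mean('multi_dim')
```
This only covers the case where all but one of the grouping variables are already dimensions.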
An inefficient solution is to run the for loops manually:
```python
a, b = np.unique(d['a'].values), np.unique(d['b'].values)
result = xr.DataArray(np.zeros([len(a), len(b)]), coords={'a': a, 'b': b}, dims=['a', 'b'])
for _a, _b in itertools.product(a, b):
    cells = d.sel(a=_a, b=_b)
    merge = cells.mean()
    result.loc[{'a': _a, 'b': _b}] = merge
# result = DataArray (a: 2, b: 2)> array([[2., 3.], [5., 6.]])
# Coordinates:
# * a (a)
```
#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.0.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-17134-Microsoft
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_US.UTF-8
xarray: 0.10.8
pandas: 0.23.4
numpy: 1.15.1
scipy: 1.1.0
netCDF4: 1.4.1
h5netcdf: None
h5py: None
Nio: None
zarr: None
bottleneck: None
cyordereddict: None
dask: None
distributed: None
matplotlib: 2.2.3
cartopy: None
seaborn: None
setuptools: 40.2.0
pip: 10.0.1
conda: None
pytest: 3.7.4
IPython: 6.5.0
sphinx: None
Related: #324, https://stackoverflow.com/questions/52453426/grouping-by-multiple-dimensions","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2438/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
319085244,MDU6SXNzdWUzMTkwODUyNDQ=,2095,combine complementary DataArrays,5308236,closed,0,,,1,2018-05-01T01:02:26Z,2018-05-02T01:34:53Z,2018-05-02T01:34:52Z,NONE,,,,"I have a list of DataArrays with three dimensions.
For each item in the list, two of the dimensions hold only a single value, but taken together the items cover the full combination of values.
#### Code Sample
```python
import itertools
import numpy as np
import xarray as xr
ds = []
for vals_dim1, vals_dim2 in itertools.product(range(2), range(3)):
    d = xr.DataArray(np.random.rand(1, 1, 4),
                     coords={'dim1': [vals_dim1], 'dim2': [vals_dim2], 'dim3': range(4)},
                     dims=['dim1', 'dim2', 'dim3'])
    ds.append(d)
```
#### Expected Output
I then want to combine these complementary `DataArray`s, but nothing I have tried so far works.
The result should be a `DataArray` with shape `|2x3x4|` and dimensions `dim1: |2|, dim2: |3|, dim3: |4|`.
The following do not work:
```python
# does not automatically infer dimensions and fails with
# ""ValueError: conflicting sizes for dimension 'concat_dim': length 2 on 'concat_dim' and length 6 on ""
ds = xr.concat(ds, dim=['dim1', 'dim2'])
# will still try to insert a new `concat_dim` and fails with
# ""ValueError: conflicting MultiIndex level name(s): 'dim1' (concat_dim), (dim1) 'dim2' (concat_dim), (dim2)""
import pandas as pd
dims = [[0] * 3 + [1] * 3, list(range(3)) * 2]
dims = pd.MultiIndex.from_arrays(dims, names=['dim1', 'dim2'])
ds = xr.concat(ds, dim=dims)
# fails with
# AttributeError: 'DataArray' object has no attribute 'data_vars'
ds = xr.auto_combine(ds)
```
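One workaround that does produce the `|2x3x4|` result is nested concatenation. The sketch below assumes the list `ds` is built in the row-major order of `itertools.product` above: first concatenate along `dim2` within each group sharing a `dim1` value, then concatenate the resulting rows along `dim1`:
```python
import itertools
import numpy as np
import xarray as xr

ds = []
for vals_dim1, vals_dim2 in itertools.product(range(2), range(3)):
    ds.append(xr.DataArray(np.random.rand(1, 1, 4),
                           coords={'dim1': [vals_dim1], 'dim2': [vals_dim2],
                                   'dim3': range(4)},
                           dims=['dim1', 'dim2', 'dim3']))

# Concatenate along dim2 within each group that shares a dim1 value,
# then concatenate the resulting rows along dim1.
rows = [xr.concat(ds[i:i + 3], dim='dim2') for i in range(0, 6, 3)]
combined = xr.concat(rows, dim='dim1')
```
Newer xarray releases also provide `xr.combine_by_coords`, which can often infer this arrangement from the coordinate values instead of relying on list order.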
#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-43-Microsoft
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
xarray: 0.10.2
pandas: 0.22.0
numpy: 1.14.2
scipy: 1.0.0
netCDF4: 1.3.1
h5netcdf: None
h5py: None
Nio: None
zarr: None
bottleneck: None
cyordereddict: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
setuptools: 38.5.1
pip: 10.0.1
conda: None
pytest: 3.4.2
IPython: 6.2.1
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2095/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue