id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
117039129,MDU6SXNzdWUxMTcwMzkxMjk=,659,groupby very slow compared to pandas,1322974,closed,0,,,9,2015-11-16T02:43:57Z,2022-05-15T02:38:30Z,2022-05-15T02:38:30Z,CONTRIBUTOR,,,,"```
import timeit
import numpy as np
from pandas import DataFrame
from xray import Dataset, DataArray

df = DataFrame({""a"": np.r_[np.arange(500.), np.arange(500.)],
                ""b"": np.arange(1000.)})
print(timeit.repeat('df.groupby(""a"").agg(""mean"")', globals={""df"": df}, number=10))
print(timeit.repeat('df.groupby(""a"").agg(np.mean)', globals={""df"": df, ""np"": np}, number=10))

ds = Dataset({""a"": DataArray(np.r_[np.arange(500.), np.arange(500.)]),
              ""b"": DataArray(np.arange(1000.))})
print(timeit.repeat('ds.groupby(""a"").mean()', globals={""ds"": ds}, number=10))
```

This outputs

```
[0.010462284000823274, 0.009770361997652799, 0.01081446700845845]
[0.02622630601399578, 0.024328112005605362, 0.018717073995503597]
[2.2804569930012804, 2.1666158599982737, 2.2688316510029836]
```

i.e. xray's groupby is ~100 times slower than pandas' with `np.mean` (and ~200 times slower than passing `""mean""` to pandas' groupby, which I assume involves some specialization).

(This is the actual order of magnitude of the data size and redundancy I want to handle, i.e. thousands of points with very limited duplication.)
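
For reference, here is a minimal NumPy-only sketch of the same grouped mean. It is not how xray or pandas implement groupby; `grouped_mean` is a hypothetical helper introduced only for illustration, and it assumes 1-D keys and values of equal length. It gives a rough idea of what a specialized path for simple reductions could look like.

```python
import numpy as np


def grouped_mean(keys, values):
    # Hypothetical helper (not part of xray or pandas).
    # Map every key to a dense integer group id in [0, n_groups).
    uniq, inverse = np.unique(keys, return_inverse=True)
    # Sum the values falling into each group, then count the group sizes.
    sums = np.bincount(inverse, weights=values, minlength=uniq.size)
    counts = np.bincount(inverse, minlength=uniq.size)
    return uniq, sums / counts


# Same data layout as above: 500 distinct keys, each appearing twice.
a = np.r_[np.arange(500.), np.arange(500.)]
b = np.arange(1000.)
groups, means = grouped_mean(a, b)
```

Every group here contains `b[i]` and `b[i + 500]`, so the mean for key `i` is `i + 250`.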
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/659/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
111795064,MDU6SXNzdWUxMTE3OTUwNjQ=,627,string coordinate gets converted to object coordinate upon addition of variable to dataset,1322974,closed,0,,,10,2015-10-16T09:29:58Z,2021-03-27T21:19:33Z,2021-03-27T21:19:33Z,CONTRIBUTOR,,,,"With the current HEAD, consider

```
import numpy as np
from xray import *

ds = Dataset({""1"": DataArray(np.zeros(3), dims=[""a""], coords={""a"": list(""xyz"")})})
print(ds)
ds[""2""] = DataArray(np.zeros(2), dims=[""a""], coords={""a"": list(""xy"")})
print(ds)
```

This outputs

```
<xray.Dataset>
Dimensions:  (a: 3)
Coordinates:
  * a        (a) <U1 'x' 'y' 'z'
Data variables:
    1        (a) float64 0.0 0.0 0.0
<xray.Dataset>
Dimensions:  (a: 3)
Coordinates:
  * a        (a) object 'x' 'y' 'z'
Data variables:
    1        (a) float64 0.0 0.0 0.0
    2        (a) float64 0.0 0.0 nan
```

Note that the dtype of the `a` coordinate got changed after the assignment.

Python3.5, numpy 1.10.1, xray master (6ea7eb2b388075cc838c5ddf0ddaa47020cfcb89)

With 0.6.0 the coordinate is of object dtype both before and after.  I forgot why I tried master but I must have had a good reason...
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/627/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
125708367,MDU6SXNzdWUxMjU3MDgzNjc=,712,DataArrays should display their coordinates in the natural order,1322974,open,0,,,13,2016-01-08T22:33:05Z,2020-11-06T18:48:54Z,,CONTRIBUTOR,,,,"Consider

```
from collections import *
import numpy as np
from xray import *

d1 = DataArray(np.empty((2, 2)), coords=OrderedDict([(""foo"", [0, 1]), (""bar"", [0, 1])]))
d2 = DataArray(np.empty((2, 2)), coords=OrderedDict([(""bar"", [0, 1]), (""foo"", [0, 1])]))

ds = Dataset({""d1"": d1, ""d2"": d2})

print(ds.d1)
print(ds.d2)
```

This outputs

```
<xray.DataArray 'd1' (foo: 2, bar: 2)>
array([[  6.91516848e-310,   1.64244654e-316],
       [  6.91516881e-310,   6.91516881e-310]])
Coordinates:
  * foo      (foo) int64 0 1
  * bar      (bar) int64 0 1
<xray.DataArray 'd2' (bar: 2, foo: 2)>
array([[  1.59987863e-316,   6.91516883e-310],
       [  6.91515690e-310,   2.12670320e-316]])
Coordinates:
  * foo      (foo) int64 0 1
  * bar      (bar) int64 0 1
```

I understand that internally both DataArrays use the same coords object and thus the same coords order, but it would be helpful if, when printing d2 by itself, the coordinates were printed in the natural order (""bar"", ""foo"").  In particular, when working interactively, the list of coordinates at the end of the repr is the easiest thing to spot, and thus the most helpful for working out how to format the call to `array.loc[...]`.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/712/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
112254767,MDU6SXNzdWUxMTIyNTQ3Njc=,631,Confusing error (or lack thereof) when coordinate and variable share the same name,1322974,open,0,,,5,2015-10-19T23:39:22Z,2019-04-19T15:39:55Z,,CONTRIBUTOR,,,,"It probably makes sense to prevent a dataset from having variables that share the names of coordinates (what would `dataset.varname` return?) but currently

```
Dataset({""a"": DataArray(np.zeros((3, 4)), dims=[""a"", ""b""],
                        coords={""a"": list(""xyz""), ""b"": list(""xyzt"")})})
```

fails with `ValueError: an index variable must be defined with 1-dimensional data`, and

```
Dataset({""a"": DataArray(np.zeros(3), coords={""a"": list(""xyz"")})})
```

actually creates an empty dataset using `[0, 0, 0]` as values for the `a` coordinate instead of `x y z`:

```
<xray.Dataset>
Dimensions:  (a: 3)
Coordinates:
  * a        (a) float64 0.0 0.0 0.0
Data variables:
    *empty*
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/631/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
112253425,MDU6SXNzdWUxMTIyNTM0MjU=,630,Whether a DataArray is copied when inserted into a Dataset depends on whether coordinates match exactly,1322974,open,0,,,16,2015-10-19T23:27:15Z,2019-01-31T18:40:58Z,,CONTRIBUTOR,,,,"Consider

```
import numpy as np
from xray import *

ds = Dataset({""a"": DataArray(np.zeros((3, 4)))})
ds[""b""] = b = DataArray(np.zeros((3, 4)))
b[0, 0] = 1
print(ds[""b""][0, 0]) # ==> prints 1

ds = Dataset({""a"": DataArray(np.zeros((3, 4)))})
ds[""b""] = b = DataArray(np.zeros((3, 3)))  # !!! we implicitly fill the last column with nans.
b[0, 0] = 1
print(ds[""b""][0, 0]) # ==> prints 0
```

In the first case, the dataset was modified when the dataarray was modified, but not in the second case.
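
A plain-NumPy sketch of what I assume is going on (an assumption on my part, not confirmed against the xray source): when the coordinates match, the dataset can keep a reference to the same buffer, whereas aligning a smaller array has to allocate a new NaN-padded array, which is necessarily a copy. The names `stored` and `padded` are only illustrative.

```python
import numpy as np

# Case 1: shapes match, so the container can hold the very same buffer.
b = np.zeros((3, 4))
stored = b                      # stands in for the dataset keeping a reference
b[0, 0] = 1                     # the write is visible through `stored`

# Case 2: shapes differ, so a new NaN-padded array has to be allocated.
c = np.zeros((3, 3))
padded = np.full((3, 4), np.nan)
padded[:, :3] = c               # copies the data; the last column stays NaN
c[0, 0] = 1                     # the write is NOT visible through `padded`

shares_first = np.shares_memory(b, stored)
shares_second = np.shares_memory(c, padded)
```

Here `shares_first` is true and `shares_second` is false, which mirrors the two results printed above.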
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/630/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
114732169,MDU6SXNzdWUxMTQ3MzIxNjk=,643,"""naive"" iteration is very slow",1322974,closed,0,,,2,2015-11-03T02:53:04Z,2019-01-15T21:09:07Z,2019-01-15T21:09:07Z,CONTRIBUTOR,,,,"```
$ ipython
Python 3.5.0 (default, Sep 20 2015, 11:28:25) 
Type ""copyright"", ""credits"" or ""license"" for more information.

IPython 4.0.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
Using matplotlib backend: Qt4Agg

In [1]: from xray import DataArray

# Iteration over a Python list
In [2]: %%timeit t = list(range(10000))
for _ in t: pass
   ...: 
10000 loops, best of 3: 87.3 µs per loop

# Iteration over a ndarray
In [3]: %%timeit t = np.arange(10000)
for _ in t: pass
   ...: 
1000 loops, best of 3: 472 µs per loop

# Iteration over a DataArray
In [4]: %%timeit t = DataArray(np.arange(10000))
for _ in t: pass
   ...: 
1 loops, best of 3: 818 ms per loop
```

I'm not sure how much can be done about this, as iterating over a DataArray needs to create a bunch of temporary objects (and I understand the emphasis is as usual on vectorized operations, etc.), but a >1500-fold difference relative to iterating over the ndarray certainly doesn't look good.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/643/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
170458908,MDU6SXNzdWUxNzA0NTg5MDg=,958,Test failure with matplotlib 2.0b3,1322974,closed,0,,,1,2016-08-10T16:21:16Z,2018-10-26T23:12:28Z,2018-10-26T23:12:28Z,CONTRIBUTOR,,,,"mpl 2.0b3 / xarray HEAD
Arch Linux, Python 3.5.2

```
============================================================================================= FAILURES =============================================================================================
____________________________________________________________________________________ TestPlot.test_subplot_kws _____________________________________________________________________________________

self = <xarray.test.test_plot.TestPlot testMethod=test_subplot_kws>

    def test_subplot_kws(self):
        a = easy_array((10, 15, 4))
        d = DataArray(a, dims=['y', 'x', 'z'])
        d.coords['z'] = list('abcd')
        g = d.plot(x='x', y='y', col='z', col_wrap=2, cmap='cool',
                   subplot_kws=dict(axisbg='r'))
        for ax in g.axes.flat:
>           self.assertEqual(ax.get_axis_bgcolor(), 'r')

xarray/test/test_plot.py:148: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <xarray.test.test_plot.TestPlot testMethod=test_subplot_kws>, a1 = (1.0, 0.0, 0.0, 1), a2 = 'r'

    def assertEqual(self, a1, a2):
>       assert a1 == a2 or (a1 != a1 and a2 != a2)
E       AssertionError: assert ((1.0, 0.0, 0.0, 1) == 'r' or ((1.0, 0.0, 0.0, 1) != (1.0, 0.0, 0.0, 1)))

xarray/test/__init__.py:164: AssertionError
--------------------------------------------------------------------------------------- Captured stderr call ---------------------------------------------------------------------------------------
/usr/lib/python3.5/site-packages/matplotlib/cbook.py:137: MatplotlibDeprecationWarning: The axisbg attribute was deprecated in version 2.0. Use facecolor instead.
  warnings.warn(message, mplDeprecation, stacklevel=1)
/home/antony/src/extern/xarray/xarray/test/test_plot.py:148: MatplotlibDeprecationWarning: The get_axis_bgcolor function was deprecated in version 2.0. Use get_facecolor instead.
  self.assertEqual(ax.get_axis_bgcolor(), 'r')
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/958/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
117297089,MDExOlB1bGxSZXF1ZXN0NTA5MTEzMzQ=,661,Document pandas' better groupby performance.,1322974,closed,0,,,1,2015-11-17T07:04:50Z,2015-11-17T09:10:04Z,2015-11-17T08:54:31Z,CONTRIBUTOR,,0,pydata/xarray/pulls/661,"cf. #659.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/661/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull