html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/906#issuecomment-269507466,https://api.github.com/repos/pydata/xarray/issues/906,269507466,MDEyOklzc3VlQ29tbWVudDI2OTUwNzQ2Ng==,1217238,2016-12-28T17:09:23Z,2016-12-28T17:09:23Z,MEMBER,"@crusaderky can you raise the issue again on the pandas issue tracker (see my comment in https://github.com/pandas-dev/pandas/issues/14903#issuecomment-267779151)? If need be, we can change this separately, but all things being equal I would prefer to keep `unstack()` consistent between pandas and xarray.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,166439490
https://github.com/pydata/xarray/issues/906#issuecomment-234686759,https://api.github.com/repos/pydata/xarray/issues/906,234686759,MDEyOklzc3VlQ29tbWVudDIzNDY4Njc1OQ==,1217238,2016-07-23T00:24:17Z,2016-07-23T00:24:17Z,MEMBER,"@crusaderky gist.github.com will render ipynb files, which makes them much easier to view!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,166439490
https://github.com/pydata/xarray/issues/906#issuecomment-233994941,https://api.github.com/repos/pydata/xarray/issues/906,233994941,MDEyOklzc3VlQ29tbWVudDIzMzk5NDk0MQ==,1217238,2016-07-20T15:58:15Z,2016-07-20T15:58:15Z,MEMBER,"Here are two examples where we would need to do pick-by-index on the data no matter what:

``` python
import pandas

def demo_unstack(index):
    # Build a two-level MultiIndex and unstack the innermost level ('count').
    index = pandas.MultiIndex.from_tuples(index, names=['x', 'count'])
    s = pandas.Series(list(range(len(index))), index)
    print(s.unstack())
```

No matter how the labels are ordered, one or more of the levels would not be sorted:

``` python
demo_unstack([
    ['x0', 'first' ],
    ['x0', 'second'],
    ['x0', 'third' ],
    ['x1', 'third' ],
    ['x1', 'second'],
    ['x1', 'first' ],
])
```

```
count  first  second  third
x                          
x0         0       1      2
x1         5       4      3
```

Even more pathological: the multi-index doesn't even fill out every value in the cartesian product:

``` python
demo_unstack([
    ['x1', 'first' ],
    ['x1', 'second'],
    ['x0', 'first' ],
])
```

```
count  first  second
x                   
x0       2.0     NaN
x1       0.0     1.0
```
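
To make the pick-by-index point concrete, here is a minimal sketch that rebuilds the second example by scattering each value to the slot given by its integer codes. It assumes pandas >= 0.24 (where the per-level integer positions are exposed as `index.codes`) and is only an illustration of the access pattern, not the actual pandas or xarray implementation:

```python
import numpy as np
import pandas

index = pandas.MultiIndex.from_tuples(
    [('x1', 'first'), ('x1', 'second'), ('x0', 'first')],
    names=['x', 'count'],
)
s = pandas.Series(list(range(len(index))), index)

# index.codes (called labels before pandas 0.24) holds, for each row,
# the integer position of its label within each level.
row_codes, col_codes = (np.asarray(c) for c in index.codes)
shape = tuple(len(level) for level in index.levels)

# Scatter each value to its (row, column) slot; combinations that never
# occur in the index keep the NaN fill value.
out = np.full(shape, np.nan)
out[row_codes, col_codes] = s.to_numpy()
print(out)
```

Because the target positions come from the codes rather than from the memory layout, a plain `reshape()` cannot produce this result.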
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,166439490
https://github.com/pydata/xarray/issues/906#issuecomment-233797167,https://api.github.com/repos/pydata/xarray/issues/906,233797167,MDEyOklzc3VlQ29tbWVudDIzMzc5NzE2Nw==,1217238,2016-07-19T23:29:57Z,2016-07-19T23:29:57Z,MEMBER,"> You're basically doing a pick-by-index rebuild of the array, which does potentially random access to the whole input array - thus nullifying the benefits of the CPU cache. This is compared to a numpy.ndarray.reshape(), which has the cost of a memcpy().

This is true, but in the worst case (e.g., random order for the MultiIndex) we'll have this issue no matter what rule we pick for assigning unstacked coordinates.

> I was going to add something about doing pick-by-index with a dask array will be even worse, when I realised that multiindex does not work at all when you chunk()... :(

MultiIndex _should_ work with dask -- we have a few tests for this. If not, a bug report would be appreciated!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,166439490
https://github.com/pydata/xarray/issues/906#issuecomment-233796557,https://api.github.com/repos/pydata/xarray/issues/906,233796557,MDEyOklzc3VlQ29tbWVudDIzMzc5NjU1Nw==,1217238,2016-07-19T23:26:33Z,2016-07-19T23:26:33Z,MEMBER,"What behavior would you suggest as an alternative? I suppose that in principle we could assign new levels based on order of appearance (and treat `levels` as an implementation detail), but it's worth noting that this behavior for `unstack()` matches how pandas works:

```
>>> s.unstack()
count  first  fourth  second  third
x                                  
x0         4       7       5      6
x1         0       3       1      2
```
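
For reference, the two candidate rules (order of appearance versus sorted order) can be compared with `pandas.factorize`. This is only a sketch to show how the resulting level order differs; the label values are made up, and it is not how `unstack()` is implemented:

```python
import numpy as np
import pandas

# Hypothetical labels for one level, in the order they appear in the data.
labels = np.array(['second', 'first', 'third', 'second'], dtype=object)

# Order of appearance: uniques are listed as first encountered.
codes, uniques = pandas.factorize(labels)
print(list(uniques), list(codes))

# Sorted order, which is what a default MultiIndex level looks like.
codes_sorted, uniques_sorted = pandas.factorize(labels, sort=True)
print(list(uniques_sorted), list(codes_sorted))
```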
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,166439490
https://github.com/pydata/xarray/issues/906#issuecomment-233776163,https://api.github.com/repos/pydata/xarray/issues/906,233776163,MDEyOklzc3VlQ29tbWVudDIzMzc3NjE2Mw==,1217238,2016-07-19T21:45:33Z,2016-07-19T21:45:33Z,MEMBER,"`unstack` sorts the data [by the order of labels](https://github.com/pydata/xarray/blob/7a9e84b5708d3e8ec270a7415f9b5e54d30f13f7/xarray/core/dataset.py#L1417) on the `levels` attribute on the MultiIndex. We don't calculate the order when calling `unstack`, so there shouldn't be any performance concerns on this side.

By default, pandas.MultiIndex creates each level in `levels` in sorted order, which is sometimes necessary to ensure indexing (especially slicing) works properly. But if you like, you can control this explicitly by using the [MultiIndex constructor](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.MultiIndex.html) directly, e.g., `index = pandas.MultiIndex(levels, labels)`. Does that solve your use case here?
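
A minimal sketch of building the index with an explicit level order. It assumes pandas >= 0.24, where the `labels` argument of the constructor is named `codes`; the level names and data values are made up for illustration:

```python
import pandas

# Levels listed in the order we want to keep, not in sorted order.
levels = [['x0', 'x1'], ['second', 'first']]
# codes[i][k] is the position in levels[i] of the label for row k.
codes = [[0, 0, 1, 1], [0, 1, 0, 1]]

# Assumption: pandas >= 0.24 (older versions spell this argument labels).
index = pandas.MultiIndex(levels=levels, codes=codes, names=['x', 'count'])
s = pandas.Series([10, 20, 30, 40], index=index)

print(list(index.levels[1]))
print(s.unstack())
```

The constructor keeps `levels` exactly as given, so the order is under your control rather than inferred from the data.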
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,166439490