html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/4186#issuecomment-652032780,https://api.github.com/repos/pydata/xarray/issues/4186,652032780,MDEyOklzc3VlQ29tbWVudDY1MjAzMjc4MA==,1217238,2020-06-30T20:44:00Z,2020-06-30T20:44:00Z,MEMBER,"> > My concern was when another person works on this and didn't get the context that `idx` might be different from `dataframe.index` and new bugs could potentially be introduced
> 
> Let me see if I can rewrite the helper functions to avoid passing around a `DataFrame`

This was a good suggestion. Done in https://github.com/pydata/xarray/pull/4184/commits/96b544b5a59894359a35680151af71c0226f0505","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646716560
https://github.com/pydata/xarray/issues/4186#issuecomment-652018527,https://api.github.com/repos/pydata/xarray/issues/4186,652018527,MDEyOklzc3VlQ29tbWVudDY1MjAxODUyNw==,1217238,2020-06-30T20:13:44Z,2020-06-30T20:13:44Z,MEMBER,"> My concern was when another person works on this and didn't get the context that `idx` might be different from `dataframe.index` and new bugs could potentially be introduced

Let me see if I can rewrite the helper functions to avoid passing around a `DataFrame`","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646716560
https://github.com/pydata/xarray/issues/4186#issuecomment-651905098,https://api.github.com/repos/pydata/xarray/issues/4186,651905098,MDEyOklzc3VlQ29tbWVudDY1MTkwNTA5OA==,1217238,2020-06-30T16:29:10Z,2020-06-30T16:44:02Z,MEMBER,"@Li9htmare I'm not sure I follow your example. #4184 does remove the use of `DataFrame.set_index()`, but it also removes any subsequent use of `dataframe.index` -- it always uses the separately processed index.

Is there something specific that you are worried about going wrong with your latest example? For what it's worth, here's what `to_xarray()` does with the current version of #4184:
```
In [4]: df.to_xarray()
Out[4]:
<xarray.Dataset>
Dimensions:  (lev1: 2, lev2: 1)
Coordinates:
  * lev1     (lev1) object 'b' 'a'
  * lev2     (lev2) object 'foo'
Data variables:
    C1       (lev1, lev2) int64 0 2
    C2       (lev1, lev2) int64 1 3

In [5]: df.to_xarray().indexes
Out[5]:
lev1: CategoricalIndex(['b', 'a'], categories=['b', 'a'], ordered=True, name='lev1', dtype='category')
lev2: Index(['foo'], dtype='object', name='lev2')
```

I *think* this is doing the right thing already?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646716560
https://github.com/pydata/xarray/issues/4186#issuecomment-651467248,https://api.github.com/repos/pydata/xarray/issues/4186,651467248,MDEyOklzc3VlQ29tbWVudDY1MTQ2NzI0OA==,1217238,2020-06-30T01:41:36Z,2020-06-30T01:41:36Z,MEMBER,"The sorting seems to be a separate matter, caused by `dataframe.set_index()` inside our `remove_unused_levels_categories` function. I think we can remove that, which will fix the sorting issue when removing unused levels. Then the result will be the desired:
```
df.to_xarray()
 <xarray.Dataset>
Dimensions:  (lev1: 2, lev2: 1)
Coordinates:
  * lev1     (lev1) object 'b' 'a'
  * lev2     (lev2) object 'foo'
Data variables:
    C1       (lev1, lev2) int64 0 2
    C2       (lev1, lev2) int64 1 3
```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646716560
https://github.com/pydata/xarray/issues/4186#issuecomment-651458105,https://api.github.com/repos/pydata/xarray/issues/4186,651458105,MDEyOklzc3VlQ29tbWVudDY1MTQ1ODEwNQ==,1217238,2020-06-30T01:14:45Z,2020-06-30T01:14:45Z,MEMBER,"Actually, I realize now that this is basically the same issue as https://github.com/pydata/xarray/issues/2619

If I remove the use of `remove_unused_levels_categories` from `from_dataframe`, then I get the same behavior that we considered a bug in that issue:
```
In [5]: ds.isel(xy=ds['x'] < 4).to_pandas().to_xarray()
Out[5]:
<xarray.DataArray (x: 8, y: 5)>
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14.],
       [15., 16., 17., 18., 19.],
       [nan, nan, nan, nan, nan],
       [nan, nan, nan, nan, nan],
       [nan, nan, nan, nan, nan],
       [nan, nan, nan, nan, nan]])
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7
  * y        (y) int64 0 1 2 3 4
```

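For context, here is a minimal pandas-only sketch of what an unused level looks like and how `MultiIndex.remove_unused_levels()` drops it. The toy frame and the names `sub` and `trimmed` are assumptions made up for illustration, not the data from this issue, and the sketch does not try to reproduce the ordering behavior discussed here:

```python
import pandas as pd

# Hypothetical toy frame with a two-level MultiIndex (not the frame from this issue).
idx = pd.MultiIndex.from_product([['b', 'a'], ['foo', 'bar']], names=['lev1', 'lev2'])
df = pd.DataFrame({'C1': range(4)}, index=idx)

# Keep only the rows where lev2 == 'foo'.
sub = df[df.index.get_level_values('lev2') == 'foo']

# The filtered index still lists both 'bar' and 'foo' as levels of lev2,
# even though 'bar' no longer appears in any row.
print(list(sub.index.levels[1]))

# remove_unused_levels() returns a new MultiIndex without the unused 'bar' level.
trimmed = sub.index.remove_unused_levels()
print(list(trimmed.levels[1]))
```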
So maybe it is more consistent to keep calling `remove_unused_levels()`, which somewhat surprisingly sorts MultiIndex levels.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646716560
https://github.com/pydata/xarray/issues/4186#issuecomment-651454795,https://api.github.com/repos/pydata/xarray/issues/4186,651454795,MDEyOklzc3VlQ29tbWVudDY1MTQ1NDc5NQ==,6815844,2020-06-30T01:06:34Z,2020-06-30T01:06:34Z,MEMBER,I agree that it's better not to sort.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646716560
https://github.com/pydata/xarray/issues/4186#issuecomment-651453863,https://api.github.com/repos/pydata/xarray/issues/4186,651453863,MDEyOklzc3VlQ29tbWVudDY1MTQ1Mzg2Mw==,1217238,2020-06-30T01:03:40Z,2020-06-30T01:03:40Z,MEMBER,"I verified that #4184 fixes the tests added for #3953 even after removing the call to `remove_unused_levels_categories()`.

The main question is what behavior we want: should `from_dataframe` preserve index levels exactly, or should it sort them first?

I think it's better not to sort (but of course it's better to sort than to get the wrong order).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646716560
https://github.com/pydata/xarray/issues/4186#issuecomment-651438776,https://api.github.com/repos/pydata/xarray/issues/4186,651438776,MDEyOklzc3VlQ29tbWVudDY1MTQzODc3Ng==,6815844,2020-06-30T00:21:43Z,2020-06-30T00:21:43Z,MEMBER,"I think #3953 fixes the case where the MultiIndex has unused levels.
I had no better idea than #3953, but if it works without #3953, it would be better ;)","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646716560
https://github.com/pydata/xarray/issues/4186#issuecomment-651428394,https://api.github.com/repos/pydata/xarray/issues/4186,651428394,MDEyOklzc3VlQ29tbWVudDY1MTQyODM5NA==,1217238,2020-06-29T23:51:49Z,2020-06-29T23:51:49Z,MEMBER,"Thanks for clarifying!

This raises an interesting question for #4184: do we want to keep @fujiisoup's fix from #3953 or not?

If we remove @fujiisoup's fix, then the output we see is:
```
df.to_xarray()
 <xarray.Dataset>
Dimensions:  (lev1: 2, lev2: 1)
Coordinates:
  * lev1     (lev1) object 'b' 'a'
  * lev2     (lev2) object 'foo'
Data variables:
    C1       (lev1, lev2) int64 0 2
    C2       (lev1, lev2) int64 1 3
```

This is also *correct* -- coordinates match up with values -- but the order of the result is different from what is currently on master.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646716560
https://github.com/pydata/xarray/issues/4186#issuecomment-651402838,https://api.github.com/repos/pydata/xarray/issues/4186,651402838,MDEyOklzc3VlQ29tbWVudDY1MTQwMjgzOA==,1217238,2020-06-29T22:28:00Z,2020-06-29T22:28:00Z,MEMBER,"Hi @pzhlobi @Li9htmare -- thanks for raising this issue.

Could you kindly clarify exactly how you think xarray *should* behave? The results are indeed reordered currently, but as far as I can tell the pairing between coordinates and values remains consistent.

When I test this myself, I see the same behavior (documented in the first post) either with or without my changes from #4184.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,646716560