html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/1412#issuecomment-303984274,https://api.github.com/repos/pydata/xarray/issues/1412,303984274,MDEyOklzc3VlQ29tbWVudDMwMzk4NDI3NA==,6815844,2017-05-25T11:04:55Z,2017-05-25T11:04:55Z,MEMBER,Replaced by a new PR #1426.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,229370997
https://github.com/pydata/xarray/pull/1412#issuecomment-302919742,https://api.github.com/repos/pydata/xarray/issues/1412,302919742,MDEyOklzc3VlQ29tbWVudDMwMjkxOTc0Mg==,6815844,2017-05-21T07:13:37Z,2017-05-21T07:13:37Z,MEMBER,"@benbovy Thanks for the valuable comments. Actually, I cannot yet fully imagine what the actual implementation would look like, but I also think the virtual variable access needs some tricks. This is essential functionality of the MultiIndex coordinate, so I will try to investigate it. Thanks.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,229370997
https://github.com/pydata/xarray/pull/1412#issuecomment-302883780,https://api.github.com/repos/pydata/xarray/issues/1412,302883780,MDEyOklzc3VlQ29tbWVudDMwMjg4Mzc4MA==,4160723,2017-05-20T16:31:43Z,2017-05-20T16:31:43Z,MEMBER,"I also agree that a MultiIndex wrapper would be the way to go. I think I understand the diagrams, but what remains a bit unclear to me is how this could be implemented. In particular, how would this wrapper work with `IndexVariable`? Currently, `IndexVariable` wraps either a `pandas.Index` or a `pandas.MultiIndex`, and for the latter case `IndexVariable.get_level_variable` can generate new `IndexVariable` objects so that MultiIndex levels are accessible as ""virtual coordinates"". 
Would `IndexVariable` wrap a MultiIndex wrapper instead (levels + scalars), and also be able to generate new scalar `Variable` objects that will be accessible as virtual coordinates? This is maybe slightly off topic, but more generally I'm also wondering how this would co-exist with potentially other kinds of multi-level indexes (see this [comment](https://github.com/pydata/xarray/issues/475#issuecomment-241821366)). ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,229370997
https://github.com/pydata/xarray/pull/1412#issuecomment-302878939,https://api.github.com/repos/pydata/xarray/issues/1412,302878939,MDEyOklzc3VlQ29tbWVudDMwMjg3ODkzOQ==,1217238,2017-05-20T15:11:24Z,2017-05-20T15:11:24Z,MEMBER,"@fujiisoup Yes, the solution of writing a MultiIndex wrapper for xarray looks much cleaner to me. I like the look of this proposal! (Those diagrams are also very helpful.) I guess this could be implemented as a `pandas.MultiIndex` along with a list of scalar coordinates?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,229370997
https://github.com/pydata/xarray/pull/1412#issuecomment-302693433,https://api.github.com/repos/pydata/xarray/issues/1412,302693433,MDEyOklzc3VlQ29tbWVudDMwMjY5MzQzMw==,6815844,2017-05-19T12:47:28Z,2017-05-19T12:47:28Z,MEMBER,"I also agree. It seems too magical. But I slightly changed my mind. I realized that what I really want is not a particular scalar coordinate in MultiIndex, but a 'unified' interface between a normal `Variable` and `MultiIndex`. The current structure is illustrated as follows: ![current_coord](https://cloud.githubusercontent.com/assets/6815844/26248226/5eb1a958-3cdc-11e7-920a-aead58be2b49.png) The `MultiIndex` has different characteristics from a normal `Variable`. For example, if we do `ds.sel(x=2)`, it makes a scalar coordinate and a normal Variable. 
The backward process might be `.expand_dims().stack()`. This is different from normal `Variable` behavior. Because of this, MultiIndex has to be treated in a special way in every place. (Deprecating the automatic renaming does not change things much.) I am wondering whether things would become simpler if we had the following class structure: ![my_proposal](https://cloud.githubusercontent.com/assets/6815844/26248233/64ece2ce-3cdc-11e7-8897-893cfb5f359b.png) In this picture, MultiIndex can have a `scalar` as its level and `.isel()` produces it. This process can be traced backward by `.expand_dims()` or `.concat()` as in a normal `Variable`. I understand it is different from the `pandas.MultiIndex` structure, and we need to expand our wrapper extensively if we decide to realize it (as written in red). But I feel this symmetric structure could make it easy to expand `MultiIndex` functionality in the future. Any thoughts are welcome. (Should we move the discussion to another issue?)","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,229370997
https://github.com/pydata/xarray/pull/1412#issuecomment-302425186,https://api.github.com/repos/pydata/xarray/issues/1412,302425186,MDEyOklzc3VlQ29tbWVudDMwMjQyNTE4Ng==,1217238,2017-05-18T14:42:57Z,2017-05-18T14:42:57Z,MEMBER,"> To me it looks like it is a bit too magical, but just wondering what you think... 
Agreed, this also seems too magical to me.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,229370997
https://github.com/pydata/xarray/pull/1412#issuecomment-302358284,https://api.github.com/repos/pydata/xarray/issues/1412,302358284,MDEyOklzc3VlQ29tbWVudDMwMjM1ODI4NA==,4160723,2017-05-18T09:56:28Z,2017-05-18T09:58:06Z,MEMBER,"A possible direction to reduce the `if` statements in many different places would be to just return `pos_indexers` in `indexing.remap_level_indexers` - as was the case before adding multi-index support - and instead put in `Dataset.isel` all the logic for checking `MultiIndex` and maybe converting it to `Index` and/or scalar coordinates and maybe renaming the dimension. This would simplify many things, although I haven't thought about all the other possible issues it would create (performance, etc.). Also, `DataArray.loc` doesn't seem to use `Dataset.isel`. Here is another case related to this PR. From the example in the linked issue, the current behavior is

```python
In [9]: ds.isel(yx=[0, 1])
Out[9]:
Dimensions:  (yx: 2)
Coordinates:
  * yx       (yx) MultiIndex
  - y        (yx) object 'a' 'a'
  - x        (yx) int64 1 2
Data variables:
    foo      (yx) int64 1 2
```

Do we want to also change the behavior to this?

```python
In [10]: ds.isel(yx=[0, 1])
Out[10]:
Dimensions:  (x: 2)
Coordinates:
  * x        (x) int64 1 2
    y        object 'a'
Data variables:
    foo      (x) int64 1 2
```

To me it looks like it is a bit too magical, but just wondering what you think... ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,229370997
https://github.com/pydata/xarray/pull/1412#issuecomment-302299557,https://api.github.com/repos/pydata/xarray/issues/1412,302299557,MDEyOklzc3VlQ29tbWVudDMwMjI5OTU1Nw==,6815844,2017-05-18T04:48:29Z,2017-05-18T04:48:29Z,MEMBER,"@shoyer Thanks for the comment. 
> It breaks an important invariant, which is that indexing a Variable returns another Variable.

I totally agree with you. In the last commit, I moved the unpacking functionality into `Dataset`, and reverted the modification I had made to the `Variable` class. I think the current version is cleaner than my previous one, but I'm not yet comfortable with it. There are a lot of functions or `if`-statements related to `MultiIndex` in different places. I guess they should be bundled in one place. Adding functions is easy but simplifying them is difficult... If anyone can suggest a direction, I will try to improve it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,229370997
https://github.com/pydata/xarray/pull/1412#issuecomment-302170558,https://api.github.com/repos/pydata/xarray/issues/1412,302170558,MDEyOklzc3VlQ29tbWVudDMwMjE3MDU1OA==,1217238,2017-05-17T17:42:58Z,2017-05-17T17:42:58Z,MEMBER,"> variable.__getitem__ now returns an OrderedDict if a single element is selected from MultiIndex.

I don't like this change. It breaks an important invariant, which is that indexing a Variable returns another Variable. I do agree that indexing along a MultiIndex dimension should unpack the tuple for coordinates, but only for coordinates. So this needs to be somewhere in the `Dataset.isel` logic, not `Variable.isel`. Consider indexing `ds['yx']` from your example in the linked issue. 
With the current version of xarray:

```
In [7]: ds['yx']
Out[7]:
array([('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3)], dtype=object)
Coordinates:
  * yx       (yx) MultiIndex
  - y        (yx) object 'a' 'a' 'a' 'b' 'b' 'b'
  - x        (yx) int64 1 2 3 1 2 3

In [8]: ds['yx'][0]
Out[8]:
array(('a', 1), dtype=object)
Coordinates:
    yx       object ('a', 1)
```

We want to change the indexing behavior to this:

```
In [8]: ds['yx'][0]
Out[8]:
array(('a', 1), dtype=object)
Coordinates:
    y        object 'a'
    x        int64 1
```

But we don't want to change what happens to the DataArray itself -- it should still be a scalar object array. I tested this example on your PR branch, and it actually crashes with `KeyError`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,229370997