issue_comments: 442956167

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/1603#issuecomment-442956167	https://api.github.com/repos/pydata/xarray/issues/1603	442956167	MDEyOklzc3VlQ29tbWVudDQ0Mjk1NjE2Nw==	1217238	2018-11-29T19:10:14Z	2018-11-29T19:10:14Z	MEMBER	Looking at the reported issues related to multi-indexes in xarray, I have the same feeling. Simply reusing pandas.MultiIndex in xarray, where slightly different semantics are generally expected, has proven to be painful. It seems easier to have our own baked solution and deal with differences during xarray <-> pandas conversion if needed. I think the pandas.MultiIndex is a pretty solid data structure on a fundamental level, it just has some weird semantics for some indexing edge cases. Whether or not we write an xarray.MultiIndex structure, we can achieve most of what we want with a thin layer over `pandas.MultiIndex`. If a variable for each multi-coordinate index is "just" for data schema consistency, then why not show all those indexes in a separate section of the repr? Yes, I like this! Generally I like @benbovy's entire proposal :). @fujiisoup can you clarify the use-cases you have for a MultiIndex as a variable? Am I right in thinking that multi-indexes are only a helpful note to users, rather than conveying anything about how data is accessed? From a data perspective, the only thing having an Index and/or MultiIndex should change is that the data is immutable. But by necessity the nature of the index will determine which indexing operations are possible/efficient. For example, if you want to do nearest-neighbor indexing with multiple coordinates you'll need a KDTree. We should not be afraid to raise errors if an indexing operation can't be done efficiently. With regards to reindexing: I don't think this needs any special handling versus normal indexing (`sel()`). The rules basically fall out of those for normal indexing, except we handle missing values differently (by filling with NaN).
Another issue: how do we do automatic alignment with multiple indexes? Let me suggest a straw-man proposal: We always align indexed coordinates. If a coordinate is used in different types of indexes (e.g., a base `Index` in one argument and a `MultiIndex` level in another), we can either: 1. create a `MultiIndex` with the variable on the fly (this could be slightly expensive), or 2. fall back to only supporting "exact" indexing	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		262642978
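
To make the reindexing point in the comment concrete, here is a minimal pure-Python sketch of the idea that reindexing can reuse the rules of exact label selection and differ only in how missing labels are handled. The helper names `sel` and `reindex` and the tuple-based stand-in for a multi-index are assumptions chosen for illustration; this is not xarray's or pandas' implementation.

```python
import math


def sel(index, values, labels):
    """Exact label lookup: raises KeyError if any requested label is absent."""
    position = {label: i for i, label in enumerate(index)}
    return [values[position[label]] for label in labels]


def reindex(index, values, labels, fill_value=math.nan):
    """Same lookup as sel(), except absent labels yield fill_value instead of an error."""
    position = {label: i for i, label in enumerate(index)}
    return [
        values[position[label]] if label in position else fill_value
        for label in labels
    ]


# Hypothetical two-level index, with each entry a (level_0, level_1) tuple.
index = [("a", 1), ("a", 2), ("b", 1)]
values = [10.0, 20.0, 30.0]

selected = sel(index, values, [("b", 1), ("a", 1)])       # all labels exist
reindexed = reindex(index, values, [("a", 2), ("b", 2)])  # ("b", 2) is missing
```

In this sketch the only difference between the two functions is the missing-label branch, which mirrors the claim that reindexing needs no special handling beyond filling with NaN.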