issue_comments: 442636798

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/1603#issuecomment-442636798	https://api.github.com/repos/pydata/xarray/issues/1603	442636798	MDEyOklzc3VlQ29tbWVudDQ0MjYzNjc5OA==	5635139	2018-11-28T22:54:26Z	2018-11-28T22:54:26Z	MEMBER	Potentially this is too much 'stepping back' now that we're at the implementation stage. My perception is that @shoyer is leading this without much support, so I'm weighting toward having some additional viewpoints. Some questions:

Is a MultiIndex a feature of the schema or the implementation?

I had thought of an MI as an implementation detail in code, rather than part of the data schema. We use it as a container for all the indexes along a dimension, rather than to represent any properties of the data it contains.

One exception to that would be if we wanted multiple groups of indexes along the same dimension, for example:

```
Coordinates:
  * xa         (x) MultiIndex[level_a_1, level_a_2]
  * level_a_1  (x) object 'a' 'a' 'b' 'b'
  * level_a_2  (x) int64 1 2 1 2
    xb         (x) MultiIndex[level_b_1, level_b_2]
    level_b_1  (x) object 'a' 'a' 'b' 'b'
    level_b_2  (x) int64 1 2 1 2
```

But is that common / required?

MultiIndex as an implementation detail

If it's an implementation detail, is there a benefit to investing in allowing both separate indexes and MIs?

While it may not be possible to do pointwise indexing with the current implementation of MI, am I mistaken that it's not an API issue, assuming we pass in index names?
e.g.:

```python
# assumes: import numpy as np; import pandas as pd; import xarray as xr
[ins] In [22]: da = xr.DataArray(np.arange(12).reshape((3, 4)), dims=['x', 'y'], coords=dict(x=list('abc'), y=pd.MultiIndex.from_product([list('ab'),[1,2]])))

[ins] In [23]: da
Out[23]:
<xarray.DataArray (x: 3, y: 4)>
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
Coordinates:
  * x          (x) <U1 'a' 'b' 'c'
  * y          (y) MultiIndex
  - y_level_0  (y) object 'a' 'a' 'b' 'b'
  - y_level_1  (y) int64 1 2 1 2

[ins] In [26]: da.sel(x=xr.DataArray(['a','c'],dims=['z']),
                      y_level_0=xr.DataArray(['a','b'],dims=['z']),
                      y_level_1=xr.DataArray([1,1],dims=['z']))
Out[26]: # hypothetical
<xarray.DataArray (z: 2)>
array([ 0, 10])
Dimensions without coordinates: z
```

If that's the case, could we instead force all indexes along a dimension to be in a MI, tolerate the short-term constraints of the current MI implementation, and build out additional features where needed?

That would (ideally) leave us uncoupled from MIs: if we built a better in-memory data structure, we could transition. The contract would be around the cases above.

--

...and as mentioned above, these are intended as questions rather than high-confidence views.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		262642978
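
As a rough illustration of the pointwise selection the comment above describes as hypothetical, the same result can be approximated by translating labels to integer positions with pandas and then using xarray's vectorized `isel`. This is only a sketch: `pointwise_sel` is a made-up helper name, not part of xarray, and the MultiIndex is deliberately kept outside the `DataArray` so the example does not rely on how any particular xarray version stores MultiIndex coordinates.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same shape of data as in the comment: 3 rows along 'x', 4 columns along 'y'.
da = xr.DataArray(
    np.arange(12).reshape((3, 4)),
    dims=["x", "y"],
    coords={"x": list("abc")},
)

# The two levels describing positions along 'y', kept as a plain pandas object.
midx = pd.MultiIndex.from_product(
    [list("ab"), [1, 2]], names=["y_level_0", "y_level_1"]
)


def pointwise_sel(da, midx, x_labels, level_0, level_1, new_dim="z"):
    """Hypothetical helper: pick one element per (x, level_0, level_1) triple."""
    # Translate labels into integer positions along each dimension.
    x_pos = pd.Index(da["x"].values).get_indexer(list(x_labels))
    y_pos = midx.get_indexer(list(zip(level_0, level_1)))
    # get_indexer marks missing labels with -1; fail loudly instead of wrapping.
    if (x_pos < 0).any() or (y_pos < 0).any():
        raise KeyError("one or more labels were not found")
    # Indexers sharing a new dimension trigger pointwise (vectorized) indexing.
    return da.isel(
        x=xr.DataArray(x_pos, dims=new_dim),
        y=xr.DataArray(y_pos, dims=new_dim),
    )


result = pointwise_sel(da, midx, ["a", "c"], ["a", "b"], [1, 1])
print(result.values)  # elements at (a, (a, 1)) and (c, (b, 1))
```

The point of the sketch is that the lookup only needs level names and labels, which is consistent with the comment's suggestion that pointwise indexing is an implementation limitation rather than an API one.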