html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/767#issuecomment-247031135,https://api.github.com/repos/pydata/xarray/issues/767,247031135,MDEyOklzc3VlQ29tbWVudDI0NzAzMTEzNQ==,4160723,2016-09-14T14:28:29Z,2016-09-14T14:28:29Z,MEMBER,"Fixed in #802 and #947. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,134359597
https://github.com/pydata/xarray/issues/767#issuecomment-191826844,https://api.github.com/repos/pydata/xarray/issues/767,191826844,MDEyOklzc3VlQ29tbWVudDE5MTgyNjg0NA==,1217238,2016-03-03T16:00:50Z,2016-03-03T16:00:50Z,MEMBER,"If you try doing that indexing with a pandas.Series, you actually get an error message:

```
In [71]: s.loc['bar', slice(4000, 4100.3)]
# ...
/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   1363         # nested tuple slicing
   1364         if is_nested_tuple(key, labels):
-> 1365             locs = labels.get_locs(key)
   1366             indexer = [ slice(None) ] * self.ndim
   1367             indexer[axis] = locs

/Users/shoyer/conda/envs/xarray-dev/lib/python3.5/site-packages/pandas/core/index.py in get_locs(self, tup)
   5692         if not self.is_lexsorted_for_tuple(tup):
   5693             raise KeyError('MultiIndex Slicing requires the index to be fully lexsorted'
-> 5694                            ' tuple len ({0}), lexsort depth ({1})'.format(len(tup), self.lexsort_depth))
   5695
   5696         # indexer

KeyError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (2), lexsort depth (0,)
```

I guess it's also worth investigating `get_locs` as an alternative or companion to `get_loc_level`.
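
A minimal sketch of the sorted case, assuming a Series indexed like the one in this thread (the two `foo` data values and the variable names are made up for illustration). Once the index has been sorted, both `.loc` and `get_locs` should accept the slice:

```python
import pandas as pd

# Index modelled on the one discussed in this thread; the data values
# for the 'foo' band are hypothetical.
idx = pd.MultiIndex.from_tuples(
    [('foo', 4050.2), ('foo', 4050.3),
     ('bar', 4100.1), ('bar', 4100.3), ('bar', 4100.5)],
    names=['band', 'wavenumber'],
)
s = pd.Series([1.8e-4, 1.4e-4, 1.2e-4, 1.0e-4, 8.5e-5], index=idx)

# Sort first so that the MultiIndex is fully lexsorted.
s_sorted = s.sort_index()

# Label-based selection: band 'bar' and a wavenumber range.
subset = s_sorted.loc[('bar', slice(4000, 4100.3))]

# get_locs returns the matching integer positions in the sorted index.
locs = s_sorted.index.get_locs(('bar', slice(4000, 4100.3)))

print(subset)
print(locs)
```

Note that the selection keeps the full MultiIndex, so dropping the `band` level would still be a separate step.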
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,134359597
https://github.com/pydata/xarray/issues/767#issuecomment-191625512,https://api.github.com/repos/pydata/xarray/issues/767,191625512,MDEyOklzc3VlQ29tbWVudDE5MTYyNTUxMg==,4160723,2016-03-03T07:24:34Z,2016-03-03T07:24:34Z,MEMBER,"From this point of view I agree that `da.sel(band_wavenumber={'band': 'bar'})` is a nicer solution!

I'll follow your suggestion of returning a new `pandas.Index` object from `convert_label_indexer`. Unless I'm missing a better solution, we can use `pandas.MultiIndex.get_loc_level` to get both the indexer and the new `pandas.Index` object.

However, there may still be some advanced cases where it won't behave as expected. For example, selecting both the band 'bar' and a range of wavenumber values (that doesn't exactly match the range of that band)

```
da.sel(band_wavenumber={'band': 'bar', 'wavenumber': slice(4000, 4100.3)})
```

will a priori return a stacked `DataArray` with the full multi-index:

```
In [32]: idx = da.band_wavenumber.to_index()

In [33]: idx.get_loc_level(('bar', slice(4000, 4100.3)), level=('band', 'wavenumber'))
Out[33]:
(array([False, False, True, True, False], dtype=bool),
 MultiIndex(levels=[['bar', 'foo'], [4050.2, 4050.3, 4100.1, 4100.3, 4100.5]],
            labels=[[0, 0], [2, 3]],
            names=['band', 'wavenumber']))
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,134359597
https://github.com/pydata/xarray/issues/767#issuecomment-191315935,https://api.github.com/repos/pydata/xarray/issues/767,191315935,MDEyOklzc3VlQ29tbWVudDE5MTMxNTkzNQ==,1217238,2016-03-02T16:32:43Z,2016-03-02T16:32:43Z,MEMBER,"The good news about writing our own custom way to select levels is that because we can avoid the stack/unstack, we can simply omit unused levels without worrying about doing `dropna` with `unstack`.
So as long as we are implementing this in our own method (e.g., `sel` or `xs`), we can default to `drop_level=True`.

I would be OK with `xs`, but `da.xs('bar', dim='band_wavenumber', level='band')` feels much more verbose to me than `da.sel(band_wavenumber={'band': 'bar'})`. The latter solution involves inventing no new API, and because dictionaries are not hashable there's no potential conflict with existing functionality.

Last year at the SciPy conference sprints, @jonathanrocher was working on adding similar dictionary support into `.loc` in pandas (i.e., `da.loc[{'band': 'band'}]`). I don't think he ever finished up that PR, but he might have a branch worth looking at as a starting point.

> I think that this solution is better than, e.g., directly providing index level names as arguments of the sel method. This may be confusing and there may be conflict when different dimensions have the same index level names.

This is a fair point, but such scenarios are unlikely to appear in practice. We might be able to, for example, update our handling of MultiIndexes to guarantee that level names cannot conflict with other variables. This might be done by inserting dummy variables of some sort into the `_coords` dict whenever a MultiIndex is added. It would take some work to ensure this works smoothly, though.

> Besides this, it would be nice if the drop_level=True behavior could be applied by default to any selection (i.e., also when using loc, sel, etc.), like with Pandas. I don't know how Pandas does this (I'll look into that), but at first glance this would here imply checking for each dimension if it has a multi-index and then checking the labels for each index level.

Yes, agreed. Unfortunately the pandas code that handles this is a complete mess of spaghetti code (see pandas/core/indexing.py). So you are welcome to try decoding it, but in my opinion you might be better off starting from scratch.
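
As a possible starting point, pandas exposes part of this behaviour through `MultiIndex.get_loc_level`, which returns an indexer together with an index from which the selected level has been dropped. A minimal sketch, assuming an index shaped like the one in this issue (the `foo` wavenumbers and the variable names are only illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical stacked index with two levels, as discussed in this issue.
idx = pd.MultiIndex.from_tuples(
    [('foo', 4050.2), ('foo', 4050.3),
     ('bar', 4100.1), ('bar', 4100.3), ('bar', 4100.5)],
    names=['band', 'wavenumber'],
)

# Select a single label on one level; the 'band' level is dropped
# from the returned index.
indexer, new_index = idx.get_loc_level('bar', level='band')

# The indexer may be a boolean mask or a slice, depending on the index;
# both can be applied to a positional array.
positions = np.arange(len(idx))[indexer]

print(positions)
print(new_index)
```

The returned `new_index` is the kind of replacement index that a label indexer could hand back alongside the positions.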
In xarray, the function [convert_label_indexer](https://github.com/pydata/xarray/blob/660d29d33cfc647324543e338d0b03ce7be1383b/xarray/core/indexing.py#L138) would need an updated interface that allows it to possibly return a new `pandas.Index` object to replace the existing index.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,134359597
https://github.com/pydata/xarray/issues/767#issuecomment-191303962,https://api.github.com/repos/pydata/xarray/issues/767,191303962,MDEyOklzc3VlQ29tbWVudDE5MTMwMzk2Mg==,4160723,2016-03-02T16:05:31Z,2016-03-02T16:05:31Z,MEMBER,"OK, I've read the [discussion](https://github.com/pydata/xarray/pull/702#issuecomment-171407924) you referred to more carefully, and now I understand why it is preferable to call `dropna` explicitly. My last suggestion above is not compatible with this.

The `xs` method (not sure about the name) may still provide a concise way to perform a selection with explicit unstack and dropna. Maybe it is more appropriate to use `dropna` instead of `drop_level`:

``` python
da.xs('bar', dim='band_wavenumber', level='band', dropna=True)
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,134359597
https://github.com/pydata/xarray/issues/767#issuecomment-191239041,https://api.github.com/repos/pydata/xarray/issues/767,191239041,MDEyOklzc3VlQ29tbWVudDE5MTIzOTA0MQ==,4160723,2016-03-02T13:31:36Z,2016-03-02T13:31:36Z,MEMBER,"Thinking about this issue, I'd like to know what you think of the suggestions below before considering any pull request.
The following line of code gives the same result as in my previous comment, but it is more explicit and shorter:

``` python
da.unstack('band_wavenumber').sel(band='bar').dropna('wavenumber', how='any')
```

A nice shortcut to this would be adding a new `xs` method to `DataArray` and `Dataset`, which would be quite similar to the `xs` method of Pandas but here with an additional `dim` keyword argument:

``` python
da.xs('bar', dim='band_wavenumber', level='band', drop_level=True)
```

Like Pandas, the default value of `drop_level` would be `True`. But here `drop_level` rather sets whether or not to apply `dropna` to all (unstacked) index levels of `dim` except the specified `level`.

I think that this solution is better than, e.g., directly providing index level names as arguments of the `sel` method. This may be confusing and there may be conflict when different dimensions have the same index level names.

Another, though less elegant, solution would be to provide dictionaries to the `sel` method:

``` python
da.sel(band_wavenumber={'band': 'bar'})
```

Besides this, it would be nice if the `drop_level=True` behavior could be applied by default to any selection (i.e., also when using `loc`, `sel`, etc.), like with Pandas. I don't know how Pandas does this (I'll look into that), but at first glance this would here imply checking for each dimension if it has a multi-index and then checking the labels for each index level.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,134359597
https://github.com/pydata/xarray/issues/767#issuecomment-185826486,https://api.github.com/repos/pydata/xarray/issues/767,185826486,MDEyOklzc3VlQ29tbWVudDE4NTgyNjQ4Ng==,4160723,2016-02-18T17:31:29Z,2016-02-18T17:31:29Z,MEMBER,"Thanks for the tip.
So I finally obtain the desired result when selecting the band 'bar' by doing this: ``` In [21]: (da.sel(band_wavenumber='bar') ...: .unstack('band_wavenumber') ...: .dropna('band', how='all') ...: .dropna('wavenumber', how='any') ...: .sel(band='bar')) Out[21]: array([ 1.20000000e-04, 1.00000000e-04, 8.50000000e-05]) Coordinates: band