html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/802#issuecomment-233596724,https://api.github.com/repos/pydata/xarray/issues/802,233596724,MDEyOklzc3VlQ29tbWVudDIzMzU5NjcyNA==,4160723,2016-07-19T10:48:56Z,2016-07-19T10:48:56Z,MEMBER,"Thanks again @shoyer for your reviews and comments!
Merging this PR broke the tests added in #881. I fixed that in #903.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-233504589,https://api.github.com/repos/pydata/xarray/issues/802,233504589,MDEyOklzc3VlQ29tbWVudDIzMzUwNDU4OQ==,1217238,2016-07-19T01:15:34Z,2016-07-19T01:15:34Z,MEMBER,"@tippetts Thanks for checking in. I actually hadn't noticed the last round of changes here, but I think this is good to go in now! Big thanks to @benbovy!!!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-233482724,https://api.github.com/repos/pydata/xarray/issues/802,233482724,MDEyOklzc3VlQ29tbWVudDIzMzQ4MjcyNA==,17055041,2016-07-18T22:50:50Z,2016-07-18T22:50:50Z,NONE,"Mind if I ask if this will get merged into master? It looks like a lot of work went into the pull request, and the discussion + passed checks lead me to believe it could be close to going in. Is there anything a third party can do to push it across the finish line?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-225330535,https://api.github.com/repos/pydata/xarray/issues/802,225330535,MDEyOklzc3VlQ29tbWVudDIyNTMzMDUzNQ==,1217238,2016-06-11T01:57:10Z,2016-06-11T01:57:10Z,MEMBER,"I did a little bit of testing. Here is one case I found where things don't work as I expected:
```
In [32]: idx = pd.MultiIndex.from_product([['a', 'b'], [0, 1, 2]], names=['lev0', 'lev1'])
In [33]: data = xr.DataArray(range(6), [('x', idx)])
In [34]: data.sel(x=('a',))
Out[34]:
array([0, 1, 2])
Coordinates:
* lev1 (lev1) int64 0 1 2
In [35]: data.sel(x='a')
Out[35]:
array([0, 1, 2])
Coordinates:
* x (x) object ('a', 0) ('a', 1) ('a', 2)
```
I would expect an index drop in the last case, too. I guess we need to check for scalars.
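A minimal sketch of what such a scalar check could look like, in plain Python. The helper name `normalize_level_key` is hypothetical (it is not an xarray or pandas function); it only illustrates wrapping a scalar label in a 1-tuple so that `x='a'` and `x=('a',)` could share one code path.

```python
def normalize_level_key(label):
    # Hypothetical helper, not part of xarray or pandas.
    # Containers and slices are passed through unchanged.
    if isinstance(label, (tuple, list, slice, dict)):
        return label
    # Anything else (str, int, float, ...) is treated as a scalar label
    # for the first level and wrapped in a 1-tuple.
    return (label,)


# Both spellings end up as the same key.
assert normalize_level_key('a') == normalize_level_key(('a',))
```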
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-225236223,https://api.github.com/repos/pydata/xarray/issues/802,225236223,MDEyOklzc3VlQ29tbWVudDIyNTIzNjIyMw==,4160723,2016-06-10T16:52:44Z,2016-06-10T16:52:44Z,MEMBER,"Thanks for your second review @shoyer ! It actually helped me to better understand the indexing logic of pandas (never too late)!
I made some updates according to your comments.
I think we're getting closer to a working feature!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-224347054,https://api.github.com/repos/pydata/xarray/issues/802,224347054,MDEyOklzc3VlQ29tbWVudDIyNDM0NzA1NA==,1217238,2016-06-07T17:05:11Z,2016-06-07T17:05:11Z,MEMBER,"Just went through and gave another full review -- this is looking quite nice, just a few more things to clean up!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-224327201,https://api.github.com/repos/pydata/xarray/issues/802,224327201,MDEyOklzc3VlQ29tbWVudDIyNDMyNzIwMQ==,4160723,2016-06-07T15:58:49Z,2016-06-07T15:58:49Z,MEMBER,"Noted the difference in documentation.
Hope that the whole ""multi-level indexing"" section is clear enough, not sure if it's proper English.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-224106148,https://api.github.com/repos/pydata/xarray/issues/802,224106148,MDEyOklzc3VlQ29tbWVudDIyNDEwNjE0OA==,1217238,2016-06-06T22:18:28Z,2016-06-06T22:18:28Z,MEMBER,"I don't think we need to handle it -- I'm definitely supportive of only supporting unambiguous syntax. Let's just note the difference from pandas.
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-224104390,https://api.github.com/repos/pydata/xarray/issues/802,224104390,MDEyOklzc3VlQ29tbWVudDIyNDEwNDM5MA==,4160723,2016-06-06T22:11:09Z,2016-06-06T22:11:09Z,MEMBER,"That behavior looks good enough to me too! It is indeed already quite flexible :-).
Do you think that we really need to handle `data.loc['a', 0]` if `data` has two or more dimensions? We would also have to handle the case `data.loc[{'one': 'a', 'two': 0}]`...
Having to explicitly write `data.loc[('a', 0), :]` or `data.loc[('a', 0), ...]` seems fine to me (at least for now).
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-224093911,https://api.github.com/repos/pydata/xarray/issues/802,224093911,MDEyOklzc3VlQ29tbWVudDIyNDA5MzkxMQ==,1217238,2016-06-06T21:27:59Z,2016-06-06T21:27:59Z,MEMBER,"Yeah, that behavior looks pretty good to me -- there are limits on how flexible you can make things :).
We are fortunate that we have `.sel`, which lets us side-step some tricky ambiguous cases. Still, how do we handle `data.loc['a', 0]` if `data` is 2-dimensional? Pandas uses some terrible fall-back logic to allow the `0` to index either the second level or the second dimension, depending upon which works.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-224092354,https://api.github.com/repos/pydata/xarray/issues/802,224092354,MDEyOklzc3VlQ29tbWVudDIyNDA5MjM1NA==,4160723,2016-06-06T21:21:29Z,2016-06-06T21:21:29Z,MEMBER,"```
>>> index = pd.MultiIndex.from_product((list('abc'), range(3)))
>>> index.names = ('one', 'two')
>>> data = xr.DataArray(range(9), [('x', index)])
>>> data
array([0, 1, 2, 3, 4, 5, 6, 7, 8])
Coordinates:
* x (x) object ('a', 0) ('a', 1) ('a', 2) ('b', 0) ('b', 1) ...
```
Example using a tuple of scalar/tuple/list/slice labels for each level:
```
>>> data.sel(x=('a', (0, 1)))
array([0, 1])
Coordinates:
* x (x) object ('a', 0) ('a', 1)
```
Example using a list of several combinations of multi-index labels:
```
>>> data.sel(x=[('a', 0), ('b', 1), ('c', 1)])
array([0, 4, 7])
Coordinates:
* x (x) object ('a', 0) ('b', 1) ('c', 1)
```
All the following examples raise an error:
```
>>> data.sel(x=(('a', 0), ('b', 1), ('c', 1)))
>>> data.sel(x=['a', (0, 1)])
>>> data.sel(x=[['a', 0], ['b', 1], ['c', 1]])
```
Note that after verification, pandas has the same behavior for all of the examples shown here.
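To make the tuple vs. list distinction concrete, here is a toy model in plain Python. It is only a sketch of the semantics shown above, not how xarray or pandas implement them: the multi-index is replaced by a plain list of label tuples, and the helper names `sel_per_level` and `sel_pairs` are made up for illustration.

```python
from itertools import product

# Toy stand-in for the multi-index: a list of (one, two) label tuples.
index = list(product('abc', range(3)))


def sel_per_level(index, key):
    # Tuple key: one entry per level, each a scalar or a tuple/list of labels.
    def matches(value, wanted):
        if isinstance(wanted, (tuple, list)):
            return value in wanted
        return value == wanted

    return [pos for pos, labels in enumerate(index)
            if all(matches(v, w) for v, w in zip(labels, key))]


def sel_pairs(index, keys):
    # List key: each entry is a complete label tuple to look up.
    return [index.index(k) for k in keys]


print(sel_per_level(index, ('a', (0, 1))))             # [0, 1]
print(sel_pairs(index, [('a', 0), ('b', 1), ('c', 1)]))  # [0, 4, 7]
```

The two helpers return positions, which match the values of `data` in the examples above because `data` is `range(9)`.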
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-224047760,https://api.github.com/repos/pydata/xarray/issues/802,224047760,MDEyOklzc3VlQ29tbWVudDIyNDA0Nzc2MA==,1217238,2016-06-06T18:35:27Z,2016-06-06T18:35:27Z,MEMBER,"> A possible remaining issue is that using a tuple of tuples, lists or slices behaves exactly like pandas (it calls get_locs) while a nested list can still be used to select couples of multi-index labels (it calls get_indexer). Should we allow only one of the two behaviors, or keep both (and mention it in the docs)?
Can you give a quick example of what using a nested list looks like?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-223921363,https://api.github.com/repos/pydata/xarray/issues/802,223921363,MDEyOklzc3VlQ29tbWVudDIyMzkyMTM2Mw==,4160723,2016-06-06T10:22:49Z,2016-06-06T10:22:49Z,MEMBER,"Thanks for the review @shoyer !
I made some updates according to your comments.
A possible remaining issue is that using a tuple of tuples, lists or slices behaves exactly like pandas (it calls `get_locs`) while a nested list can still be used to select couples of multi-index labels (it calls `get_indexer`). Should we allow only one of the two behaviors, or keep both (and mention it in the docs)?
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-222544618,https://api.github.com/repos/pydata/xarray/issues/802,222544618,MDEyOklzc3VlQ29tbWVudDIyMjU0NDYxOA==,4160723,2016-05-30T19:30:03Z,2016-05-30T19:30:03Z,MEMBER,"I finally managed to add some tests and docs.
Two more comments:
- Index replacement: if the multi-index has unnamed levels, the chosen name for the replaced index (if any) is `<old_dim>_unnamed_level` (`old_dim` is the name of the replaced multi-index dimension).
Not sure whether or not it is really necessary to track the index level number. Single indexes returned by `get_loc_level` don't keep that information. Anyway, it is always better to set explicit names for the index levels.
- Indexing on multi-index now also works with nested tuples (#850).
@shoyer I think that it is ready for review. I successfully ran the tests on my local branch. The CI tests currently seem to be broken for a reason I don't know.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-206445614,https://api.github.com/repos/pydata/xarray/issues/802,206445614,MDEyOklzc3VlQ29tbWVudDIwNjQ0NTYxNA==,4160723,2016-04-06T16:12:44Z,2016-04-06T16:12:44Z,MEMBER,"I hope to have some time next week to work again on this PR.
@tippetts You can see in #719 a few comments about saving/reading xarray data objects with multi-index to/from netCDF. I am also looking forward to seeing this feature implemented - actually I need it for another project that uses xarray - so maybe I'll find some time in the next couple of weeks to start a new PR on this.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-206410319,https://api.github.com/repos/pydata/xarray/issues/802,206410319,MDEyOklzc3VlQ29tbWVudDIwNjQxMDMxOQ==,17055041,2016-04-06T14:45:17Z,2016-04-06T14:45:17Z,NONE,"This will be a great feature. I for one am really looking forward to using it.
Will this work also allow saving to/reading from hdf5 and netcdf files with a MultiIndex? If not, can you give a sketch outline of the approach you (Stephan or Benoit) would take? I assume it would involve saving the information about the MultiIndex structure in some transformed way that fits into an hdf5 file, then reconstructing it on the read. I might need to hack together something for that before MultiIndex serialization makes it into xarray, but I'd like to make sure I don't veer too far off from the real solution that will ultimately come out.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-201389147,https://api.github.com/repos/pydata/xarray/issues/802,201389147,MDEyOklzc3VlQ29tbWVudDIwMTM4OTE0Nw==,4160723,2016-03-25T17:56:51Z,2016-03-25T17:56:51Z,MEMBER,"Unless you see any other issues, I think that this feature doesn't need more development for now.
I'll be back next week to finish this PR (write some tests and doc).
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-201378550,https://api.github.com/repos/pydata/xarray/issues/802,201378550,MDEyOklzc3VlQ29tbWVudDIwMTM3ODU1MA==,1217238,2016-03-25T17:30:33Z,2016-03-25T17:30:33Z,MEMBER,"Dictionaries are not hashable, so we might be able to detect the `da.loc[{'band': 'foo'}]` case by calling `hash()` and raising a `KeyError` with alternative suggestions. Just an idea -- not sure if there's a clean way to implement that.
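A rough sketch of the `hash()` idea, in plain Python. The helper `lookup_label` and the plain-dict `table` are hypothetical stand-ins (nothing here is xarray code); the only suggestion in the error message is the nested-dict form that already works in this PR.

```python
def lookup_label(table, label):
    # Hypothetical helper: `table` is a plain dict standing in for an index.
    try:
        hash(label)
    except TypeError:
        # Unhashable labels (e.g. a dict) can never be index keys, so fail
        # early with a hint instead of a bare lookup error.
        raise KeyError(
            'cannot use unhashable label %r directly; to select on an index '
            'level, try da.loc[{dim: {level: label}}]' % (label,)
        ) from None
    return table[label]


table = {'foo': 1}
print(lookup_label(table, 'foo'))  # 1
```

Calling `lookup_label(table, {'band': 'foo'})` raises the `KeyError` with the hint.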
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-201370448,https://api.github.com/repos/pydata/xarray/issues/802,201370448,MDEyOklzc3VlQ29tbWVudDIwMTM3MDQ0OA==,4160723,2016-03-25T17:10:40Z,2016-03-25T17:10:40Z,MEMBER,"After refactoring a bit, `da.loc[{'band': 'foo'}, :]` and `da.loc[{'band': 'foo'}, ...]` now work as expected.
`da.loc[{'band': 'foo'}]` still raises a `KeyError`. Although it is possible to catch this exception, it still remains ambiguous so it is probably better to do nothing.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-201340940,https://api.github.com/repos/pydata/xarray/issues/802,201340940,MDEyOklzc3VlQ29tbWVudDIwMTM0MDk0MA==,1217238,2016-03-25T15:51:02Z,2016-03-25T15:51:02Z,MEMBER,"> Multi-index level drop only works when a dict is provided for a dimension, i.e., da.sel(band_wavenumber='foo') still returns the full pandas.MultiIndex. Changing this would require indexing.convert_label_indexer to also return an updated index when label is not dict-like, which I don't think is that straightforward.
Indeed. This would require another data structure somewhere keeping track of level names -- and ideally also ensuring that they are always unique (like dimensions). This seems fine to me for now.
> There is a potential conflict when providing a dictionary to DataArray.loc, it may be interpreted either as a mapping of dimensions / labels or a mapping of index levels / labels for the 1st dimension. For now, the second option is not handled, e.g., da.loc[{'band': 'foo'}] raises a KeyError.
I agree -- better to require the user to be explicit. I also don't see many use cases for specifying the coordinate value and level name but not the dimension name. What happens if you type `da.loc[{'band': 'foo'}, :]` or `da.loc[{'band': 'foo'}, ...]`? In principle these are not ambiguous, but I could only guess at what happens without looking at the code.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-201302454,https://api.github.com/repos/pydata/xarray/issues/802,201302454,MDEyOklzc3VlQ29tbWVudDIwMTMwMjQ1NA==,4160723,2016-03-25T14:17:29Z,2016-03-25T14:17:29Z,MEMBER,"I followed your suggestions.
Two more comments (not critical issues I think) :
- Multi-index level drop only works when a dict is provided for a dimension, i.e., `da.sel(band_wavenumber='foo')` still returns the full `pandas.MultiIndex`. Changing this would require `indexing.convert_label_indexer` to also return an updated index when `label` is not dict-like, which I don't think is that straightforward.
- There is a potential conflict when providing a dictionary to `DataArray.loc`, it may be interpreted either as a mapping of dimensions / labels or a mapping of index levels / labels for the 1st dimension. For now, the second option is not handled, e.g., `da.loc[{'band': 'foo'}]` raises a `KeyError`.
In summary, `da.loc[{'band_wavenumber': {'band': 'foo'}}]` returns a DataArray with updated index and coordinate / dimension name, `da.loc['foo']` returns a DataArray with the full multi-index, and `da.loc[{'band': 'foo'}]` raises a `KeyError`. Three different results for the same operation, although I think that the first one is more explicit and better.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649
https://github.com/pydata/xarray/pull/802#issuecomment-201125233,https://api.github.com/repos/pydata/xarray/issues/802,201125233,MDEyOklzc3VlQ29tbWVudDIwMTEyNTIzMw==,1217238,2016-03-25T03:53:53Z,2016-03-25T03:53:53Z,MEMBER,"This looks very nice! I would probably opt for making `drop_level=True` the default for `.loc` -- it's hard to see when that wouldn't be desirable. In fact, I'm not even sure that it's worth having the option to set `drop_level=False`.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,143264649