pull_requests: 80229493
id | node_id | number | state | locked | title | user | body | created_at | updated_at | closed_at | merged_at | merge_commit_sha | assignee | milestone | draft | head | base | author_association | auto_merge | repo | url | merged_by |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
80229493 | MDExOlB1bGxSZXF1ZXN0ODAyMjk0OTM= | 947 | closed | 0 | Multi-index levels as coordinates | 4160723 | Implements 2, 4 and 5 in #719. Demo: ``` In [1]: import numpy as np In [2]: import pandas as pd In [3]: import xarray as xr In [4]: index = pd.MultiIndex.from_product((list('ab'), range(2)), ...: names= ('level_1', 'level_2')) In [5]: da = xr.DataArray(np.random.rand(4, 4), coords={'x': index}, ...: dims=('x', 'y'), name='test') In [6]: da Out[6]: <xarray.DataArray 'test' (x: 4, y: 4)> array([[ 0.15036153, 0.68974802, 0.40082234, 0.94451318], [ 0.26732938, 0.49598123, 0.8679231 , 0.6149102 ], [ 0.3313594 , 0.93857424, 0.73023367, 0.44069622], [ 0.81304837, 0.81244159, 0.37274953, 0.86405196]]) Coordinates: * level_1 (x) object 'a' 'a' 'b' 'b' * level_2 (x) int64 0 1 0 1 * y (y) int64 0 1 2 3 In [7]: da['level_1'] Out[7]: <xarray.DataArray 'level_1' (x: 4)> array(['a', 'a', 'b', 'b'], dtype=object) Coordinates: * level_1 (x) object 'a' 'a' 'b' 'b' * level_2 (x) int64 0 1 0 1 In [8]: da.sel(x='a', level_2=1) Out[8]: <xarray.DataArray 'test' (y: 4)> array([ 0.26732938, 0.49598123, 0.8679231 , 0.6149102 ]) Coordinates: x object ('a', 1) * y (y) int64 0 1 2 3 In [9]: da.sel(level_2=1) Out[9]: <xarray.DataArray 'test' (level_1: 2, y: 4)> array([[ 0.26732938, 0.49598123, 0.8679231 , 0.6149102 ], [ 0.81304837, 0.81244159, 0.37274953, 0.86405196]]) Coordinates: * level_1 (level_1) object 'a' 'b' * y (y) int64 0 1 2 3 ``` Some notes about the implementation: - I slightly modified `Coordinate` so that it allows setting different values for the names of the coordinate and its dimension. There is no breaking change. - I also added a `Coordinate.get_level_coords` method to get independent, single-index coordinates objects from a MultiIndex coordinate. Remaining issues: - `Coordinate.get_level_coords` calls `pandas.MultiIndex.get_level_values` for each level and is itself called each time when indexing and for repr. This can be very costly!! 
It would be nice to return some kind of lazy index object instead of computing the actual level values. - repr replaces a MultiIndex coordinate with its level coordinates. That can be confusing in some cases (see below). Maybe we can set a different marker than `*` for level coordinates. ``` In [6]: [name for name in da.coords] Out[6]: ['x', 'y'] In [7]: da.coords.keys() Out[7]: KeysView(Coordinates: * level_1 (x) object 'a' 'a' 'b' 'b' * level_2 (x) int64 0 1 0 1 * y (y) int64 0 1 2 3) ``` - `DataArray.level_1` doesn't return another `DataArray` object: ``` In [10]: da.level_1 Out[10]: <xarray.Coordinate 'level_1' (x: 4)> array(['a', 'a', 'b', 'b'], dtype=object) ``` - Maybe we need to test the uniqueness of level names at `DataArray` or `Dataset` creation. Of course, this still needs proper tests and docs... | 2016-08-05T11:34:49Z | 2016-09-14T15:25:28Z | 2016-09-14T03:34:51Z | 2016-09-14T03:34:51Z | 41654ef5e9da8cd15f3b68f8384f8c45c7fc16e9 | 0 | a447767e8d611d945dc864910a427ef7e3f4db11 | 3ecfa66613aaefdea8beb15edbd392b9f9d815c6 | MEMBER | 13221727 | https://github.com/pydata/xarray/pull/947 |
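
The PR body's first remaining issue (level values recomputed on every indexing call and every repr) could be approached by materializing each level only once. The following is a minimal sketch of that idea, not the PR's implementation: `LazyLevels`, its `computations` counter, and the use of plain tuples in place of a `pandas.MultiIndex` are all hypothetical and chosen only to keep the example standard-library only.

```python
class LazyLevels:
    """Hypothetical stand-in for a MultiIndex coordinate; not xarray API."""

    def __init__(self, tuples, names):
        self._tuples = list(tuples)
        self.names = tuple(names)
        self._cache = {}
        self.computations = 0  # counts how often a level is materialized

    def get_level_values(self, name):
        # Build the values for one level on first access, then reuse them.
        if name not in self._cache:
            pos = self.names.index(name)
            self._cache[name] = [t[pos] for t in self._tuples]
            self.computations += 1
        return self._cache[name]


# Same product index as in the PR demo: ('a', 0), ('a', 1), ('b', 0), ('b', 1)
index = LazyLevels(
    [(a, b) for a in "ab" for b in range(2)],
    names=("level_1", "level_2"),
)
first = index.get_level_values("level_1")
second = index.get_level_values("level_1")  # served from the cache
```

Caching per level trades a little memory for avoiding repeated work; a real implementation would also need to invalidate or rebuild the cache whenever the underlying index changes.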