pull_requests: 80229493

id: 80229493
node_id: MDExOlB1bGxSZXF1ZXN0ODAyMjk0OTM=
number: 947
state: closed
locked: 0
title: Multi-index levels as coordinates
user: 4160723
created_at: 2016-08-05T11:34:49Z
updated_at: 2016-09-14T15:25:28Z
closed_at: 2016-09-14T03:34:51Z
merged_at: 2016-09-14T03:34:51Z
merge_commit_sha: 41654ef5e9da8cd15f3b68f8384f8c45c7fc16e9
assignee:
milestone:
draft: 0
head: a447767e8d611d945dc864910a427ef7e3f4db11
base: 3ecfa66613aaefdea8beb15edbd392b9f9d815c6
author_association: MEMBER
auto_merge:
repo: 13221727
url: https://github.com/pydata/xarray/pull/947
merged_by:

body:

Implements 2, 4 and 5 in #719.

Demo:

```
In [1]: import numpy as np

In [2]: import pandas as pd

In [3]: import xarray as xr

In [4]: index = pd.MultiIndex.from_product((list('ab'), range(2)),
   ...:                                    names=('level_1', 'level_2'))

In [5]: da = xr.DataArray(np.random.rand(4, 4), coords={'x': index},
   ...:                   dims=('x', 'y'), name='test')

In [6]: da
Out[6]:
<xarray.DataArray 'test' (x: 4, y: 4)>
array([[ 0.15036153,  0.68974802,  0.40082234,  0.94451318],
       [ 0.26732938,  0.49598123,  0.8679231 ,  0.6149102 ],
       [ 0.3313594 ,  0.93857424,  0.73023367,  0.44069622],
       [ 0.81304837,  0.81244159,  0.37274953,  0.86405196]])
Coordinates:
  * level_1  (x) object 'a' 'a' 'b' 'b'
  * level_2  (x) int64 0 1 0 1
  * y        (y) int64 0 1 2 3

In [7]: da['level_1']
Out[7]:
<xarray.DataArray 'level_1' (x: 4)>
array(['a', 'a', 'b', 'b'], dtype=object)
Coordinates:
  * level_1  (x) object 'a' 'a' 'b' 'b'
  * level_2  (x) int64 0 1 0 1

In [8]: da.sel(x='a', level_2=1)
Out[8]:
<xarray.DataArray 'test' (y: 4)>
array([ 0.26732938,  0.49598123,  0.8679231 ,  0.6149102 ])
Coordinates:
    x        object ('a', 1)
  * y        (y) int64 0 1 2 3

In [9]: da.sel(level_2=1)
Out[9]:
<xarray.DataArray 'test' (level_1: 2, y: 4)>
array([[ 0.26732938,  0.49598123,  0.8679231 ,  0.6149102 ],
       [ 0.81304837,  0.81244159,  0.37274953,  0.86405196]])
Coordinates:
  * level_1  (level_1) object 'a' 'b'
  * y        (y) int64 0 1 2 3
```

Some notes about the implementation:

- I slightly modified `Coordinate` so that it allows setting different values for the names of the coordinate and its dimension. There is no breaking change.
- I also added a `Coordinate.get_level_coords` method to get independent, single-index coordinate objects from a MultiIndex coordinate.

Remaining issues:

- `Coordinate.get_level_coords` calls `pandas.MultiIndex.get_level_values` for each level and is itself called on every indexing operation and for every repr. This can be very costly! It would be nice to return some kind of lazy index object instead of computing the actual level values.
- The repr replaces a MultiIndex coordinate with its level coordinates. That can be confusing in some cases (see below). Maybe we can set a different marker than `*` for level coordinates.

```
In [6]: [name for name in da.coords]
Out[6]: ['x', 'y']

In [7]: da.coords.keys()
Out[7]:
KeysView(Coordinates:
  * level_1  (x) object 'a' 'a' 'b' 'b'
  * level_2  (x) int64 0 1 0 1
  * y        (y) int64 0 1 2 3)
```

- `DataArray.level_1` doesn't return another `DataArray` object:

```
In [10]: da.level_1
Out[10]:
<xarray.Coordinate 'level_1' (x: 4)>
array(['a', 'a', 'b', 'b'], dtype=object)
```

- Maybe we need to test the uniqueness of level names at `DataArray` or `Dataset` creation.

Of course, this still needs proper tests and docs...
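
Two of the remaining issues in the PR description can be illustrated with plain pandas. The sketch below is not the code from this PR: `level_values` and `find_name_conflicts` are hypothetical helpers introduced only for illustration. The first materializes one flat index per level with `pandas.MultiIndex.get_level_values`, which is the per-level work the description flags as costly when repeated on every indexing operation and repr. The second shows one possible shape for a level-name uniqueness check, assuming the other coordinate and dimension names are available as a list.

```python
import pandas as pd

# Same MultiIndex as in the demo from the PR description.
index = pd.MultiIndex.from_product(
    (list("ab"), range(2)), names=("level_1", "level_2")
)


def level_values(index):
    """Hypothetical helper: build one flat Index per level.

    Each call to get_level_values allocates a full-length Index,
    so repeating this for every level on every access adds up.
    """
    return {name: index.get_level_values(name) for name in index.names}


def find_name_conflicts(index, other_names):
    """Hypothetical check: return level names that are duplicated
    or that clash with already-used coordinate/dimension names."""
    seen = set(other_names)
    conflicts = []
    for name in index.names:
        if name in seen:
            conflicts.append(name)
        seen.add(name)
    return conflicts


levels = level_values(index)
print(levels["level_1"].tolist())  # ['a', 'a', 'b', 'b']
print(levels["level_2"].tolist())  # [0, 1, 0, 1]

print(find_name_conflicts(index, ["x", "y"]))        # []
print(find_name_conflicts(index, ["y", "level_2"]))  # ['level_2']
```

A lazy alternative, as the description suggests, would defer the `get_level_values` call until the values are actually read; how that is best done inside xarray is left open here.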
