issues: 124700322

id: 124700322
node_id: MDExOlB1bGxSZXF1ZXN0NTQ5NDUxNzE=
number: 702
title: Basic multiIndex support and stack/unstack methods
user: 1217238
state: closed
locked: 0
comments: 13
created_at: 2016-01-04T05:48:49Z
updated_at: 2016-06-01T16:48:54Z
closed_at: 2016-01-18T00:11:11Z
author_association: MEMBER
draft: 0
pull_request: pydata/xarray/pulls/702
body:

Fixes #164, #700

Example usage:

```
In [3]: df = pd.DataFrame({'foo': range(3),
   ...:                    'x': ['a', 'b', 'b'],
   ...:                    'y': [0, 0, 1]})
   ...:

In [4]: s = df.set_index(['x', 'y'])['foo']

In [5]: arr = xray.DataArray(s, dims='z')

In [6]: arr
Out[6]:
<xray.DataArray 'foo' (z: 3)>
array([0, 1, 2])
Coordinates:
  * z        (z) object ('a', 0) ('b', 0) ('b', 1)

In [7]: arr.indexes['z']
Out[7]:
MultiIndex(levels=[[u'a', u'b'], [0, 1]],
           labels=[[0, 1, 1], [0, 0, 1]],
           names=[u'x', u'y'])

In [8]: arr.unstack('z')
Out[8]:
<xray.DataArray 'foo' (x: 2, y: 2)>
array([[  0.,  nan],
       [  1.,   2.]])
Coordinates:
  * x        (x) object 'a' 'b'
  * y        (y) int64 0 1

In [9]: arr.unstack('z').stack(z=('x', 'y'))
Out[9]:
<xray.DataArray 'foo' (z: 4)>
array([  0.,  nan,   1.,   2.])
Coordinates:
  * z        (z) object ('a', 0) ('a', 1) ('b', 0) ('b', 1)
```
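For readers who want to try the round trip as a script rather than an IPython session, here is a minimal sketch. It assumes a current xarray release, where the package is imported as `xarray` rather than `xray`, and it builds the array with `DataArray.from_series` instead of the constructor call used in the session, so the starting point is already the two-dimensional form.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same toy data as the session above: a Series indexed by the levels ('x', 'y').
df = pd.DataFrame({"foo": range(3), "x": ["a", "b", "b"], "y": [0, 0, 1]})
s = df.set_index(["x", "y"])["foo"]

# from_series expands each MultiIndex level into its own dimension and
# fills the missing ('a', 1) combination with NaN.
grid = xr.DataArray.from_series(s)

# stack() collapses the 'x' and 'y' dimensions into one MultiIndexed 'z'.
stacked = grid.stack(z=("x", "y"))

# unstack() reverses the operation and recovers the two-dimensional array.
roundtrip = stacked.unstack("z")
```

The missing `('a', 1)` entry is why the values are promoted to float: NaN has no integer representation.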

TODO (maybe not necessary yet, but eventually):

- [x] Multi-index support working with .loc and .sel()
- [x] Multi-dimensional stack/unstack
- [ ] Serialization to NetCDF
- [ ] Better repr, showing level names/dtypes?
- [ ] Make levels accessible as coordinate variables (e.g., ds['time'] can pull out the 'time' level of a multi-index)
- [ ] Make isel_points/sel_points return objects with a MultiIndex? (probably after the previous TODO, so we can preserve basic backwards compatibility)
- [ ] Add set_index/reset_index/swaplevel to make it easier to create and manipulate multi-indexes

It would be nice to eventually build a full example showing how stack can be combined with lazy loading / dask to do out-of-core PCA on a large geophysical dataset (e.g., identify El Nino).
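As a rough illustration of that idea, the sketch below runs PCA on a small synthetic field using plain NumPy. Everything here is an assumption for illustration: the `(time, lat, lon)` layout, the array sizes, and the random data are hypothetical, and the computation is in-memory rather than lazy or out-of-core. The `reshape` calls stand in for what `stack` and `unstack` would do on a labeled array.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy field with dimensions (time, lat, lon): one dominant
# spatial pattern modulated in time, plus a little noise.
n_time, n_lat, n_lon = 50, 4, 5
pattern = rng.standard_normal((n_lat, n_lon))
amplitude = rng.standard_normal(n_time)
noise = 0.01 * rng.standard_normal((n_time, n_lat, n_lon))
field = amplitude[:, None, None] * pattern + noise

# "Stack" the two spatial dimensions into one, giving a (time, space) matrix.
matrix = field.reshape(n_time, n_lat * n_lon)

# PCA via SVD of the time-centered matrix.
anomalies = matrix - matrix.mean(axis=0)
u, sing, vt = np.linalg.svd(anomalies, full_matrices=False)
explained = sing**2 / np.sum(sing**2)

# "Unstack" the leading component back onto the (lat, lon) grid.
leading_pattern = vt[0].reshape(n_lat, n_lon)
```

The point of stacking is that PCA expects a two-dimensional samples-by-features matrix, while gridded data has several spatial dimensions; collapsing them into one axis and expanding them again afterwards keeps the component interpretable as a map.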

cc @MaximilianR @jreback @jhamman

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/702/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727
type: pull
