pull_requests: 65407870
id | node_id | number | state | locked | title | user | body | created_at | updated_at | closed_at | merged_at | merge_commit_sha | assignee | milestone | draft | head | base | author_association | auto_merge | repo | url | merged_by |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
65407870 | MDExOlB1bGxSZXF1ZXN0NjU0MDc4NzA= | 818 | closed | 0 | Multidimensional groupby | 1197350 | Many datasets have a two dimensional coordinate variable (e.g. longitude) which is different from the logical grid coordinates (e.g. nx, ny). (See #605.) For plotting purposes, this is solved by #608. However, we still might want to split / apply / combine over such coordinates. That has not been possible, because groupby only supports creating groups on one-dimensional arrays. This PR overcomes that issue by using `stack` to collapse multiple dimensions in the group variable.

A minimal example of the new functionality is

``` python
>>> da = xr.DataArray([[0,1],[2,3]], coords={'lon': (['ny','nx'], [[30,40],[40,50]] ), 'lat': (['ny','nx'], [[10,10],[20,20]] )}, dims=['ny','nx'])
>>> da.groupby('lon').sum()
<xarray.DataArray (lon: 3)>
array([0, 3, 3])
Coordinates:
  * lon      (lon) int64 30 40 50
```

This feature could have broad applicability for many realistic datasets (particularly model output on irregular grids): for example, averaging non-rectangular grids zonally (i.e. in latitude), binning in temperature, etc. If you think this is worth pursuing, I would love some feedback.

The PR is not complete. Some items to address are

- [x] Create a specialized grouper to allow coarser bins. By default, if no `grouper` is specified, the `GroupBy` object uses all unique values to define the groups. With a high resolution dataset, this could balloon to a huge number of groups. With the latitude example, we would like to be able to specify e.g. 1-degree bins. Usage would be `da.groupby('lon', bins=range(-90,90))`.
- [ ] Allow specification of which dims to stack. For example, stack in space but keep time dimension intact. (Currently it just stacks all the dimensions of the group variable.)
- [x] A nice example for the docs.
| 2016-04-06T04:14:37Z | 2016-07-31T23:02:59Z | 2016-07-08T01:50:38Z | 2016-07-08T01:50:38Z | a0a3860a87815f1f580aa56b972c7e8d9359b6ce | 0 | dc50064728cceade436c65b958f1b06a60e2eec7 | 0d0ae9d3e766c3af9dd98383ab3b33dfea9494dc | MEMBER | 13221727 | https://github.com/pydata/xarray/pull/818 |
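The stack-then-group idea described in the PR body can be sketched in plain Python, without xarray. This is only an illustration of the technique, not the code in the PR: the helper names `stack`, `group_sum`, and `group_sum_binned` are hypothetical, and the half-open bin convention `[low, high)` is an assumption made for this sketch.

```python
from bisect import bisect_right
from collections import defaultdict
from itertools import chain


def stack(grid):
    """Collapse a 2-D nested list into one flat list (row-major order)."""
    return list(chain.from_iterable(grid))


def group_sum(values, keys):
    """Sum `values` over each unique entry of a same-shaped 2-D `keys` grid."""
    totals = defaultdict(int)
    for value, key in zip(stack(values), stack(keys)):
        totals[key] += value
    return dict(sorted(totals.items()))


def group_sum_binned(values, keys, bins):
    """Like group_sum, but group by the bin [bins[i], bins[i + 1]) holding each key.

    The half-open convention is an assumption of this sketch.
    """
    totals = defaultdict(int)
    for value, key in zip(stack(values), stack(keys)):
        i = bisect_right(bins, key) - 1
        if 0 <= i < len(bins) - 1:  # skip keys outside the bin edges
            totals[(bins[i], bins[i + 1])] += value
    return dict(sorted(totals.items()))


# Same toy data as the example in the PR body.
data = [[0, 1], [2, 3]]
lon = [[30, 40], [40, 50]]
lat = [[10, 10], [20, 20]]

by_lon = group_sum(data, lon)
by_lat_bin = group_sum_binned(data, lat, bins=[0, 15, 30])

print(by_lon)      # {30: 0, 40: 3, 50: 3}
print(by_lat_bin)  # {(0, 15): 1, (15, 30): 5}
```

Grouping on unique values reproduces the `[0, 3, 3]` result shown in the PR body, while the binned variant shows why coarser bins keep the number of groups small on a high-resolution grid.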