issue_comments: 821902582
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/pull/5089#issuecomment-821902582 | https://api.github.com/repos/pydata/xarray/issues/5089 | 821902582 | MDEyOklzc3VlQ29tbWVudDgyMTkwMjU4Mg== | 5635139 | 2021-04-17T23:37:07Z | 2021-04-17T23:37:07Z | MEMBER | Hi @ahuang11, forgive the delay. We discussed this with the team on our call and think it would be a welcome addition, so thank you for contributing. I took another look through the tests and the behavior looks ideal when dimensioned coords are passed: ```python In [6]: da Out[6]: <xarray.DataArray (lat: 5, lon: 5)> array([[ 0, 0, 0, 0, 0], [ 0, 1, 2, 3, 4], [ 0, 2, 4, 6, 8], [ 0, 3, 6, 9, 12], [ 0, 4, 8, 12, 16]]) Coordinates: * lat (lat) int64 0 1 2 2 3 * lon (lon) int64 0 1 3 3 4 In [7]: result = da.drop_duplicate_coords(["lat", "lon"], keep='first') In [8]: result Out[8]: <xarray.DataArray (lat: 4, lon: 4)> array([[ 0, 0, 0, 0], [ 0, 1, 2, 4], [ 0, 2, 4, 8], [ 0, 4, 8, 16]]) Coordinates: * lat (lat) int64 0 1 2 3 * lon (lon) int64 0 1 3 4 ``` And I think this is also the best we can do for non-dimensioned coords. One thing I'd call out is that: a. The array is stacked for any non-dim coord with more than one dimension b. The supplied coord becomes the new dimensioned coord e.g.
Stacking: ```python In [12]: da Out[12]: <xarray.DataArray (init: 2, tau: 3)> array([[1, 2, 3], [4, 5, 6]]) Coordinates: * init (init) int64 0 1 * tau (tau) int64 1 2 3 valid (init, tau) int64 8 6 6 7 7 7 In [13]: da.drop_duplicate_coords("valid") Out[13]: <xarray.DataArray (valid: 3)> array([1, 2, 4]) Coordinates: * valid (valid) int64 8 6 7 init (valid) int64 0 0 1 tau (valid) int64 1 2 1 ``` Changing the dimensions: ```python In [16]: ( ...: da ...: .assign_coords(dict(zeta=(('tau'),[4,4,6]))) ...: .drop_duplicate_coords('zeta') ...: ) Out[16]: <xarray.DataArray (init: 2, zeta: 2)> array([[1, 3], [4, 6]]) Coordinates: * init (init) int64 0 1 valid (init, zeta) int64 8 6 7 7 * zeta (zeta) int64 4 6 tau (zeta) int64 1 3 ``` One peculiarity — though I think a necessary one — is that the order matters in some cases: ```python In [17]: ( ...: da ...: .assign_coords(dict(zeta=(('tau'),[4,4,6]))) ...: .drop_duplicate_coords(['zeta','valid']) ...: ) Out[17]: <xarray.DataArray (valid: 3)> array([1, 3, 4]) Coordinates: * valid (valid) int64 8 6 7 tau (valid) int64 1 3 1 init (valid) int64 0 0 1 zeta (valid) int64 4 6 4 In [18]: ( ...: da ...: .assign_coords(dict(zeta=(('tau'),[4,4,6]))) ...: .drop_duplicate_coords(['valid','zeta']) ...: ) Out[18]: <xarray.DataArray (zeta: 1)> array([1]) Coordinates: * zeta (zeta) int64 4 init (zeta) int64 0 tau (zeta) int64 1 valid (zeta) int64 8 ``` Unless anyone has any more thoughts, let's plan to merge this over the next few days. Thanks again @ahuang11 ! |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
842940980 |
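For reference, the dimensioned-coordinate case shown in the comment (`In [6]` to `Out[8]`) can be approximated with existing xarray and pandas calls. This is only a minimal sketch, not the PR's implementation: the helper name `drop_duplicate_index_values` is hypothetical, and it handles only 1-D dimension coordinates, not the stacking behavior described for non-dimensioned coords.

```python
import numpy as np
import xarray as xr


def drop_duplicate_index_values(obj, dims, keep="first"):
    """Hypothetical helper: drop positions whose index value is duplicated.

    Assumes every name in ``dims`` is a dimension with a 1-D index coordinate.
    """
    for dim in dims:
        # pandas.Index.duplicated marks repeated labels; invert it to keep the rest.
        keep_mask = ~obj.get_index(dim).duplicated(keep=keep)
        # Convert the boolean mask to integer positions for positional indexing.
        obj = obj.isel({dim: np.flatnonzero(keep_mask)})
    return obj


# Rebuild the array from the comment: values are the outer product of 0..4.
da = xr.DataArray(
    np.outer(np.arange(5), np.arange(5)),
    coords={"lat": [0, 1, 2, 2, 3], "lon": [0, 1, 3, 3, 4]},
    dims=("lat", "lon"),
)

result = drop_duplicate_index_values(da, ["lat", "lon"], keep="first")
print(result)
```

With `keep="first"`, the second occurrence of `lat == 2` and of `lon == 3` is dropped, leaving a 4 x 4 array that matches `Out[8]` in the comment.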