html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/3957#issuecomment-868260575,https://api.github.com/repos/pydata/xarray/issues/3957,868260575,MDEyOklzc3VlQ29tbWVudDg2ODI2MDU3NQ==,40410085,2021-06-25T06:33:47Z,2021-06-25T07:45:20Z,NONE,"Hello @zxdawn and @JavierRuano: I am new to python. And I've been working on a different approach to this issue of ranking data on 3D array. I believe I am close to a solution. I am able to generate ranks on a 3D array, but I can't figure out how to map those ranks to reorder the data from lowest to highest using the produced indexes. Perhaps you might know what needs to happen next? Cheers, -Jonathan code: import xarray as xr import os import numpy as np from xarray import DataArray from dask.distributed import Client c = Client() I am trying to produce a simpler version: # coding: utf-8 # In[1]: import xarray as xr import os import numpy as np from xarray import DataArray # In[2]: from dask.distributed import Client c = Client() # In[19]: def calculate_rank(x): return x.rank(dim='time') # In[4]: lat = 2 # In[5]: lon = 3 # In[6]: time = 5 # In[7]: data = [[[ 29, 19, 8], [ 12, 7, 21]], [[ 3, 4, 2], [ 18, 10, 24]], [[6, 28, 14], [15, 16, 17]], [[9, 1, 20], [5, 27, 26]], [[11, 25, 23], [22, 13, 0]]] # In[8]: data_xr = xr.DataArray(data, dims=['time', 'lat', 'lon'], coords={'time': np.arange(time)}) # In[9]: data_xr.values # Groupby on all of the data # In[10]: stacked_object = data_xr.stack(gridcell=['lat','lon'])#.chunk({'gridcell':500}) # In[11]: stacked_object.load() # In[20]: TSA_Rank = stacked_object.groupby('gridcell').apply(calculate_rank).unstack() # In[21]: TSA_Rank # In[22]: TSA_Rank.values array([[[5., 3., 2.], [2., 1., 3.]], [[1., 2., 1.], [4., 2., 4.]], [[2., 5., 3.], [3., 4., 2.]], [[3., 1., 4.], [1., 5., 5.]], [[4., 4., 5.], [5., 3., 1.]]]) # In[23]: data_xr.values array([[[29, 19, 8], [12, 7, 21]], [[ 3, 4, 
2], [18, 10, 24]], [[ 6, 28, 14], [15, 16, 17]], [[ 9, 1, 20], [ 5, 27, 26]], [[11, 25, 23], [22, 13, 0]]])","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599 https://github.com/pydata/xarray/issues/3957#issuecomment-707744782,https://api.github.com/repos/pydata/xarray/issues/3957,707744782,MDEyOklzc3VlQ29tbWVudDcwNzc0NDc4Mg==,30388627,2020-10-13T13:38:34Z,2020-10-13T13:38:34Z,NONE,"@JavierRuano I find the simpler solution from a similar question in [stack overflow](https://stackoverflow.com/a/53386129). ``` sort_pair = np.take_along_axis(pair.values, cld.argsort(axis=0), axis=0) ``` ## Complete example ``` import xarray as xr import numpy as np x = 4 y = 2 z = 4 data = np.arange(x*y*z).reshape(z, y, x) # 3d array with coords cld_1 = xr.DataArray(data, dims=['z', 'y', 'x'], coords={'z': np.arange(z)}) # 2d array without coords cld_2 = xr.DataArray(np.arange(x*y).reshape(y, x)*1.5+1, dims=['y', 'x']) # expand 2d to 3d cld_2 = cld_2.expand_dims(z=[4]) # concat cld = xr.concat([cld_1, cld_2], dim='z') # paired array pair = cld.copy(data=np.arange(x*y*(z+1)).reshape(z+1, y, x)) sort_pair = np.take_along_axis(pair.values, cld.argsort(axis=0), axis=0) print(cld) print(pair) print(sort_pair) ``` **Output**: ``` array([[[ 0. , 1. , 2. , 3. ], [ 4. , 5. , 6. , 7. ]], [[ 8. , 9. , 10. , 11. ], [12. , 13. , 14. , 15. ]], [[16. , 17. , 18. , 19. ], [20. , 21. , 22. , 23. ]], [[24. , 25. , 26. , 27. ], [28. , 29. , 30. , 31. ]], [[ 1. , 2.5, 4. , 5.5], [ 7. , 8.5, 10. 
, 11.5]]]) Coordinates: * z (z) int64 0 1 2 3 4 Dimensions without coordinates: y, x array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7]], [[ 8, 9, 10, 11], [12, 13, 14, 15]], [[16, 17, 18, 19], [20, 21, 22, 23]], [[24, 25, 26, 27], [28, 29, 30, 31]], [[32, 33, 34, 35], [36, 37, 38, 39]]]) Coordinates: * z (z) int64 0 1 2 3 4 Dimensions without coordinates: y, x [[[ 0 1 2 3] [ 4 5 6 7]] [[32 33 34 35] [36 37 38 39]] [[ 8 9 10 11] [12 13 14 15]] [[16 17 18 19] [20 21 22 23]] [[24 25 26 27] ``` Note, I have to use `pair.values` instead of `pair` in the last sorting step. Otherwise, I will get this error: ``` IndexError: Unlabeled multi-dimensional array cannot be used for indexing: y ```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599 https://github.com/pydata/xarray/issues/3957#issuecomment-611483929,https://api.github.com/repos/pydata/xarray/issues/3957,611483929,MDEyOklzc3VlQ29tbWVudDYxMTQ4MzkyOQ==,30388627,2020-04-09T11:43:51Z,2020-04-09T11:43:51Z,NONE,I need to use `df.index = pd.MultiIndex.from_arrays(.....)`. See https://github.com/pandas-dev/pandas/issues/33420,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599 https://github.com/pydata/xarray/issues/3957#issuecomment-611348892,https://api.github.com/repos/pydata/xarray/issues/3957,611348892,MDEyOklzc3VlQ29tbWVudDYxMTM0ODg5Mg==,30388627,2020-04-09T06:13:07Z,2020-04-09T06:13:07Z,NONE,"@JavierRuano When the `dataframe` is converted back to `dataset`, the values aren't changed because of the unchanged `Multiindex` in `dataframe` ... I have tried this: ``` df = ds.to_dataframe() new_df = df.sort_values(by=['x', 'y', 'cld']) new_df.index.set_levels(list(np.arange(ds['cld'].sizes['z'])), level='z', inplace=True) ``` But, it doesn't work. 
Still trying ...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599 https://github.com/pydata/xarray/issues/3957#issuecomment-611302006,https://api.github.com/repos/pydata/xarray/issues/3957,611302006,MDEyOklzc3VlQ29tbWVudDYxMTMwMjAwNg==,34353851,2020-04-09T03:04:53Z,2020-04-09T03:04:53Z,NONE,"Yes, but with a lot of information, dask is the only option, and it works well with the index. https://github.com/dask/dask/issues/958 On Thu, Apr 9, 2020 at 2:54, Xin Zhang wrote: > @JavierRuano Nice suggestion! I combine > them to dataset, convert it to dataframe and then sort_values. Finally, > convert the dataframe back to dataset: > > ds = cld.to_dataset(name='cld') > ds['pair'] = pair > > df = ds.to_dataframe() > new_ds = df.sort_values(by='cld').to_xarray().transpose() > ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599 https://github.com/pydata/xarray/issues/3957#issuecomment-611299453,https://api.github.com/repos/pydata/xarray/issues/3957,611299453,MDEyOklzc3VlQ29tbWVudDYxMTI5OTQ1Mw==,30388627,2020-04-09T02:54:30Z,2020-04-09T02:54:30Z,NONE,"@JavierRuano Nice suggestion! I combine them to `dataset`, convert it to `dataframe` and then `sort_values`. 
Finally, convert the `dataframe` back to `dataset`: ``` ds = cld.to_dataset(name='cld') ds['pair'] = pair df = ds.to_dataframe() new_ds = df.sort_values(by='cld').to_xarray().transpose() ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599 https://github.com/pydata/xarray/issues/3957#issuecomment-611295039,https://api.github.com/repos/pydata/xarray/issues/3957,611295039,MDEyOklzc3VlQ29tbWVudDYxMTI5NTAzOQ==,34353851,2020-04-09T02:36:56Z,2020-04-09T02:36:56Z,NONE,"You can access the data directly as an ndarray, or convert the DataArray into a pandas DataFrame. Pandas has sort_values. You were looking to sort values along z; that is shown in the z index. With more DataArrays, you could read about the Dataset concept... but I don't develop xarray, I am only a user of that module, so perhaps you are looking for another type of answer. http://xarray.pydata.org/en/stable/generated/xarray.Dataset.sortby.html sorts according to the values of 1-D DataArrays that share a dimension with the calling object. On Thu, Apr 9, 2020, 4:22, Xin Zhang wrote: > @JavierRuano Thank you very much. This > example is a special case. If the order of z is different for each x and y, > do we need to create a tmp DataArray to save the result of looping x and y > ? > ","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599 https://github.com/pydata/xarray/issues/3957#issuecomment-611291129,https://api.github.com/repos/pydata/xarray/issues/3957,611291129,MDEyOklzc3VlQ29tbWVudDYxMTI5MTEyOQ==,30388627,2020-04-09T02:22:33Z,2020-04-09T02:22:33Z,NONE,"@JavierRuano Thank you very much. This example is a special case. 
If the order of `z` is different for each `x` and `y`, do we need to create a tmp DataArray to save the result of looping `x` and `y`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599 https://github.com/pydata/xarray/issues/3957#issuecomment-611047964,https://api.github.com/repos/pydata/xarray/issues/3957,611047964,MDEyOklzc3VlQ29tbWVudDYxMTA0Nzk2NA==,34353851,2020-04-08T16:08:00Z,2020-04-08T16:08:00Z,NONE,"cld.reindex(z=cld[:,0,0].sortby(cld[:,0,0]).z) with this solution [0] [1] array([[[ 0. , 1. , 2. , 3. ], [ 4. , 5. , 6. , 7. ]], [[ 1. , 2.5, 4. , 5.5], [ 7. , 8.5, 10. , 11.5]], [[ 8. , 9. , 10. , 11. ], [12. , 13. , 14. , 15. ]], [[16. , 17. , 18. , 19. ], [20. , 21. , 22. , 23. ]], [[24. , 25. , 26. , 27. ], [28. , 29. , 30. , 31. ]]]) Coordinates: * z (z) int64 0 4 1 2 3 Dimensions without coordinates: y, x [0] https://stackoverflow.com/questions/41077393/how-to-sort-the-index-of-a-xarray-dataset-dataarray [1] https://github.com/pydata/xarray/issues/967 El mié., 8 abr. 2020 a las 14:06, Xin Zhang () escribió: > .sortby() only supports sorting DataArray by coords values. I'm trying to > sort one DataArray (cld) by data values along one dim and sort another > DataArray (pair) by the same order. > MCVE Code Sample > > import xarray as xrimport numpy as np > > x = 4 > y = 2 > z = 4 > data = np.arange(x*y*z).reshape(z, y, x) > # 3d array with coords > cld_1 = xr.DataArray(data, dims=['z', 'y', 'x'], coords={'z': np.arange(z)}) > # 2d array without coords > cld_2 = xr.DataArray(np.arange(x*y).reshape(y, x)*1.5+1, dims=['y', 'x']) > # expand 2d to 3d > cld_2 = cld_2.expand_dims(z=[4]) > # concat > cld = xr.concat([cld_1, cld_2], dim='z') > # paired array > pair = cld.copy(data=np.arange(x*y*(z+1)).reshape(z+1, y, x)) > print(cld)print(pair) > > Output > > > array([[[ 0. , 1. , 2. , 3. ], > [ 4. , 5. , 6. , 7. ]], > > [[ 8. , 9. , 10. , 11. ], > [12. , 13. , 14. , 15. 
]], > > [[16. , 17. , 18. , 19. ], > [20. , 21. , 22. , 23. ]], > > [[24. , 25. , 26. , 27. ], > [28. , 29. , 30. , 31. ]], > > [[ 1. , 2.5, 4. , 5.5], > [ 7. , 8.5, 10. , 11.5]]]) > Coordinates: > * z (z) int64 0 1 2 3 4 > Dimensions without coordinates: y, x > > > array([[[ 0, 1, 2, 3], > [ 4, 5, 6, 7]], > > [[ 8, 9, 10, 11], > [12, 13, 14, 15]], > > [[16, 17, 18, 19], > [20, 21, 22, 23]], > > [[24, 25, 26, 27], > [28, 29, 30, 31]], > > [[32, 33, 34, 35], > [36, 37, 38, 39]]]) > Coordinates: > * z (z) int64 0 1 2 3 4 > Dimensions without coordinates: y, x > > Problem Description > > I've tried argsort(): cld.argsort(axis=0), but the result is wrong: > > > array([[[0, 0, 0, 0], > [0, 0, 0, 0]], > > [[4, 4, 4, 4], > [4, 4, 4, 4]], > > [[1, 1, 1, 1], > [1, 1, 1, 1]], > > [[2, 2, 2, 2], > [2, 2, 2, 2]], > > [[3, 3, 3, 3], > [3, 3, 3, 3]]], dtype=int64) > Coordinates: > * z (z) int64 0 1 2 3 4 > Dimensions without coordinates: y, x > ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599