html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3957#issuecomment-868260575,https://api.github.com/repos/pydata/xarray/issues/3957,868260575,MDEyOklzc3VlQ29tbWVudDg2ODI2MDU3NQ==,40410085,2021-06-25T06:33:47Z,2021-06-25T07:45:20Z,NONE,"Hello @zxdawn and @JavierRuano:
I am new to Python and have been working on a different approach to this issue of ranking data in a 3D array. I believe I am close to a solution: I can generate ranks along the time dimension of a 3D array, but I can't figure out how to use those ranks to reorder the data from lowest to highest. Perhaps you know what needs to happen next?
Cheers,
-Jonathan
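One way to get from ranks to reordered data, offered as a minimal sketch rather than a definitive answer: a rank of r along `time` means the value belongs at position r - 1 of the sorted series, so the ranks can serve as scatter indices. The sketch uses plain NumPy, builds `ranks` with a double `argsort` as a stand-in for `TSA_Rank.values`, and assumes there are no ties or NaNs (tied ranks would be fractional and could not be used as indices). The names `sorted_by_rank` and `sorted_by_order` are made up for illustration.

```python
import numpy as np

# Same 5 x 2 x 3 (time, lat, lon) values as in the comment
data = np.array([[[29, 19, 8], [12, 7, 21]],
                 [[3, 4, 2], [18, 10, 24]],
                 [[6, 28, 14], [15, 16, 17]],
                 [[9, 1, 20], [5, 27, 26]],
                 [[11, 25, 23], [22, 13, 0]]])

# order[k] holds, per grid cell, the time index of the k-th smallest value
order = data.argsort(axis=0)

# 1-based ranks along time; stands in for TSA_Rank.values
# (ranks coming from xarray are floats and would need .astype(int))
ranks = order.argsort(axis=0) + 1

# Scatter: the value with rank r goes to position r - 1 along time
sorted_by_rank = np.empty_like(data)
np.put_along_axis(sorted_by_rank, ranks - 1, data, axis=0)

# Gather: indexing with the argsort result gives the same array
# without computing ranks at all
sorted_by_order = np.take_along_axis(data, order, axis=0)

print(sorted_by_rank[:, 0, 0])
```

If only the sorted values are needed, `np.sort(data, axis=0)` returns the same array; the index arrays matter when a second array has to be reordered in the same way.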
Here is a simplified version of my code (exported from a Jupyter notebook):
# coding: utf-8
# In[1]:
import xarray as xr
import os
import numpy as np
from xarray import DataArray
# In[2]:
from dask.distributed import Client
c = Client()
# In[19]:
def calculate_rank(x):
    # Rank the values of one grid cell along the time dimension (1 = smallest)
    return x.rank(dim='time')
# In[4]:
lat = 2
# In[5]:
lon = 3
# In[6]:
time = 5
# In[7]:
data = [[[ 29, 19, 8],
[ 12, 7, 21]],
[[ 3, 4, 2],
[ 18, 10, 24]],
[[6, 28, 14],
[15, 16, 17]],
[[9, 1, 20],
[5, 27, 26]],
[[11, 25, 23],
[22, 13, 0]]]
# In[8]:
data_xr = xr.DataArray(data, dims=['time', 'lat', 'lon'], coords={'time': np.arange(time)})
# In[9]:
data_xr.values
# Groupby on all of the data
# In[10]:
stacked_object = data_xr.stack(gridcell=['lat','lon'])#.chunk({'gridcell':500})
# In[11]:
stacked_object.load()
# In[20]:
TSA_Rank = stacked_object.groupby('gridcell').apply(calculate_rank).unstack()
# In[21]:
TSA_Rank
# In[22]:
TSA_Rank.values
array([[[5., 3., 2.],
[2., 1., 3.]],
[[1., 2., 1.],
[4., 2., 4.]],
[[2., 5., 3.],
[3., 4., 2.]],
[[3., 1., 4.],
[1., 5., 5.]],
[[4., 4., 5.],
[5., 3., 1.]]])
# In[23]:
data_xr.values
array([[[29, 19, 8],
[12, 7, 21]],
[[ 3, 4, 2],
[18, 10, 24]],
[[ 6, 28, 14],
[15, 16, 17]],
[[ 9, 1, 20],
[ 5, 27, 26]],
[[11, 25, 23],
[22, 13, 0]]])","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599
https://github.com/pydata/xarray/issues/3957#issuecomment-707744782,https://api.github.com/repos/pydata/xarray/issues/3957,707744782,MDEyOklzc3VlQ29tbWVudDcwNzc0NDc4Mg==,30388627,2020-10-13T13:38:34Z,2020-10-13T13:38:34Z,NONE,"@JavierRuano I found a simpler solution in an answer to a similar question on [Stack Overflow](https://stackoverflow.com/a/53386129):
```
sort_pair = np.take_along_axis(pair.values, cld.argsort(axis=0), axis=0)
```
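To make the one-liner easier to follow, here is a small sketch on a 3 x 2 array where the sort order differs per column. It also wraps the sorted NumPy result back into a DataArray with `copy(data=...)`, which is my own addition and not part of the linked answer; the toy values and the names `order`, `sorted_cld` and `sorted_pair` are made up for illustration.

```python
import numpy as np
import xarray as xr

# Toy arrays: the order along z is different in each x column
cld = xr.DataArray(np.array([[3.0, 1.0],
                             [1.0, 2.0],
                             [2.0, 0.0]]), dims=['z', 'x'])
pair = xr.DataArray(np.array([[10, 11],
                              [20, 21],
                              [30, 31]]), dims=['z', 'x'])

# Per-column indices that sort cld along z
axis = cld.get_axis_num('z')
order = np.argsort(cld.values, axis=axis)

# Apply the same indices to both arrays, then re-wrap as DataArrays
sorted_cld = cld.copy(data=np.take_along_axis(cld.values, order, axis=axis))
sorted_pair = pair.copy(data=np.take_along_axis(pair.values, order, axis=axis))

print(sorted_cld.values)
print(sorted_pair.values)
```

One caveat with re-wrapping: if `z` carries coordinate values, they no longer identify the original levels after a per-column sort, so it may be cleaner to drop or replace that coordinate.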
## Complete example
```
import xarray as xr
import numpy as np
x = 4
y = 2
z = 4
data = np.arange(x*y*z).reshape(z, y, x)
# 3d array with coords
cld_1 = xr.DataArray(data, dims=['z', 'y', 'x'], coords={'z': np.arange(z)})
# 2d array without coords
cld_2 = xr.DataArray(np.arange(x*y).reshape(y, x)*1.5+1, dims=['y', 'x'])
# expand 2d to 3d
cld_2 = cld_2.expand_dims(z=[4])
# concat
cld = xr.concat([cld_1, cld_2], dim='z')
# paired array
pair = cld.copy(data=np.arange(x*y*(z+1)).reshape(z+1, y, x))
sort_pair = np.take_along_axis(pair.values, cld.argsort(axis=0), axis=0)
print(cld)
print(pair)
print(sort_pair)
```
**Output**:
```
array([[[ 0. , 1. , 2. , 3. ],
[ 4. , 5. , 6. , 7. ]],
[[ 8. , 9. , 10. , 11. ],
[12. , 13. , 14. , 15. ]],
[[16. , 17. , 18. , 19. ],
[20. , 21. , 22. , 23. ]],
[[24. , 25. , 26. , 27. ],
[28. , 29. , 30. , 31. ]],
[[ 1. , 2.5, 4. , 5.5],
[ 7. , 8.5, 10. , 11.5]]])
Coordinates:
* z (z) int64 0 1 2 3 4
Dimensions without coordinates: y, x
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23]],
[[24, 25, 26, 27],
[28, 29, 30, 31]],
[[32, 33, 34, 35],
[36, 37, 38, 39]]])
Coordinates:
* z (z) int64 0 1 2 3 4
Dimensions without coordinates: y, x
[[[ 0 1 2 3]
[ 4 5 6 7]]
[[32 33 34 35]
[36 37 38 39]]
[[ 8 9 10 11]
[12 13 14 15]]
[[16 17 18 19]
[20 21 22 23]]
[[24 25 26 27]
[28 29 30 31]]]
```
Note that I have to pass `pair.values` rather than `pair` to `np.take_along_axis` in the last sorting step.
Otherwise, I get this error:
```
IndexError: Unlabeled multi-dimensional array cannot be used for indexing: y
```","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599
https://github.com/pydata/xarray/issues/3957#issuecomment-611483929,https://api.github.com/repos/pydata/xarray/issues/3957,611483929,MDEyOklzc3VlQ29tbWVudDYxMTQ4MzkyOQ==,30388627,2020-04-09T11:43:51Z,2020-04-09T11:43:51Z,NONE,I need to use `df.index = pd.MultiIndex.from_arrays(.....)`. See https://github.com/pandas-dev/pandas/issues/33420,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599
https://github.com/pydata/xarray/issues/3957#issuecomment-611348892,https://api.github.com/repos/pydata/xarray/issues/3957,611348892,MDEyOklzc3VlQ29tbWVudDYxMTM0ODg5Mg==,30388627,2020-04-09T06:13:07Z,2020-04-09T06:13:07Z,NONE,"@JavierRuano When the `dataframe` is converted back to `dataset`, the values aren't changed because of the unchanged `Multiindex` in `dataframe` ... I have tried this:
```
df = ds.to_dataframe()
new_df = df.sort_values(by=['x', 'y', 'cld'])
new_df.index.set_levels(list(np.arange(ds['cld'].sizes['z'])),
level='z', inplace=True)
```
But, it doesn't work. Still trying ...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599
https://github.com/pydata/xarray/issues/3957#issuecomment-611302006,https://api.github.com/repos/pydata/xarray/issues/3957,611302006,MDEyOklzc3VlQ29tbWVudDYxMTMwMjAwNg==,34353851,2020-04-09T03:04:53Z,2020-04-09T03:04:53Z,NONE,"Yes, but with a lot of information, dask is the only option, and working
well with the index.
https://github.com/dask/dask/issues/958
On Thu, Apr 9, 2020 at 2:54, Xin Zhang wrote:
> @JavierRuano Nice suggestion! I combine
> them to dataset, convert it to dataframe and then sort_values. Finally,
> convert the dataframe back to dataset:
>
> ds = cld.to_dataset(name='cld')
> ds['pair'] = pair
>
> df = ds.to_dataframe()
> new_ds = df.sort_values(by='cld').to_xarray().transpose()
>
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599
https://github.com/pydata/xarray/issues/3957#issuecomment-611299453,https://api.github.com/repos/pydata/xarray/issues/3957,611299453,MDEyOklzc3VlQ29tbWVudDYxMTI5OTQ1Mw==,30388627,2020-04-09T02:54:30Z,2020-04-09T02:54:30Z,NONE,"@JavierRuano Nice suggestion! I combine them to `dataset`, convert it to `dataframe` and then `sort_values`. Finally, convert the `dataframe` back to `dataset`:
```
ds = cld.to_dataset(name='cld')
ds['pair'] = pair
df = ds.to_dataframe()
new_ds = df.sort_values(by='cld').to_xarray().transpose()
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599
https://github.com/pydata/xarray/issues/3957#issuecomment-611295039,https://api.github.com/repos/pydata/xarray/issues/3957,611295039,MDEyOklzc3VlQ29tbWVudDYxMTI5NTAzOQ==,34353851,2020-04-09T02:36:56Z,2020-04-09T02:36:56Z,NONE,"You could access the data directly as an ndarray, or convert the DataArray
to a pandas DataFrame; pandas has `sort_values`.
You were looking to sort values along z; the resulting order is shown in the z index.
With more than one DataArray, you could read about the Dataset concept.
I don't develop xarray, though; I am only a user of the module, so perhaps you
are looking for a different kind of answer.
http://xarray.pydata.org/en/stable/generated/xarray.Dataset.sortby.html
sorts according to the values of 1-D DataArrays that share a dimension with the
calling object.
On Thu, Apr 9, 2020 at 4:22, Xin Zhang wrote:
> @JavierRuano Thank you very much. This
> example is a special case. If the order of z is different for each x and y,
> do we need to create a tmp DataArray to save the result of looping x and y
> ?
>
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599
https://github.com/pydata/xarray/issues/3957#issuecomment-611291129,https://api.github.com/repos/pydata/xarray/issues/3957,611291129,MDEyOklzc3VlQ29tbWVudDYxMTI5MTEyOQ==,30388627,2020-04-09T02:22:33Z,2020-04-09T02:22:33Z,NONE,"@JavierRuano Thank you very much. This example is a special case. If the order of `z` is different for each `x` and `y`, do we need to create a tmp DataArray to save the result of looping `x` and `y`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599
https://github.com/pydata/xarray/issues/3957#issuecomment-611047964,https://api.github.com/repos/pydata/xarray/issues/3957,611047964,MDEyOklzc3VlQ29tbWVudDYxMTA0Nzk2NA==,34353851,2020-04-08T16:08:00Z,2020-04-08T16:08:00Z,NONE,"cld.reindex(z=cld[:,0,0].sortby(cld[:,0,0]).z)
with this solution [0] [1]
array([[[ 0. , 1. , 2. , 3. ],
[ 4. , 5. , 6. , 7. ]],
[[ 1. , 2.5, 4. , 5.5],
[ 7. , 8.5, 10. , 11.5]],
[[ 8. , 9. , 10. , 11. ],
[12. , 13. , 14. , 15. ]],
[[16. , 17. , 18. , 19. ],
[20. , 21. , 22. , 23. ]],
[[24. , 25. , 26. , 27. ],
[28. , 29. , 30. , 31. ]]])
Coordinates:
* z (z) int64 0 4 1 2 3
Dimensions without coordinates: y, x
[0] https://stackoverflow.com/questions/41077393/how-to-sort-the-index-of-a-xarray-dataset-dataarray
[1] https://github.com/pydata/xarray/issues/967
On Wed, Apr 8, 2020 at 14:06, Xin Zhang wrote:
> .sortby() only supports sorting DataArray by coords values. I'm trying to
> sort one DataArray (cld) by data values along one dim and sort another
> DataArray (pair) by the same order.
> MCVE Code Sample
>
> import xarray as xr
> import numpy as np
>
> x = 4
> y = 2
> z = 4
> data = np.arange(x*y*z).reshape(z, y, x)
> # 3d array with coords
> cld_1 = xr.DataArray(data, dims=['z', 'y', 'x'], coords={'z': np.arange(z)})
> # 2d array without coords
> cld_2 = xr.DataArray(np.arange(x*y).reshape(y, x)*1.5+1, dims=['y', 'x'])
> # expand 2d to 3d
> cld_2 = cld_2.expand_dims(z=[4])
> # concat
> cld = xr.concat([cld_1, cld_2], dim='z')
> # paired array
> pair = cld.copy(data=np.arange(x*y*(z+1)).reshape(z+1, y, x))
> print(cld)
> print(pair)
>
> Output
>
>
> array([[[ 0. , 1. , 2. , 3. ],
> [ 4. , 5. , 6. , 7. ]],
>
> [[ 8. , 9. , 10. , 11. ],
> [12. , 13. , 14. , 15. ]],
>
> [[16. , 17. , 18. , 19. ],
> [20. , 21. , 22. , 23. ]],
>
> [[24. , 25. , 26. , 27. ],
> [28. , 29. , 30. , 31. ]],
>
> [[ 1. , 2.5, 4. , 5.5],
> [ 7. , 8.5, 10. , 11.5]]])
> Coordinates:
> * z (z) int64 0 1 2 3 4
> Dimensions without coordinates: y, x
>
>
> array([[[ 0, 1, 2, 3],
> [ 4, 5, 6, 7]],
>
> [[ 8, 9, 10, 11],
> [12, 13, 14, 15]],
>
> [[16, 17, 18, 19],
> [20, 21, 22, 23]],
>
> [[24, 25, 26, 27],
> [28, 29, 30, 31]],
>
> [[32, 33, 34, 35],
> [36, 37, 38, 39]]])
> Coordinates:
> * z (z) int64 0 1 2 3 4
> Dimensions without coordinates: y, x
>
> Problem Description
>
> I've tried argsort(): cld.argsort(axis=0), but the result is wrong:
>
>
> array([[[0, 0, 0, 0],
> [0, 0, 0, 0]],
>
> [[4, 4, 4, 4],
> [4, 4, 4, 4]],
>
> [[1, 1, 1, 1],
> [1, 1, 1, 1]],
>
> [[2, 2, 2, 2],
> [2, 2, 2, 2]],
>
> [[3, 3, 3, 3],
> [3, 3, 3, 3]]], dtype=int64)
> Coordinates:
> * z (z) int64 0 1 2 3 4
> Dimensions without coordinates: y, x
>
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,596606599