issue_comments


9 rows where author_association = "NONE" and issue = 596606599 sorted by updated_at descending


user 3

  • zxdawn 5
  • JavierRuano 3
  • jrbuzan 1

issue 1

  • Sort DataArray by data values along one dim · 9 ✖

author_association 1

  • NONE · 9 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
868260575 https://github.com/pydata/xarray/issues/3957#issuecomment-868260575 https://api.github.com/repos/pydata/xarray/issues/3957 MDEyOklzc3VlQ29tbWVudDg2ODI2MDU3NQ== jrbuzan 40410085 2021-06-25T06:33:47Z 2021-06-25T07:45:20Z NONE

Hello @zxdawn and @JavierRuano:

I am new to Python, and I've been working on a different approach to this issue of ranking data in a 3D array. I believe I am close to a solution. I am able to generate ranks on a 3D array, but I can't figure out how to use those ranks as indexes to reorder the data from lowest to highest. Perhaps you know what needs to happen next?

Cheers, -Jonathan

code:

    import xarray as xr
    import os
    import numpy as np
    from xarray import DataArray
    from dask.distributed import Client
    c = Client()

I am trying to produce a simpler version:

    # coding: utf-8

    # In[1]:
    import xarray as xr
    import os
    import numpy as np
    from xarray import DataArray

    # In[2]:
    from dask.distributed import Client
    c = Client()

    # In[19]:
    def calculate_rank(x):
        return x.rank(dim='time')

    # In[4]:
    lat = 2

    # In[5]:
    lon = 3

    # In[6]:
    time = 5

    # In[7]:
    data = [[[29, 19,  8],
             [12,  7, 21]],

            [[ 3,  4,  2],
             [18, 10, 24]],

            [[ 6, 28, 14],
             [15, 16, 17]],

            [[ 9,  1, 20],
             [ 5, 27, 26]],

            [[11, 25, 23],
             [22, 13,  0]]]

    # In[8]:
    data_xr = xr.DataArray(data, dims=['time', 'lat', 'lon'], coords={'time': np.arange(time)})

    # In[9]:
    data_xr.values

    # Groupby on all of the data

    # In[10]:
    stacked_object = data_xr.stack(gridcell=['lat','lon'])#.chunk({'gridcell':500})

    # In[11]:
    stacked_object.load()

    # In[20]:
    TSA_Rank = stacked_object.groupby('gridcell').apply(calculate_rank).unstack()

    # In[21]:
    TSA_Rank

    # In[22]:
    TSA_Rank.values

    array([[[5., 3., 2.],
            [2., 1., 3.]],

           [[1., 2., 1.],
            [4., 2., 4.]],

           [[2., 5., 3.],
            [3., 4., 2.]],

           [[3., 1., 4.],
            [1., 5., 5.]],

           [[4., 4., 5.],
            [5., 3., 1.]]])

    # In[23]:
    data_xr.values

    array([[[29, 19,  8],
            [12,  7, 21]],

           [[ 3,  4,  2],
            [18, 10, 24]],

           [[ 6, 28, 14],
            [15, 16, 17]],

           [[ 9,  1, 20],
            [ 5, 27, 26]],

           [[11, 25, 23],
            [22, 13,  0]]])
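
One possible next step, sketched with plain NumPy rather than the xarray/dask pipeline above: when the ranks along time are 1-based and there are no ties, rank - 1 is the position each value should occupy after sorting, so the values can be scattered into place with np.put_along_axis. The double argsort used here to compute ranks is only a stand-in for x.rank(dim='time'); it is an assumption for illustration, not the code from the comment.

```python
import numpy as np

# Same (time, lat, lon) values as the data above.
data = np.array([
    [[29, 19, 8], [12, 7, 21]],
    [[3, 4, 2], [18, 10, 24]],
    [[6, 28, 14], [15, 16, 17]],
    [[9, 1, 20], [5, 27, 26]],
    [[11, 25, 23], [22, 13, 0]],
])

# Stand-in for the rank step: argsort of argsort gives 0-based ranks
# along time when there are no ties; add 1 to get 1-based ranks.
order = np.argsort(data, axis=0)
ranks = np.argsort(order, axis=0) + 1

# rank - 1 is the destination index along time for each value.
sorted_by_rank = np.empty_like(data)
np.put_along_axis(sorted_by_rank, ranks - 1, data, axis=0)

# Equivalent shortcut that skips the ranks entirely.
sorted_by_order = np.take_along_axis(data, order, axis=0)
```

If ties are possible, ranks computed by averaging would no longer be valid integer positions, so the argsort route is probably the safer of the two.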
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sort DataArray by data values along one dim 596606599
707744782 https://github.com/pydata/xarray/issues/3957#issuecomment-707744782 https://api.github.com/repos/pydata/xarray/issues/3957 MDEyOklzc3VlQ29tbWVudDcwNzc0NDc4Mg== zxdawn 30388627 2020-10-13T13:38:34Z 2020-10-13T13:38:34Z NONE

@JavierRuano I found a simpler solution in a similar question on Stack Overflow:

sort_pair = np.take_along_axis(pair.values, cld.argsort(axis=0), axis=0)

Complete example

```
import xarray as xr
import numpy as np

x = 4
y = 2
z = 4

data = np.arange(x*y*z).reshape(z, y, x)

# 3d array with coords
cld_1 = xr.DataArray(data, dims=['z', 'y', 'x'], coords={'z': np.arange(z)})

# 2d array without coords
cld_2 = xr.DataArray(np.arange(x*y).reshape(y, x)*1.5+1, dims=['y', 'x'])

# expand 2d to 3d
cld_2 = cld_2.expand_dims(z=[4])

# concat
cld = xr.concat([cld_1, cld_2], dim='z')

# paired array
pair = cld.copy(data=np.arange(x*y*(z+1)).reshape(z+1, y, x))

# reorder pair along z using the indices that sort cld
sort_pair = np.take_along_axis(pair.values, cld.argsort(axis=0), axis=0)

print(cld)
print(pair)
print(sort_pair)
```

Output:
```
<xarray.DataArray (z: 5, y: 2, x: 4)>
array([[[ 0. ,  1. ,  2. ,  3. ],
        [ 4. ,  5. ,  6. ,  7. ]],

       [[ 8. ,  9. , 10. , 11. ],
        [12. , 13. , 14. , 15. ]],

       [[16. , 17. , 18. , 19. ],
        [20. , 21. , 22. , 23. ]],

       [[24. , 25. , 26. , 27. ],
        [28. , 29. , 30. , 31. ]],

       [[ 1. ,  2.5,  4. ,  5.5],
        [ 7. ,  8.5, 10. , 11.5]]])
Coordinates:
  * z        (z) int64 0 1 2 3 4
Dimensions without coordinates: y, x
<xarray.DataArray (z: 5, y: 2, x: 4)>
array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7]],

       [[ 8,  9, 10, 11],
        [12, 13, 14, 15]],

       [[16, 17, 18, 19],
        [20, 21, 22, 23]],

       [[24, 25, 26, 27],
        [28, 29, 30, 31]],

       [[32, 33, 34, 35],
        [36, 37, 38, 39]]])
Coordinates:
  * z        (z) int64 0 1 2 3 4
Dimensions without coordinates: y, x
[[[ 0  1  2  3]
  [ 4  5  6  7]]

 [[32 33 34 35]
  [36 37 38 39]]

 [[ 8  9 10 11]
  [12 13 14 15]]

 [[16 17 18 19]
  [20 21 22 23]]

 [[24 25 26 27]
  [28 29 30 31]]]
```

Note, I have to use pair.values instead of pair in the last sorting step. Otherwise, I will get this error:

IndexError: Unlabeled multi-dimensional array cannot be used for indexing: y

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sort DataArray by data values along one dim 596606599
611483929 https://github.com/pydata/xarray/issues/3957#issuecomment-611483929 https://api.github.com/repos/pydata/xarray/issues/3957 MDEyOklzc3VlQ29tbWVudDYxMTQ4MzkyOQ== zxdawn 30388627 2020-04-09T11:43:51Z 2020-04-09T11:43:51Z NONE

I need to use df.index = pd.MultiIndex.from_arrays(.....). See https://github.com/pandas-dev/pandas/issues/33420
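
The arguments to pd.MultiIndex.from_arrays are elided in the comment above, so the following is only a hypothetical sketch of how such a relabelling could work, using a toy frame with made-up x and z levels instead of the issue's dataset. It shows that sort_values moves rows while each row keeps its original labels, and that assigning a rebuilt index gives the sorted rows new z positions.

```python
import pandas as pd

# Hypothetical toy frame: two x cells, three z levels each.
index = pd.MultiIndex.from_product([[0, 1], [0, 1, 2]], names=["x", "z"])
df = pd.DataFrame({"cld": [3.0, 1.0, 2.0, 9.0, 7.0, 8.0]}, index=index)

# Sorting moves rows, but every row keeps its original (x, z) label,
# so sorting by index afterwards recovers the original frame.
new_df = df.sort_values(by=["x", "cld"])
labels_unchanged = new_df.sort_index().equals(df)

# Relabel z by position within each x group, then rebuild the index.
new_z = new_df.groupby(level="x").cumcount()
new_df.index = pd.MultiIndex.from_arrays(
    [new_df.index.get_level_values("x"), new_z.to_numpy()],
    names=["x", "z"],
)
relabelled = new_df.sort_index()
```

Because a conversion back to xarray places values by index label, the relabelled frame would be the one that carries the sorted order.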

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sort DataArray by data values along one dim 596606599
611348892 https://github.com/pydata/xarray/issues/3957#issuecomment-611348892 https://api.github.com/repos/pydata/xarray/issues/3957 MDEyOklzc3VlQ29tbWVudDYxMTM0ODg5Mg== zxdawn 30388627 2020-04-09T06:13:07Z 2020-04-09T06:13:07Z NONE

@JavierRuano When the dataframe is converted back to a dataset, the values aren't changed because the MultiIndex in the dataframe is unchanged ... I have tried this:

    df = ds.to_dataframe()
    new_df = df.sort_values(by=['x', 'y', 'cld'])
    new_df.index.set_levels(list(np.arange(ds['cld'].sizes['z'])), level='z', inplace=True)

But it doesn't work. Still trying ...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sort DataArray by data values along one dim 596606599
611302006 https://github.com/pydata/xarray/issues/3957#issuecomment-611302006 https://api.github.com/repos/pydata/xarray/issues/3957 MDEyOklzc3VlQ29tbWVudDYxMTMwMjAwNg== JavierRuano 34353851 2020-04-09T03:04:53Z 2020-04-09T03:04:53Z NONE

Yes, but with a large amount of data, dask is the only option, and it works well with the index. https://github.com/dask/dask/issues/958


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sort DataArray by data values along one dim 596606599
611299453 https://github.com/pydata/xarray/issues/3957#issuecomment-611299453 https://api.github.com/repos/pydata/xarray/issues/3957 MDEyOklzc3VlQ29tbWVudDYxMTI5OTQ1Mw== zxdawn 30388627 2020-04-09T02:54:30Z 2020-04-09T02:54:30Z NONE

@JavierRuano Nice suggestion! I combined them into a dataset, converted it to a dataframe, and then used sort_values. Finally, I converted the dataframe back to a dataset:
```
ds = cld.to_dataset(name='cld')
ds['pair'] = pair

df = ds.to_dataframe()
new_ds = df.sort_values(by='cld').to_xarray().transpose()
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sort DataArray by data values along one dim 596606599
611295039 https://github.com/pydata/xarray/issues/3957#issuecomment-611295039 https://api.github.com/repos/pydata/xarray/issues/3957 MDEyOklzc3VlQ29tbWVudDYxMTI5NTAzOQ== JavierRuano 34353851 2020-04-09T02:36:56Z 2020-04-09T02:36:56Z NONE

You can access the data directly as an ndarray, or you can convert the DataArray into a pandas DataFrame. Pandas has sort_values. You were looking to sort values along z, and the resulting order is shown in the z index.

With more than one DataArray, you could read about the Dataset concept...

But I don't develop xarray; I am only a user of that module, so perhaps you are looking for a different kind of answer.

Dataset.sortby sorts according to the values of 1-D DataArrays that share a dimension with the calling object: http://xarray.pydata.org/en/stable/generated/xarray.Dataset.sortby.html


{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sort DataArray by data values along one dim 596606599
611291129 https://github.com/pydata/xarray/issues/3957#issuecomment-611291129 https://api.github.com/repos/pydata/xarray/issues/3957 MDEyOklzc3VlQ29tbWVudDYxMTI5MTEyOQ== zxdawn 30388627 2020-04-09T02:22:33Z 2020-04-09T02:22:33Z NONE

@JavierRuano Thank you very much. This example is a special case. If the order of z is different for each x and y, do we need to create a tmp DataArray to save the result of looping x and y?
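
As a hedged illustration of the question above, the sketch below compares an explicit loop over y and x that fills a temporary array with a loop-free alternative. It uses plain NumPy arrays with random, hypothetical values in place of the cld and pair DataArrays, so that the sort order really does differ from one (y, x) column to the next.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical arrays with dims (z, y, x); values are arbitrary.
cld = rng.random((5, 2, 4))
pair = np.arange(cld.size).reshape(cld.shape)

# Explicit loop over y and x, writing into a temporary array.
looped = np.empty_like(pair)
for j in range(cld.shape[1]):
    for i in range(cld.shape[2]):
        order = np.argsort(cld[:, j, i])
        looped[:, j, i] = pair[order, j, i]

# Same result without the loop: one sort order per (y, x) column.
order_all = np.argsort(cld, axis=0)
vectorized = np.take_along_axis(pair, order_all, axis=0)
cld_sorted = np.take_along_axis(cld, order_all, axis=0)
```

The two results agree, so under these assumptions a temporary array filled by looping is not strictly needed.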

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sort DataArray by data values along one dim 596606599
611047964 https://github.com/pydata/xarray/issues/3957#issuecomment-611047964 https://api.github.com/repos/pydata/xarray/issues/3957 MDEyOklzc3VlQ29tbWVudDYxMTA0Nzk2NA== JavierRuano 34353851 2020-04-08T16:08:00Z 2020-04-08T16:08:00Z NONE

    cld.reindex(z=cld[:,0,0].sortby(cld[:,0,0]).z)

Based on this solution [0] [1], the result is:

    <xarray.DataArray (z: 5, y: 2, x: 4)>
    array([[[ 0. ,  1. ,  2. ,  3. ],
            [ 4. ,  5. ,  6. ,  7. ]],

           [[ 1. ,  2.5,  4. ,  5.5],
            [ 7. ,  8.5, 10. , 11.5]],

           [[ 8. ,  9. , 10. , 11. ],
            [12. , 13. , 14. , 15. ]],

           [[16. , 17. , 18. , 19. ],
            [20. , 21. , 22. , 23. ]],

           [[24. , 25. , 26. , 27. ],
            [28. , 29. , 30. , 31. ]]])
    Coordinates:
      * z        (z) int64 0 4 1 2 3
    Dimensions without coordinates: y, x

[0] https://stackoverflow.com/questions/41077393/how-to-sort-the-index-of-a-xarray-dataset-dataarray

[1] https://github.com/pydata/xarray/issues/967
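
The reindex call above takes its order from the single column at y=0, x=0 and applies it to every z-slice. A rough NumPy analogue, rebuilding the cld values as a plain ndarray (the names z_coord, cld_sorted and z_sorted are introduced here for illustration only), might look like this:

```python
import numpy as np

# Rebuild the cld values from the issue as a plain ndarray (z, y, x).
x, y, z = 4, 2, 4
cld_1 = np.arange(x * y * z, dtype=float).reshape(z, y, x)
cld_2 = (np.arange(x * y).reshape(y, x) * 1.5 + 1)[np.newaxis]
cld = np.concatenate([cld_1, cld_2], axis=0)
z_coord = np.arange(z + 1)

# One order for the whole array, taken from the column at y=0, x=0.
order = np.argsort(cld[:, 0, 0])
cld_sorted = cld[order]
z_sorted = z_coord[order]
```

This moves whole slices together, so it sorts every column correctly only when all (y, x) columns share the same order, as they happen to in this example.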

On Wed, Apr 8, 2020 at 14:06, Xin Zhang (notifications@github.com) wrote:

.sortby() only supports sorting a DataArray by coordinate values. I'm trying to sort one DataArray (cld) by data values along one dim and sort another DataArray (pair) in the same order.

MCVE Code Sample

    import xarray as xr
    import numpy as np

    x = 4
    y = 2
    z = 4
    data = np.arange(x*y*z).reshape(z, y, x)

    # 3d array with coords
    cld_1 = xr.DataArray(data, dims=['z', 'y', 'x'], coords={'z': np.arange(z)})

    # 2d array without coords
    cld_2 = xr.DataArray(np.arange(x*y).reshape(y, x)*1.5+1, dims=['y', 'x'])

    # expand 2d to 3d
    cld_2 = cld_2.expand_dims(z=[4])

    # concat
    cld = xr.concat([cld_1, cld_2], dim='z')

    # paired array
    pair = cld.copy(data=np.arange(x*y*(z+1)).reshape(z+1, y, x))
    print(cld)
    print(pair)

Output

    <xarray.DataArray (z: 5, y: 2, x: 4)>
    array([[[ 0. ,  1. ,  2. ,  3. ],
            [ 4. ,  5. ,  6. ,  7. ]],

           [[ 8. ,  9. , 10. , 11. ],
            [12. , 13. , 14. , 15. ]],

           [[16. , 17. , 18. , 19. ],
            [20. , 21. , 22. , 23. ]],

           [[24. , 25. , 26. , 27. ],
            [28. , 29. , 30. , 31. ]],

           [[ 1. ,  2.5,  4. ,  5.5],
            [ 7. ,  8.5, 10. , 11.5]]])
    Coordinates:
      * z        (z) int64 0 1 2 3 4
    Dimensions without coordinates: y, x

    <xarray.DataArray (z: 5, y: 2, x: 4)>
    array([[[ 0,  1,  2,  3],
            [ 4,  5,  6,  7]],

           [[ 8,  9, 10, 11],
            [12, 13, 14, 15]],

           [[16, 17, 18, 19],
            [20, 21, 22, 23]],

           [[24, 25, 26, 27],
            [28, 29, 30, 31]],

           [[32, 33, 34, 35],
            [36, 37, 38, 39]]])
    Coordinates:
      * z        (z) int64 0 1 2 3 4
    Dimensions without coordinates: y, x

Problem Description

I've tried argsort(): cld.argsort(axis=0), but the result is wrong:

    <xarray.DataArray (z: 5, y: 2, x: 4)>
    array([[[0, 0, 0, 0],
            [0, 0, 0, 0]],

           [[4, 4, 4, 4],
            [4, 4, 4, 4]],

           [[1, 1, 1, 1],
            [1, 1, 1, 1]],

           [[2, 2, 2, 2],
            [2, 2, 2, 2]],

           [[3, 3, 3, 3],
            [3, 3, 3, 3]]], dtype=int64)
    Coordinates:
      * z        (z) int64 0 1 2 3 4
    Dimensions without coordinates: y, x
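
It may help to read the array above as sort positions rather than sorted values: argsort returns, for each output slot, the index of the element that belongs there. A small NumPy sketch using one (y, x) column of cld along z (the y=0, x=0 column from the output above) illustrates this:

```python
import numpy as np

# One (y, x) column of cld along z, taken from the output above.
column = np.array([0.0, 8.0, 16.0, 24.0, 1.0])

# argsort returns positions, not values: position 4 (value 1.0)
# comes second, which is why the second z-slice above is all 4s.
order = np.argsort(column)

# Indexing with those positions yields the sorted values.
sorted_column = column[order]
```

Applying the positions as an index, as in the last line, is the extra step that turns the argsort result into sorted data.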


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Sort DataArray by data values along one dim 596606599


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 14.002ms · About: xarray-datasette