
issues: 1875857414


id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
1875857414 I_kwDOAMm_X85vz1AG 8129 Sort the values of an nD array 39069044 open 0     11 2023-08-31T16:20:40Z 2023-09-01T15:37:34Z   CONTRIBUTOR      

Is your feature request related to a problem?

As far as I know, there is no straightforward API in xarray to do what np.sort or pandas.sort_values does. We have DataArray.sortby("x"), which sorts the array according to the coordinate itself. But if you instead want to sort the values of the array so that they are monotonic, you're on your own. There are probably a lot of ways we could do this, but I ended up with the two-line solution below after a little trial and error.

Describe the solution you'd like

Would there be interest in implementing a Dataset/DataArray.sort_values(dim="x") method?

Note: this 1D example is not really relevant, see the 2D version and more obvious implementation in comments below for what I really want.

```python
def sort_values(self, dim: str):
    sort_idx = self.argsort(axis=self.get_axis_num(dim)).drop_vars(dim)
    return self.isel({dim: sort_idx}).drop_vars(dim).assign_coords({dim: self[dim]})
```

The goal is to handle arrays whose values we want to make monotonic along a dimension, like so:

```python
da = xr.DataArray([1, 3, 2, 4], coords={"x": [1, 2, 3, 4]})
da.sort_values("x")

<xarray.DataArray (x: 4)>
array([1, 2, 3, 4])
Coordinates:
  * x        (x) int64 1 2 3 4
```

This would complement sortby, which handles an array that is merely unordered with respect to its coordinate:

```python
da = xr.DataArray([1, 3, 2, 4], coords={"x": [1, 3, 2, 4]})
da.sortby("x")

<xarray.DataArray (x: 4)>
array([1, 2, 3, 4])
Coordinates:
  * x        (x) int64 1 2 3 4
```
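For anyone who wants to try the two cases side by side, here is a minimal runnable sketch. It wraps the argsort approach from above in a free function (the standalone `sort_values(da, dim)` form and the variable names are just for illustration, not an existing xarray API) and passes `dims` explicitly so it does not depend on dimension inference from `coords`.

```python
import xarray as xr


def sort_values(da: xr.DataArray, dim: str) -> xr.DataArray:
    """Sort the values along ``dim`` while keeping the coordinate as it was."""
    # Positions that would sort the values; drop the coordinate so it is not
    # carried along by the indexer.
    sort_idx = da.argsort(axis=da.get_axis_num(dim)).drop_vars(dim)
    # Reorder the values, then put the original coordinate back.
    return da.isel({dim: sort_idx}).drop_vars(dim).assign_coords({dim: da[dim]})


# Case 1: the coordinate is ordered, the values are not.
unsorted_values = xr.DataArray([1, 3, 2, 4], dims="x", coords={"x": [1, 2, 3, 4]})
by_value = sort_values(unsorted_values, "x")

# sortby changes nothing here, because the coordinate is already sorted.
by_coord_noop = unsorted_values.sortby("x")

# Case 2: the coordinate itself is unordered, which sortby handles.
unsorted_coord = xr.DataArray([1, 3, 2, 4], dims="x", coords={"x": [1, 3, 2, 4]})
by_coord = unsorted_coord.sortby("x")
```

In case 1 only the values move and `x` stays put; in case 2 values and coordinate are reordered together.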

Describe alternatives you've considered

I don't know if argsort is dask-enabled (the docs just point to the numpy function). Is there a more intelligent way to implement this with apply_ufunc and something else? I assume chunking in the sort dimension would be problematic.
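One possible shape for the apply_ufunc route, as a rough sketch rather than a proposal for the final implementation: wrap np.sort and declare the sort dimension as a core dimension. The function name `sort_values_ufunc` and the 2D example data are made up for illustration. This does not settle whether argsort itself is dask-enabled.

```python
import numpy as np
import xarray as xr


def sort_values_ufunc(da: xr.DataArray, dim: str) -> xr.DataArray:
    """Sort the values of ``da`` along ``dim``, leaving coordinates in place."""
    sorted_da = xr.apply_ufunc(
        np.sort,
        da,
        input_core_dims=[[dim]],   # core dims are moved to the last axis
        output_core_dims=[[dim]],
        kwargs={"axis": -1},       # so sorting the last axis sorts along dim
        dask="parallelized",       # only takes effect for dask-backed input
        output_dtypes=[da.dtype],
    )
    # apply_ufunc leaves the core dimension last; restore the original order.
    return sorted_da.transpose(*da.dims)


da2d = xr.DataArray(
    [[3, 1, 2], [9, 7, 8]],
    dims=("y", "x"),
    coords={"y": [10, 20], "x": [1, 2, 3]},
)
sorted_along_x = sort_values_ufunc(da2d, "x")
sorted_along_y = sort_values_ufunc(da2d, "y")
```

With `dask="parallelized"`, a core dimension has to sit in a single chunk, so a dask-backed array would need something like `da.chunk({dim: -1})` before the call. That matches the concern about chunking in the sort dimension: each block is sorted independently, and the other dimensions can stay chunked.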

Additional context

Some past related threads on this topic:
https://github.com/pydata/xarray/issues/3957
https://stackoverflow.com/questions/64518239/sorting-dataset-along-axis-with-dask

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8129/reactions",
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
    13221727 issue
