issue_comments: 992699334

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/2511#issuecomment-992699334 https://api.github.com/repos/pydata/xarray/issues/2511 992699334 IC_kwDOAMm_X847K2PG 2448579 2021-12-13T17:21:20Z 2021-12-13T17:21:20Z MEMBER

IIUC this cannot work lazily in most cases if you have dimension coordinate variables. When xarray constructs the output after indexing, it will try to index those coordinate variables so that it can associate the right coordinate value (a timestamp, for example) with each element of the output.
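To illustrate the point about dimension coordinates, here is a minimal sketch with an in-memory (NumPy) indexer. The names `arr` and `idx` and the dates are made up for the illustration. The indexed `time` coordinate comes back as an in-memory pandas index, which is why xarray needs concrete indexer values for it.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical example data: a 1-D array with a "time" dimension coordinate.
arr = xr.DataArray(
    np.arange(4.0),
    dims="time",
    coords={"time": pd.date_range("2021-01-01", periods=4)},
)

# An in-memory integer indexer along the same dimension.
idx = xr.DataArray([3, 0], dims="time")

out = arr[dict(time=idx)]

# The dimension coordinate is indexed together with the data, so the
# matching timestamps end up attached to the output.
print(out["time"].values)
```

With a dask indexer, the same step would have to compute the indexer values before the output's `time` index could be built.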

The example from @ulijh should work though (it has no dimension coordinates or indexed variables).

```python
import xarray as xr
import dask.array as da
import numpy as np

da = xr.DataArray(np.random.rand(3*4*5).reshape((3,4,5))).chunk(dict(dim_0=1))
idcs = da.argmax('dim_2')
da[dict(dim_2=idcs)]
```

The example by @rafa-guedes (thanks for that one!) could be made to work, I think.

```python
import numpy as np
import dask.array as da
import xarray as xr

darr = xr.DataArray(data=[0.2, 0.4, 0.6], coords={"z": range(3)}, dims=("z",))
good_indexer = xr.DataArray(
    data=np.random.randint(0, 3, 8).reshape(4, 2).astype(int),
    coords={"y": range(4), "x": range(2)},
    dims=("y", "x")
)
bad_indexer = xr.DataArray(
    data=da.random.randint(0, 3, 8).reshape(4, 2).astype(int),
    coords={"y": range(4), "x": range(2)},
    dims=("y", "x")
)

In [5]: darr
Out[5]:
<xarray.DataArray (z: 3)>
array([0.2, 0.4, 0.6])
Coordinates:
  * z        (z) int64 0 1 2

In [6]: good_indexer
Out[6]:
<xarray.DataArray (y: 4, x: 2)>
array([[0, 1],
       [2, 2],
       [1, 2],
       [1, 0]])
Coordinates:
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1

In [7]: bad_indexer
Out[7]:
<xarray.DataArray 'reshape-417766b2035dcb1227ddde8505297039' (y: 4, x: 2)>
dask.array<reshape, shape=(4, 2), dtype=int64, chunksize=(4, 2), chunktype=numpy.ndarray>
Coordinates:
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1

In [8]: darr[good_indexer]
Out[8]:
<xarray.DataArray (y: 4, x: 2)>
array([[0.2, 0.4],
       [0.6, 0.6],
       [0.4, 0.6],
       [0.4, 0.2]])
Coordinates:
    z        (y, x) int64 0 1 2 2 1 2 1 0
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1
```

We can copy the dimension coordinates of the output (x, y) directly from the indexer. The dimension coordinate on the input (z) should become a dask array in the output. Since z is not a dimension coordinate in the output, this should be fine.
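As a rough sketch of what that output could look like, the result can be assembled by hand. This is only an illustration of the idea, not how xarray implements indexing. `lazy_take` and `lazy_indexer` are hypothetical names, and the indexer values are fixed to match `good_indexer` above so the result is reproducible.

```python
import dask.array as da
import numpy as np
import xarray as xr

darr = xr.DataArray(data=[0.2, 0.4, 0.6], coords={"z": range(3)}, dims=("z",))

# A dask-backed indexer with known values (same values as good_indexer above).
lazy_indexer = xr.DataArray(
    data=da.from_array(np.array([[0, 1], [2, 2], [1, 2], [1, 0]]), chunks=(2, 2)),
    coords={"y": range(4), "x": range(2)},
    dims=("y", "x"),
)


def lazy_take(values, indexer):
    """Hypothetical helper: index a small in-memory 1-D array with a dask
    integer array, one indexer chunk at a time, without computing anything."""
    return da.map_blocks(lambda idx: values[idx], indexer, dtype=values.dtype)


# Both the data and the old dimension coordinate z stay lazy.
data = lazy_take(darr.values, lazy_indexer.data)
z = lazy_take(darr["z"].values, lazy_indexer.data)

result = xr.DataArray(
    data,
    dims=lazy_indexer.dims,
    coords={
        # Output dimension coordinates are copied straight from the indexer.
        "y": lazy_indexer["y"],
        "x": lazy_indexer["x"],
        # z is no longer a dimension coordinate, so it can be a dask array.
        "z": (lazy_indexer.dims, z),
    },
)
```

Calling `result.compute()` should then give the same values as `darr[good_indexer]` in the session above, with `z` evaluated only at that point.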

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  374025325