issue_comments

4 rows where author_association = "MEMBER" and issue = 374025325 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
992699334 https://github.com/pydata/xarray/issues/2511#issuecomment-992699334 https://api.github.com/repos/pydata/xarray/issues/2511 IC_kwDOAMm_X847K2PG dcherian 2448579 2021-12-13T17:21:20Z 2021-12-13T17:21:20Z MEMBER

IIUC this cannot work lazily in most cases if you have dimension coordinate variables. When xarray constructs the output after indexing, it will try to index those coordinate variables so that it can associate the right timestamp (for example) with the output.

The example from @ulijh should work though (it has no dimension coordinate or indexed variables)

```python
import xarray as xr
import dask.array as da
import numpy as np

da = xr.DataArray(np.random.rand(3*4*5).reshape((3,4,5))).chunk(dict(dim_0=1))
idcs = da.argmax('dim_2')
da[dict(dim_2=idcs)]
```

The example by @rafa-guedes (thanks for that one!) could be made to work I think.

```python
import numpy as np
import dask.array as da
import xarray as xr

darr = xr.DataArray(data=[0.2, 0.4, 0.6], coords={"z": range(3)}, dims=("z",))
good_indexer = xr.DataArray(
    data=np.random.randint(0, 3, 8).reshape(4, 2).astype(int),
    coords={"y": range(4), "x": range(2)},
    dims=("y", "x")
)
bad_indexer = xr.DataArray(
    data=da.random.randint(0, 3, 8).reshape(4, 2).astype(int),
    coords={"y": range(4), "x": range(2)},
    dims=("y", "x")
)

In [5]: darr
Out[5]:
<xarray.DataArray (z: 3)>
array([0.2, 0.4, 0.6])
Coordinates:
  * z        (z) int64 0 1 2

In [6]: good_indexer
Out[6]:
<xarray.DataArray (y: 4, x: 2)>
array([[0, 1],
       [2, 2],
       [1, 2],
       [1, 0]])
Coordinates:
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1

In [7]: bad_indexer
Out[7]:
<xarray.DataArray 'reshape-417766b2035dcb1227ddde8505297039' (y: 4, x: 2)>
dask.array<reshape, shape=(4, 2), dtype=int64, chunksize=(4, 2), chunktype=numpy.ndarray>
Coordinates:
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1

In [8]: darr[good_indexer]
Out[8]:
<xarray.DataArray (y: 4, x: 2)>
array([[0.2, 0.4],
       [0.6, 0.6],
       [0.4, 0.6],
       [0.4, 0.2]])
Coordinates:
    z        (y, x) int64 0 1 2 2 1 2 1 0
  * y        (y) int64 0 1 2 3
  * x        (x) int64 0 1
```

We can copy the dimension coordinates of the output (x, y) directly from the indexer. The dimension coordinate on the input (z) should be a dask array in the output; since z is not a dimension coordinate in the output, this should be fine.
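
To make the expected result concrete, here is a minimal NumPy-only sketch of what the pointwise lookup computes, using the indexer values printed in `Out[6]`. It only illustrates the semantics and is not xarray's implementation; the names `data`, `indexer`, `z_out`, `y_out` and `x_out` are illustrative.

```python
import numpy as np

# Values of `darr` and its dimension coordinate `z` from the example.
data = np.array([0.2, 0.4, 0.6])
z = np.arange(3)

# Indexer values as printed in Out[6]; its dims are (y, x).
indexer = np.array([[0, 1], [2, 2], [1, 2], [1, 0]])

# Pointwise (vectorized) lookup: the result takes the indexer's shape.
result = data[indexer]

# The input's dimension coordinate z is indexed the same way, so in the
# output it becomes a 2-D non-dimension coordinate on (y, x).
z_out = z[indexer]

# The output's dimension coordinates can be taken directly from the indexer.
y_out, x_out = np.arange(4), np.arange(2)

print(result)
print(z_out)
```

With a dask-backed indexer, `z_out` is the piece that would have to stay lazy, while `y_out` and `x_out` need no computation at all.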

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
568107398 https://github.com/pydata/xarray/issues/2511#issuecomment-568107398 https://api.github.com/repos/pydata/xarray/issues/2511 MDEyOklzc3VlQ29tbWVudDU2ODEwNzM5OA== dcherian 2448579 2019-12-20T22:14:34Z 2019-12-20T22:14:34Z MEMBER

I don't think anyone is working on it. We would appreciate it if you could try to fix it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
523149751 https://github.com/pydata/xarray/issues/2511#issuecomment-523149751 https://api.github.com/repos/pydata/xarray/issues/2511 MDEyOklzc3VlQ29tbWVudDUyMzE0OTc1MQ== shoyer 1217238 2019-08-20T18:56:18Z 2019-08-20T18:56:18Z MEMBER

Yes, something seems to be going wrong here...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325
433128556 https://github.com/pydata/xarray/issues/2511#issuecomment-433128556 https://api.github.com/repos/pydata/xarray/issues/2511 MDEyOklzc3VlQ29tbWVudDQzMzEyODU1Ng== shoyer 1217238 2018-10-25T16:59:28Z 2018-10-25T16:59:28Z MEMBER

For reference, here's the current stacktrace/error message:

```python-traceback
TypeError                                 Traceback (most recent call last)
<ipython-input-7-74fe4ba70f9d> in <module>()
----> 1 da[{'dim_1' : indc}]

/usr/local/lib/python3.6/dist-packages/xarray/core/dataarray.py in __getitem__(self, key)
    472         else:
    473             # xarray-style array indexing
--> 474             return self.isel(indexers=self._item_key_to_dict(key))
    475
    476     def __setitem__(self, key, value):

/usr/local/lib/python3.6/dist-packages/xarray/core/dataarray.py in isel(self, indexers, drop, **indexers_kwargs)
    817         """
    818         indexers = either_dict_or_kwargs(indexers, indexers_kwargs, 'isel')
--> 819         ds = self._to_temp_dataset().isel(drop=drop, indexers=indexers)
    820         return self._from_temp_dataset(ds)
    821

/usr/local/lib/python3.6/dist-packages/xarray/core/dataset.py in isel(self, indexers, drop, **indexers_kwargs)
   1537         for name, var in iteritems(self._variables):
   1538             var_indexers = {k: v for k, v in indexers_list if k in var.dims}
-> 1539             new_var = var.isel(indexers=var_indexers)
   1540             if not (drop and name in var_indexers):
   1541                 variables[name] = new_var

/usr/local/lib/python3.6/dist-packages/xarray/core/variable.py in isel(self, indexers, drop, **indexers_kwargs)
    905             if dim in indexers:
    906                 key[i] = indexers[dim]
--> 907         return self[tuple(key)]
    908
    909     def squeeze(self, dim=None):

/usr/local/lib/python3.6/dist-packages/xarray/core/variable.py in __getitem__(self, key)
    614         array x.values directly.
    615         """
--> 616         dims, indexer, new_order = self._broadcast_indexes(key)
    617         data = as_indexable(self._data)[indexer]
    618         if new_order:

/usr/local/lib/python3.6/dist-packages/xarray/core/variable.py in _broadcast_indexes(self, key)
    487             return self._broadcast_indexes_outer(key)
    488
--> 489         return self._broadcast_indexes_vectorized(key)
    490
    491     def _broadcast_indexes_basic(self, key):

/usr/local/lib/python3.6/dist-packages/xarray/core/variable.py in _broadcast_indexes_vectorized(self, key)
    599             new_order = None
    600
--> 601         return out_dims, VectorizedIndexer(tuple(out_key)), new_order
    602
    603     def __getitem__(self, key):

/usr/local/lib/python3.6/dist-packages/xarray/core/indexing.py in __init__(self, key)
    423                 else:
    424                     raise TypeError('unexpected indexer type for {}: {!r}'
--> 425                                     .format(type(self).__name__, k))
    426             new_key.append(k)
    427

TypeError: unexpected indexer type for VectorizedIndexer: dask.array<xarray-<this-array>, shape=(10,), dtype=int64, chunksize=(2,)>
```

It looks like we could support this relatively easily since dask.array supports indexing with dask arrays now. This would be a welcome enhancement!
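
As a rough illustration of the dask capability mentioned here, the sketch below indexes a one-dimensional dask array with a one-dimensional integer dask array. It assumes a dask version that supports this form of indexing; the array contents and the names `values` and `lazy_idx` are made up for the example, and it shows only the dask side, not how xarray would wire it in.

```python
import numpy as np
import dask.array as da

# A small chunked array to index into (illustrative values).
values = da.from_array(np.array([0.2, 0.4, 0.6]), chunks=2)

# A 1-D integer indexer that is itself a dask array.
lazy_idx = da.from_array(np.array([0, 1, 2, 2, 1, 2, 1, 0]), chunks=4)

# Indexing stays lazy: `picked` is a dask array, not a NumPy array.
picked = values[lazy_idx]

# Nothing is evaluated until compute() is called.
result = picked.compute()
print(result)
```

The indexer here is kept one-dimensional, which is the case this sketch relies on; multi-dimensional indexers like the `(y, x)` example in this thread are where the extra work would lie.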

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Array indexing with dask arrays 374025325

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);