issue_comments: 802101178

html_url: https://github.com/pydata/xarray/issues/5054#issuecomment-802101178
issue_url: https://api.github.com/repos/pydata/xarray/issues/5054
id: 802101178
node_id: MDEyOklzc3VlQ29tbWVudDgwMjEwMTE3OA==
user: 703554
created_at: 2021-03-18T16:45:51Z
updated_at: 2021-03-18T16:58:44Z
author_association: CONTRIBUTOR
body:

FWIW my use case actually only needs indexing along a single dimension, i.e., something equivalent to the numpy (or dask.array) `compress` function. This can be hacked for xarray datasets in a fairly straightforward way:

```python
import numpy as np
import xarray as xr


def _compress_dataarray(a, indexer, dim):
    """Compress the data of one variable along `dim`, if it has that dimension."""
    data = a.data
    try:
        axis = a.dims.index(dim)
    except ValueError:
        # variable does not have this dimension, so pass its data through
        v = data
    else:
        # rely on __array_function__ to handle dispatching to dask if
        # data is a dask array
        v = np.compress(indexer, data, axis=axis)
        if hasattr(v, 'compute_chunk_sizes'):
            # needed to know dim lengths
            v.compute_chunk_sizes()
    return v


def compress_dataset(ds, indexer, dim):
    # indexer may be given as the name of a variable in the dataset
    if isinstance(indexer, str):
        indexer = ds[indexer].data

    coords = dict()
    for k in ds.coords:
        a = ds[k]
        v = _compress_dataarray(a, indexer, dim)
        coords[k] = (a.dims, v)

    data_vars = dict()
    for k in ds.data_vars:
        a = ds[k]
        v = _compress_dataarray(a, indexer, dim)
        data_vars[k] = (a.dims, v)

    attrs = ds.attrs.copy()

    return xr.Dataset(data_vars=data_vars, coords=coords, attrs=attrs)
```
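For illustration, here is roughly how the helper above might be called on a small NumPy-backed dataset. This is only a sketch: the variable and dimension names (`temperature`, `is_valid`, `site_elevation`, `sample`, `site`) are made up, and `compress_dataset` is restated in condensed form only so that the block runs on its own.

```python
import numpy as np
import xarray as xr


def compress_dataset(ds, indexer, dim):
    """Condensed restatement of the helper above (same logic, fewer lines)."""
    if isinstance(indexer, str):
        indexer = ds[indexer].data

    def _compress(a):
        if dim not in a.dims:
            # variable lacks the dimension, pass its data through unchanged
            return a.data
        v = np.compress(indexer, a.data, axis=a.dims.index(dim))
        if hasattr(v, "compute_chunk_sizes"):
            # only relevant for dask arrays, whose chunk sizes become unknown
            v.compute_chunk_sizes()
        return v

    coords = {k: (ds[k].dims, _compress(ds[k])) for k in ds.coords}
    data_vars = {k: (ds[k].dims, _compress(ds[k])) for k in ds.data_vars}
    return xr.Dataset(data_vars=data_vars, coords=coords, attrs=ds.attrs.copy())


# Hypothetical toy dataset: 4 samples x 3 sites, plus a boolean mask variable.
ds = xr.Dataset(
    data_vars={
        "temperature": (("sample", "site"), np.arange(12).reshape(4, 3)),
        "is_valid": (("sample",), np.array([True, False, True, True])),
        "site_elevation": (("site",), np.array([10.0, 20.0, 30.0])),
    },
    coords={"sample": np.array([100, 101, 102, 103])},
    attrs={"title": "toy"},
)

# Pass the mask by variable name; it could equally be a boolean array.
result = compress_dataset(ds, "is_valid", dim="sample")

print(result.sizes["sample"])                 # 3
print(result["sample"].values.tolist())       # [100, 102, 103]
print(result["temperature"].values.tolist())  # [[0, 1, 2], [6, 7, 8], [9, 10, 11]]
```

Variables that do not have the `sample` dimension (here `site_elevation`) come through untouched. Note that only the dataset-level `attrs` are copied, so per-variable attributes are not carried over by this approach.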

Given the complexity of fancy indexing in general, I wonder if it's worth contemplating implementing a Dataset.compress() method as a first step.
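For reference, the single-axis semantics such a method would presumably mirror are those of `np.compress`: a 1-D boolean condition selects slices along exactly one axis. A minimal NumPy-only sketch (the array and mask values are made up):

```python
import numpy as np

a = np.arange(12).reshape(4, 3)
mask = np.array([True, False, True, True])

# Keep rows 0, 2 and 3: equivalent to boolean indexing along axis 0.
rows = np.compress(mask, a, axis=0)

# Keep columns 0 and 2: the same operation along axis 1.
cols = np.compress([True, False, True], a, axis=1)

print(rows.shape)  # (3, 3)
print(cols.shape)  # (4, 2)
```

Because only one axis is involved, there are none of the broadcasting and orthogonal-versus-vectorized questions that general fancy indexing raises.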

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app:
issue: 834972299