
issue_comments: 420445485


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/2410#issuecomment-420445485 https://api.github.com/repos/pydata/xarray/issues/2410 420445485 MDEyOklzc3VlQ29tbWVudDQyMDQ0NTQ4NQ== 1217238 2018-09-11T22:19:41Z 2018-09-11T22:19:41Z MEMBER

Copying @horta's doc:


Xarray definition

A data array `a` has d dimensions, ordered from 0 to d - 1. It contains an array of dimensionality d. The first dimension of that array is associated with the first dimension of the data array, and so forth. That array is returned by the data array attribute `values`. A named data array is a data array whose `name` attribute is a string.

Each data array dimension has a unique name of string type. The names can be accessed via the data array's `dims` attribute, which is a tuple. The name of dimension i is `a.dims[i]`.

A data array can have zero or more coordinates, represented by a dict-like `coords` attribute. A coordinate is a named data array, also referred to as a coordinate data array. Coordinate data arrays have unique names among themselves. A coordinate data array of name `x` can be retrieved by `a.coords[x]`.
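The attributes named so far can be inspected directly. The snippet below is a minimal sketch: the array contents and the name `"example"` are made up for illustration and are not part of the definition.

```python
import numpy as np
import xarray as xr

# A 2-D data array with one coordinate; "example" is a hypothetical name.
a = xr.DataArray(
    np.arange(6).reshape((3, 2)),
    dims=["x", "y"],
    coords={"x": list("abc")},
    name="example",
)

values = a.values              # the wrapped array, one axis per dimension
dims = a.dims                  # tuple of dimension names
coord_names = list(a.coords)   # names of the coordinate data arrays
x_coord = a.coords["x"]        # a coordinate is itself a named data array
```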

A coordinate can have zero or more dimensions associated with it. A dimension data array is a unidimensional coordinate data array associated with one, and only one, dimension having the same name as the coordinate data array itself. A dimension data array always has one, and only one, coordinate. That coordinate, in turn, has a dimension data array associated with it.

As an example, let `a.coords[name]` be a dimension data array of `a`, associated with dimension i. This means, among other things, that

* `a.shape[i] == a.coords[name].size`
* `a.coords[name].name == name`
* `len(a.coords[name].dims) == 1`
* `a.coords[name].dims[0] == name`

Now let's consider a practical example:

```python
>>> import numpy as np
>>> import xarray as xr
>>> a = xr.DataArray(np.arange(6).reshape((3, 2)), dims=["x", "y"], coords={"x": list("abc")})
>>> a
<xarray.DataArray (x: 3, y: 2)>
array([[0, 1],
       [2, 3],
       [4, 5]])
Coordinates:
  * x        (x) <U1 'a' 'b' 'c'
Dimensions without coordinates: y
```

Data array `a` has two dimensions: "x" and "y". It has a single coordinate "x", with its associated dimension data array `a.coords["x"]`. The dimension data array definition implies the following recursion:

```python
>>> a.coords["x"]
<xarray.DataArray 'x' (x: 3)>
array(['a', 'b', 'c'], dtype='<U1')
Coordinates:
  * x        (x) <U1 'a' 'b' 'c'
>>> a.coords["x"].coords["x"]
<xarray.DataArray 'x' (x: 3)>
array(['a', 'b', 'c'], dtype='<U1')
Coordinates:
  * x        (x) <U1 'a' 'b' 'c'
```
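The properties stated earlier for a dimension data array can be checked on this example. This is only a sketch: the `checks` dictionary and its keys are names introduced here for illustration.

```python
import numpy as np
import xarray as xr

a = xr.DataArray(
    np.arange(6).reshape((3, 2)),
    dims=["x", "y"],
    coords={"x": list("abc")},
)

name = "x"
i = a.dims.index(name)       # position of the dimension called "x"
dim_array = a.coords[name]   # the dimension data array for "x"

# Each entry mirrors one property of a dimension data array.
checks = {
    "size matches shape": a.shape[i] == dim_array.size,
    "name matches": dim_array.name == name,
    "unidimensional": len(dim_array.dims) == 1,
    "dimension has same name": dim_array.dims[0] == name,
    # The recursion: the coordinate of the coordinate equals itself.
    "recursive": dim_array.coords[name].equals(dim_array),
}
```

Note that "y" has no coordinate in this array, so `a.coords` only contains "x".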

Indexing

Indexing is an operation that, when applied to a data array, produces a new data array according to the rules explained here.

The most general form of indexing is the one that involves data arrays `i0`, `i1`, ... in the operation `a.sel(d0=i0, d1=i1, ...)`, for which `d0`, `d1`, ... are an exhaustive list of dimension names of `a`. `i0` selects the indices of dimension `d0` that match its own indices, and the same happens for every other dimension, independently. The matching works like an SQL JOIN:

  1. Form the Cartesian product between the indices of `i0` and the values of `d0`, after flattening the associated array.
  2. The resulting tuples that have the same elements are the resulting indices.
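The two steps above can be sketched in plain Python. This only illustrates the JOIN analogy and is not how xarray resolves labels internally; the helper `match_positions` is a hypothetical name introduced here.

```python
from itertools import product


def match_positions(indexer_labels, dim_labels):
    """Return the positions in dim_labels whose label equals an indexer label."""
    # Step 1: Cartesian product of (indexer label, dimension label) pairs,
    # keeping track of the position of each dimension label.
    pairs = product(indexer_labels, enumerate(dim_labels))
    # Step 2: keep the pairs whose two labels are equal.
    return [pos for wanted, (pos, label) in pairs if wanted == label]


positions = match_positions(["a", "c"], ["a", "b", "c"])
```

Here `positions` holds the integer positions along the dimension that a label-based selection of "a" and "c" would pick.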

The following code snippet shows an indexing example of a data array a with two indexers i0 and i1.

```python
>>> a = xr.DataArray(np.arange(6).reshape((3, 2)), dims=["x", "y"], coords={"x": list("abc"), "y": [0, 1]})
>>> i0 = xr.DataArray(['a', 'c'], dims=["x"])
>>> i0
<xarray.DataArray (x: 2)>
array(['a', 'c'], dtype='<U1')
Dimensions without coordinates: x
>>> i1 = xr.DataArray([0, 1], dims=["y"])
>>> i1
<xarray.DataArray (y: 2)>
array([0, 1])
Dimensions without coordinates: y
>>> a.sel(x=i0, y=i1)
<xarray.DataArray (x: 2, y: 2)>
array([[0, 1],
       [4, 5]])
Coordinates:
  * x        (x) <U1 'a' 'c'
  * y        (y) int64 0 1
```
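Because each dimension is matched independently in this example, selecting along both dimensions at once should give the same result as selecting along one dimension after the other. A minimal sketch of that check, where `joint` and `stepwise` are names introduced here:

```python
import numpy as np
import xarray as xr

a = xr.DataArray(
    np.arange(6).reshape((3, 2)),
    dims=["x", "y"],
    coords={"x": list("abc"), "y": [0, 1]},
)
i0 = xr.DataArray(["a", "c"], dims=["x"])
i1 = xr.DataArray([0, 1], dims=["y"])

joint = a.sel(x=i0, y=i1)            # both dimensions in one call
stepwise = a.sel(x=i0).sel(y=i1)     # one dimension at a time
same = joint.equals(stepwise)
```

This equivalence is only claimed for indexers like these, where each indexer is one-dimensional along the dimension it selects from.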

As hinted above, xarray allows the use of multidimensional data array indexers for greater flexibility. Let's look at another example:

```python
>>> a = xr.DataArray([0, 1, 2], dims=["x"], coords={"x": list("abc")})
>>> a
<xarray.DataArray (x: 3)>
array([0, 1, 2])
Coordinates:
  * x        (x) <U1 'a' 'b' 'c'
>>> i0 = xr.DataArray([['a', 'c']], dims=["x0", "x1"])
>>> i0
<xarray.DataArray (x0: 1, x1: 2)>
array([['a', 'c']], dtype='<U1')
Dimensions without coordinates: x0, x1
>>> a.sel(x=i0)
<xarray.DataArray (x0: 1, x1: 2)>
array([[0, 2]])
Coordinates:
    x        (x0, x1) object 'a' 'c'
Dimensions without coordinates: x0, x1
```

The resulting data array has the same dimensions as the indexer, not those of the original data array. Notice also that, unlike in the previous example, the resulting data array has no dimension data array. Instead, it has a two-dimensional coordinate data array:

```python
>>> a.sel(x=i0).coords["x"]
<xarray.DataArray 'x' (x0: 1, x1: 2)>
array([['a', 'c']], dtype=object)
Coordinates:
    x        (x0, x1) object 'a' 'c'
Dimensions without coordinates: x0, x1
```
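Both observations can also be verified programmatically. The snippet is a sketch that rebuilds the same arrays; `result` is a name introduced here.

```python
import xarray as xr

a = xr.DataArray([0, 1, 2], dims=["x"], coords={"x": list("abc")})
i0 = xr.DataArray([["a", "c"]], dims=["x0", "x1"])

result = a.sel(x=i0)

# The result takes its dimensions from the indexer, not from `a`.
takes_indexer_dims = result.dims == i0.dims
# "x" survives as a coordinate, but it is no longer a dimension of the result.
x_is_coord_only = "x" in result.coords and "x" not in result.dims
# The coordinate itself spans the two indexer dimensions.
x_coord_dims = result.coords["x"].dims
```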

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  359240638