issue_comments: 707321343

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/4285#issuecomment-707321343 https://api.github.com/repos/pydata/xarray/issues/4285 707321343 MDEyOklzc3VlQ29tbWVudDcwNzMyMTM0Mw== 1852447 2020-10-12T20:08:32Z 2020-10-12T20:08:32Z NONE

Copied from https://gitter.im/pangeo-data/Lobby :

I've been using Xarray with argopy recently, and the immediate value I see is the documentation of columns, which is semi-lacking in Awkward (one user has been passing this information through an Awkward tree; see scikit-hep/awkward-1.0#422). I should also look into Xarray's indexing, which I've always seen as being the primary difference between NumPy and Pandas. Awkward Array has no indexing, though every node has an optional Identities, which would be used to track such information through Awkward manipulations: Identities would have a bijection with externally supplied indexes. They haven't been used for anything yet.
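To make the bijection idea concrete, here is a minimal pure-Python sketch. It does not use Awkward Array's actual Identities machinery (which, as noted, hasn't been used for anything yet); the names `labels`, `label_to_id`, and `id_to_label` are hypothetical and only illustrate how an integer identity attached to each entry could be mapped back to an externally supplied index after a manipulation such as filtering.

```python
# Hypothetical illustration only: none of these names come from Awkward Array.
labels = ["a", "b", "c", "d"]          # externally supplied index
values = [[1, 2], [], [3], [4, 5, 6]]  # ragged data, one entry per label

# Bijection between external labels and integer identities (row numbers here).
label_to_id = {label: i for i, label in enumerate(labels)}
id_to_label = {i: label for label, i in label_to_id.items()}

# Attach an identity to each entry, then filter out the empty lists.
# The identities travel with the data through the manipulation.
tagged = list(zip(range(len(values)), values))
kept = [(ident, entry) for ident, entry in tagged if len(entry) > 0]

# Recover the external labels of the surviving entries from their identities.
kept_labels = [id_to_label[ident] for ident, _ in kept]
```

After the filter, the surviving entries can still be matched to their original labels even though their positions have changed.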

Although the elevator pitch for Xarray is "n-dimensional Pandas," it's rather different, isn't it? The contextual metadata is more extensive than anything I've seen in Pandas, and Xarray can be partitioned for out-of-core analysis: Xarray wraps Dask, unlike Dask's array collection, which wraps NumPy. I had trouble getting Pandas to wrap Awkward Array (scikit-hep/awkward-1.0#350), but maybe these won't be issues for Xarray.

One last thing (in this very rambly message): the main difficulty I think we would have is that Awkward Arrays don't have shape and dtype, since those define a rectilinear array of numbers. The data model is Datashape plus union types. There is a sense in which ndim is defined: the number of nested lists before reaching the first record, which may split it into different depths for each field. But even this can be ill-defined with union types:

```python
>>> import awkward1 as ak
>>> array = ak.Array([1, 2, [3, 4, 5], [[6, 7, 8]]])
>>> array
<Array [1, 2, [3, 4, 5], [[6, 7, 8]]] type='4 * union[int64, var * union[int64, ...'>
>>> array.type
4 * union[int64, var * union[int64, var * int64]]
>>> array.ndim
-1
```
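The reason a single depth can't be reported here can be sketched in plain Python. The function below is a hypothetical stand-in, not Awkward Array's implementation: it counts list nesting and returns -1 as soon as sibling entries disagree about their depth, which is the situation a union type creates.

```python
def nesting_depth(value):
    """Return the list-nesting depth of `value`, or -1 if it is inconsistent.

    Hypothetical helper for illustration; it mimics only the idea behind
    the -1 result, not how Awkward Array actually computes ndim.
    """
    if not isinstance(value, list):
        return 0  # a plain number has no list nesting
    child_depths = {nesting_depth(item) for item in value}
    if -1 in child_depths or len(child_depths) > 1:
        return -1  # children disagree, so no single depth exists
    if not child_depths:
        return 1  # an empty list: one level of nesting, nothing inside
    return 1 + child_depths.pop()


consistent = nesting_depth([[1, 2], [3]])
mixed = nesting_depth([1, 2, [3, 4, 5], [[6, 7, 8]]])
```

For the regular ragged list the depth is well defined, while the mixed list from the example above has entries at depths 0, 1, and 2, so no single answer exists.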

So if we wanted to have an Xarray of Awkward Arrays, we'd have to take stock of all the assumptions Xarray makes about the arrays it contains.
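As a starting point for that stock-taking, one could probe a candidate container for the attributes mentioned above. This is only a sketch: the attribute list (`shape`, `dtype`, `ndim`) is an assumption taken from this discussion, not Xarray's actual set of checks, and both stand-in classes are hypothetical.

```python
# Assumed list of attributes, based only on the discussion above.
REQUIRED_ATTRIBUTES = ("shape", "dtype", "ndim")


def missing_array_attributes(obj, required=REQUIRED_ATTRIBUTES):
    """Return the names in `required` that `obj` does not provide."""
    return [name for name in required if not hasattr(obj, name)]


class RectilinearStandIn:
    """Hypothetical stand-in for a NumPy-like array."""
    shape = (2, 3)
    dtype = "int64"
    ndim = 2


class RaggedStandIn:
    """Hypothetical stand-in for a ragged array: ndim only, no shape or dtype."""
    ndim = -1


rectilinear_missing = missing_array_attributes(RectilinearStandIn())
ragged_missing = missing_array_attributes(RaggedStandIn())
```

A check like this only finds missing attributes; it says nothing about whether values such as `ndim == -1` would be handled sensibly downstream.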

{
    "total_count": 5,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 5,
    "rocket": 0,
    "eyes": 0
}
  667864088