issue_comments: 1047915016
This data as json
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
https://github.com/pydata/xarray/issues/4118#issuecomment-1047915016 | https://api.github.com/repos/pydata/xarray/issues/4118 | 1047915016 | IC_kwDOAMm_X84-deoI | 4441338 | 2022-02-22T15:30:00Z | 2022-02-22T15:38:52Z | NONE | Often I run a function over a dataset, with each call outputing a hierarchical data structure, containing fixed dimensions in the best cases and variable length in the worst.
For this, it would make more sense to be able to have dimensions ( with optional labels and coordinates ) assigned to nodes (and these would be inherited by any descendants). Leaf nodes would hold data.
On merge, dimensions could be bubbled up as long as length (and labels) matched.
Operations with dimensions would then go down to corresponding dimension level before applying the operator, i.e. Datagroup and Datatree are subcases of this general structure, which could be enforced via flags/checks. Option 1 is where the extremities of the tree are a node with two sets of child nodes, dimension labels and n-dimensional arrays. Option 2 is where the extremities of the tree are a node with a child node for a n-dimensional array A, and a sibling node for each dimension of A, containing the corresponding labels. I'm sure I'm missing some big issue with the mental model I have, for instance I haven't thought of transformations at all and about coordinates. But for clarity I tried to write it down below. The most general structure for a dataset I can think of is a directed graph. Each node A is a n-dimensional (sparse) array, where each dimension D points optionally to a one-dimensional node B with the same length. To get a hierarchical structure, we:
We can resolve D's target by (A) checking for a sibling in T with the same name, and then going up one level and goto (A). Multindexes ( multi-dimensional (sparse) labels ) generalize this model, but require tuple labels in T's edges i.e. : h/j/a[x,y,z] has a sybling h/j/(x,y)[x,y] , with z's labels being one level above, i.e. h/z[z] ( the notation a[b] means map of index b to value a ). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
628719058 |