issue_comments: 1047915016

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/4118#issuecomment-1047915016	https://api.github.com/repos/pydata/xarray/issues/4118	1047915016	IC_kwDOAMm_X84-deoI	4441338	2022-02-22T15:30:00Z	2022-02-22T15:38:52Z	NONE	Often I run a function over a dataset, with each call outputting a hierarchical data structure, containing fixed dimensions in the best cases and variable length in the worst. For this, it would make more sense to be able to have dimensions (with optional labels and coordinates) assigned to nodes (and these would be inherited by any descendants). Leaf nodes would hold data. On merge, dimensions could be bubbled up as long as length (and labels) matched. Operations with dimensions would then go down to the corresponding dimension level before applying the operator, i.e. `container['A/B'].mean('time')` would be different from `container['A'].mean('time')['B']`. Datagroup and Datatree are subcases of this general structure, which could be enforced via flags/checks. Option 1 is where the extremities of the tree are a node with two sets of child nodes, dimension labels and n-dimensional arrays. Option 2 is where the extremities of the tree are a node with a child node for an n-dimensional array A, and a sibling node for each dimension of A, containing the corresponding labels. I'm sure I'm missing some big issue with the mental model I have; for instance, I haven't thought about transformations or coordinates at all. But for clarity I tried to write it down below. The most general structure for a dataset I can think of is a directed graph. Each node A is an n-dimensional (sparse) array, where each dimension D optionally points to a one-dimensional node B with the same length. To get a hierarchical structure, we: (1) add edges of a different color, each with a label; (2) restrict their graph to a tree T; (3) add labels to each dimension D. We can resolve D's target by (A) checking for a sibling in T with the same name and, if there is none, going up one level and repeating (A).
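The lookup just described could be sketched roughly as follows. This is only a minimal illustration under my own assumptions: `Node` and `resolve_dim` are hypothetical names, not part of xarray or datatree, and the nodes here hold no array data at all.

```python
class Node:
    """Hypothetical tree node: a name, a parent link, and named children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}

    def add(self, name):
        """Create a child with the given edge label and return it."""
        child = Node(name, parent=self)
        self.children[name] = child
        return child

    def path(self):
        """Slash-separated path from (but excluding) the root."""
        parts = []
        node = self
        while node.parent is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


def resolve_dim(array_node, dim):
    """Find the node holding the labels for dimension `dim` of `array_node`.

    Step (A): look for a sibling with the same name as the dimension.
    If there is none, go up one level and repeat (A).
    Returns None when the root is reached without a match.
    """
    level = array_node.parent
    while level is not None:
        candidate = level.children.get(dim)
        if candidate is not None and candidate is not array_node:
            return candidate
        level = level.parent
    return None


# Assumed layout: h/j/a is an array with dims x and z,
# x's labels live next to it (h/j/x), z's labels one level up (h/z).
root = Node("")
h = root.add("h")
z = h.add("z")
j = h.add("j")
a = j.add("a")
x = j.add("x")

print(resolve_dim(a, "x").path())  # h/j/x
print(resolve_dim(a, "z").path())  # h/z
```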
MultiIndexes (multi-dimensional (sparse) labels) generalize this model, but require tuple labels in T's edges, i.e. h/j/a[x,y,z] has a sibling h/j/(x,y)[x,y], with z's labels being one level above, i.e. h/z[z] (the notation a[b] means a map from index b to value a).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		628719058