html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1092#issuecomment-868324949,https://api.github.com/repos/pydata/xarray/issues/1092,868324949,MDEyOklzc3VlQ29tbWVudDg2ODMyNDk0OQ==,7611856,2021-06-25T08:36:03Z,2021-06-25T08:45:23Z,NONE,"Hey Folks,

I stumbled over this discussion with a use case similar to the one described in some comments above: a `Dataset` with a bunch of arrays called `count_a, test_count_a, train_count_a, count_b, ... , controlled_test_mean, controlled_train_mean, ... controlled_test_sigma, ...`

Obviously, a hierarchical structure would help to organize these. However, one point I didn't see in the discussion is the following: hierarchical structures often force a user to come up with some arbitrary order of hierarchy levels. The classical example is document filing: do you put your health insurance documents under `/insurance/health/2021`, `2021/health/insurance`, ...?

One solution is to tag documents instead of putting them into a hierarchy. This would give full flexibility to retrieve any flat `Dataset` from a `TaggedDataset` by specifying the set of tags that the individual `DataArray`s must carry.

Going back to the example above, one could think of something like:

```python
# get a flat view (Dataset-like object) on all arrays of tagged that have the 'count' tag
ds: Dataset(View) = tagged.tag_select(""count"")
bar1 = ds.mean(dim=""foo"")

# get a flat view (Dataset-like object) on all arrays of tagged that have the ""train"" and ""controlled"" tags
bar2 = tagged.tag_select(""train"", ""controlled"").mean(dim=""foo"")
# order of arguments to `tag_select` is irrelevant!
```

I hope it is clear what I mean. I know that there are, e.g., some awesome [file system plugins](https://amoffat.github.io/supertag/index.html) that use such a data model (the author has incredibly nice high-level documentation on the topic).
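To make the idea a bit more concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical: `TaggedContainer`, `add` and the tag sets are made-up names for illustration only, not xarray API, and plain lists stand in for `DataArray` objects.

```python
# Hypothetical sketch: TaggedContainer is NOT part of xarray.
# Plain Python lists stand in for DataArray objects.


class TaggedContainer:
    def __init__(self):
        # Maps array name -> (frozenset of tags, array).
        self._entries = {}

    def add(self, name, array, tags):
        # Register an array under an arbitrary, unordered set of tags.
        self._entries[name] = (frozenset(tags), array)

    def tag_select(self, *tags):
        # Return a flat name -> array mapping of every entry that carries
        # all requested tags. The subset test on sets is what makes the
        # order of the arguments irrelevant.
        wanted = frozenset(tags)
        return {
            name: array
            for name, (entry_tags, array) in self._entries.items()
            if wanted <= entry_tags
        }


tagged = TaggedContainer()
tagged.add('count_a', [1, 2, 3], tags={'count', 'a'})
tagged.add('train_count_a', [4, 5], tags={'count', 'train', 'a'})
tagged.add('controlled_train_mean', [0.5], tags={'controlled', 'train', 'mean'})
tagged.add('controlled_test_mean', [0.7], tags={'controlled', 'test', 'mean'})

# All arrays that carry the 'count' tag.
counts = tagged.tag_select('count')

# All arrays that carry both the 'train' and the 'controlled' tag.
train_controlled = tagged.tag_select('train', 'controlled')
```

A real implementation would presumably return a lazy `Dataset`-like view instead of a new dict, but the selection logic could stay the same.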
Just wanted to add that aspect to the discussion, even if it might collide with the hierarchical approach! One side note: if every array in the tagged container has exactly one tag, and tags do not repeat, then the whole thing should be semantically identical to a `Dataset`, because every `tag_select` would yield a single `DataArray`. That is, it might be possible to integrate such functionality directly into `Dataset`. Regards, Martin ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,187859705