id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
295959111,MDU6SXNzdWUyOTU5NTkxMTE=,1900,Representing & checking Dataset schemas  ,5635139,open,0,,,15,2018-02-09T18:06:08Z,2022-07-14T11:28:37Z,,MEMBER,,,,"What would be the best way to canonically describe a dataset, which could be read by both humans and machines?

For example, frequently in our code we have docstrings which look something like:

```
def get_returns(security_ids):
    """"""
    Returns a mega-dimensional dataset which gives recent returns for a set of
    securities by:
    - Date
    - Return (raw / economic / smoothed / etc)
    - Scaling (constant / risk_scaled)
    - Span
    - Hedged vs Unhedged

    Dataset keys are security ids. All dimensions have coords.
    """"""
```

This helps when trying to understand what the code does just by reading it.
But the format isn't consistent between docstrings, and it can't be read or checked by a machine.
Has anyone solved this problem, or does anyone have suggestions for resources out there?
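
To make the idea concrete, here is a rough sketch of the kind of thing I have in mind, not an existing xarray feature. `DatasetSchema`, `RETURNS_SCHEMA` and the dimension names are all hypothetical, made up from the docstring above. It only checks dimension names per variable and ignores coords, dtypes and sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSchema:
    # Hypothetical helper, not part of xarray: every variable in the
    # dataset is expected to carry exactly these dimension names.
    dims: tuple
    description: str = ''

    def check(self, var_dims):
        # var_dims maps variable name -> tuple of dimension names.
        # For an xarray Dataset ds, this mapping could be built with
        # {name: var.dims for name, var in ds.data_vars.items()}.
        errors = []
        for name, dims in var_dims.items():
            # Compare as sets so that dimension order does not matter.
            if set(dims) != set(self.dims):
                errors.append(
                    f'{name}: expected dims {self.dims}, got {tuple(dims)}'
                )
        return errors


# Assumed dimension names, derived from the docstring in get_returns.
RETURNS_SCHEMA = DatasetSchema(
    dims=('date', 'return_type', 'scaling', 'span', 'hedged'),
    description='Recent returns per security; keys are security ids.',
)
```

The same object could be rendered into the docstring for humans and run as a check in tests, so the description and the check can't drift apart.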

Tangentially related to https://github.com/python/typing/issues/513 (but our issues are less about types and dimension sizes, and more about the arrays within a dataset, their dimensions, and their names)","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1900/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue