issues: 295959111
This data as json
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
295959111 | MDU6SXNzdWUyOTU5NTkxMTE= | 1900 | Representing & checking Dataset schemas | 5635139 | open | 0 | 15 | 2018-02-09T18:06:08Z | 2022-07-14T11:28:37Z | MEMBER | What would be the best way to canonically describe a dataset, which could be read by both humans and machines? For example, frequently in our code we have docstrings which look something like: ``` def get_returns(security_ids): """ Retuns mega-dimensional dataset which gives recent returns for a set of securities by: - Date - Return (raw / economic / smoothed / etc) - Scaling (constant / risk_scaled) - Span - Hedged vs Unhedged
``` This helps when attempting to understand what code is doing while only reading it. But this isn't consistent between docstrings and can't be read or checked by a machine. Has anyone solved this problem / have any suggestions for resources out there? Tangentially related to https://github.com/python/typing/issues/513 (but our issues are less about the type, dimension sizes, and more about the arrays within a dataset, their dimensions, and their names) |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/1900/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
13221727 | issue |