issues: 1899895419
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1899895419 | I_kwDOAMm_X85xPhp7 | 8199 | Use Generic Types instead of Hashable or Any | 43316012 | open | 0 | 2 | 2023-09-17T19:41:39Z | 2023-09-18T14:16:02Z | COLLABORATOR | Is your feature request related to a problem?
Currently, part of the static type of a DataArray or Dataset is a Consider e.g.
```python
for name, da in Dataset({"a": ("t", np.arange(5))}).items():
    reveal_type(name) # hashable
    reveal_type(da.dims) # tuple[hashable, ...]
```
This could be solved by making these classes generic. Another related issue is the underlying data.
This could be introduced as a Generic type as well.
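To make the idea of "generic over both keys and data" concrete, here is a minimal, self-contained sketch. It is only an illustration under stated assumptions: `ArrayLike`, `ListArray` and `LabeledArray` are hypothetical toy names, not xarray classes, and the protocol simply assumes that wrapped data exposes a `shape`.

```python
from __future__ import annotations

from collections.abc import Hashable
from typing import Generic, Protocol, TypeVar


class ArrayLike(Protocol):
    """Hypothetical minimal protocol for wrapped data (an assumption, not an xarray type)."""

    @property
    def shape(self) -> tuple[int, ...]: ...


class ListArray:
    """Toy 1-D array backed by a list, standing in for np.ndarray or a dask array."""

    def __init__(self, values: list[float]) -> None:
        self.values = values

    @property
    def shape(self) -> tuple[int, ...]:
        return (len(self.values),)


KeyT = TypeVar("KeyT", bound=Hashable)
DataT = TypeVar("DataT", bound=ArrayLike)


class LabeledArray(Generic[KeyT, DataT]):
    """Toy stand-in for DataArray that keeps key and data types in its static type."""

    def __init__(self, data: DataT, dims: tuple[KeyT, ...]) -> None:
        if len(dims) != len(data.shape):
            raise ValueError("one dimension name is needed per axis")
        self._data = data
        self._dims = dims

    @property
    def data(self) -> DataT:
        return self._data

    @property
    def dims(self) -> tuple[KeyT, ...]:
        return self._dims


arr = LabeledArray(ListArray([0.0, 1.0, 2.0]), dims=("t",))
# A static type checker is expected to infer LabeledArray[str, ListArray] here,
# so arr.dims would be tuple[str, ...] and arr.data would be ListArray,
# instead of Hashable and Any.
```

In this toy version the type parameters are inferred from the constructor arguments, so users would not have to spell them out for the common case.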
Probably, this should reach some common ground on all wrapping array libs that are out there. Everyone should use a Generic Array class that keeps track of the type of the wrapped array, e.g.
Describe the solution you'd like
The implementation would be something along the lines of:
```python
from collections.abc import Hashable
from typing import Generic, TypeVar

KeyT = TypeVar("KeyT", bound=Hashable)
DataT = TypeVar("DataT", bound=<some protocol?>)

class DataArray(Generic[KeyT, DataT]):
    ...
```
Now you could create a "classical" DataArray:
```python
da = DataArray(np.arange(10), {"t": np.arange(10)}, dims=["t"])
# will be of type DataArray[str, np.ndarray]

# will be of type DataArray[tuple[str, str], dask.array.core.Array]
```
And whenever you access the dimensions / coord names / underlying data you will get the correct type. For now I only see three major problems:
1) non-array types (like lists or anything iterable) will get cast to a
Describe alternatives you've considered
One could even extend this and add more Generic types. Different types for dimensions and variable names would be a first (and probably quite a nice) feature addition. One could even go so far as to type the keys and values of variables and coords (for Datasets) differently. This came up e.g. in https://github.com/pydata/xarray/issues/3967. However, this would create a ridiculous number of Generic types and is probably more confusing than helpful.
Additional context
Probably this feature should be done in consecutive PRs that each implement one Generic, otherwise this will be a giant task! |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/8199/reactions", "total_count": 5, "+1": 5, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
13221727 | issue |