
issues


2 rows where comments = 2, state = "open" and user = 43316012, sorted by updated_at descending

#8199 · Use Generic Types instead of Hashable or Any

id 1899895419 · node_id I_kwDOAMm_X85xPhp7 · user headtr1ck (43316012) · state open · locked 0 · comments 2 · created_at 2023-09-17T19:41:39Z · updated_at 2023-09-18T14:16:02Z · author_association COLLABORATOR

Is your feature request related to a problem?

Currently, part of the static type of a DataArray or Dataset is a Mapping[Hashable, DataArray]. I'm quite sure that 99% of users actually use str keys (i.e. variable names), while some exotic people (me included) want to use e.g. Enums for their keys. Currently we allow anything hashable as a key, but once the DataArray/Dataset is created, the type information of the keys is lost.

Consider e.g.:

```python
for name, da in Dataset({"a": ("t", np.arange(5))}).items():
    reveal_type(name)     # Hashable
    reveal_type(da.dims)  # tuple[Hashable, ...]
```

Wouldn't it be nice if this actually returned `str`, so you don't have to cast or assert it every time?

This could be solved by making these classes generic.

Another related issue is the underlying data. This could be introduced as a Generic type as well. That would probably require some common ground across all the wrapping array libs out there: each one would need a Generic array class that keeps track of the type of the wrapped array, e.g. dask.array.core.Array[np.ndarray]. In return, we could do DataArray[np.ndarray] or even DataArray[dask.array.core.Array[np.ndarray]].
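As a rough sketch of the idea, a wrapping array class could look like this (hypothetical; `WrappedArray` and its interface are made up for illustration, not dask's or numpy's actual API):

```python
from typing import Generic, TypeVar

import numpy as np

InnerT = TypeVar("InnerT")  # type of the wrapped array


class WrappedArray(Generic[InnerT]):
    """Hypothetical wrapper that remembers the type of the array it wraps."""

    def __init__(self, data: InnerT) -> None:
        self._data = data

    @property
    def data(self) -> InnerT:
        return self._data


arr: WrappedArray[np.ndarray] = WrappedArray(np.arange(5))
# A type checker now knows arr.data is an np.ndarray, no cast needed:
print(arr.data.mean())
```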

Describe the solution you'd like

The implementation would be something along the lines of:

```python
KeyT = TypeVar("KeyT", bound=Hashable)
DataT = TypeVar("DataT", bound=<some protocol?>)

class DataArray(Generic[KeyT, DataT]):
    _coords: dict[KeyT, Variable[DataT]]
    _indexes: dict[KeyT, Index[DataT]]
    _name: KeyT | None
    _variable: Variable[DataT]

    def __init__(
        self,
        data: DataT = dtypes.NA,
        coords: Sequence[Sequence[DataT] | pd.Index | DataArray[KeyT]]
        | Mapping[KeyT, DataT]
        | None = None,
        dims: str | Sequence[KeyT] | None = None,
        name: KeyT | None = None,
        attrs: Mapping[KeyT, Any] | None = None,
        # internal parameters
        indexes: Mapping[KeyT, Index] | None = None,
        fastpath: bool = False,
    ) -> None:
        ...

```

Now you could create a "classical" DataArray:

```python
da = DataArray(np.arange(10), {"t": np.arange(10)}, dims=["t"])
```

which will be of type

DataArray[str, np.ndarray]

while you could also create something more fancy:

```python
da2 = DataArray(dask.array.array([1, 2, 3]), {}, dims=[("tup1", "tup2"),])
```

which will be of type

DataArray[tuple[str, str], dask.array.core.Array]

And whenever you access the dimensions / coord names / underlying data, you will get the correct type.

For now I see only three major problems:

1. Non-array types (like lists or anything iterable) get cast to np.ndarray, and I have no idea how to tell the type checker that DataArray([1, 2, 3], {}, "a") should be DataArray[str, np.ndarray] and not DataArray[str, list[int]]. Depending on the Protocol in the bound TypeVar this might even fail static type analysis, or require tons of special casing and overloads (see the overload sketch below).
2. How does the type checker extract the dimension type for Datasets? This is quite convoluted and I am not sure it can be typed correctly...
3. The parallel compute workflows are quite dynamic and I am not sure static type checking can keep track of the underlying datatype... What does DataArray([1, 2, 3], dims="a").chunk({"a": 2}) return? Is it DataArray[str, dask.array.core.Array]? And what about other chunking frameworks?
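To illustrate problem 1, here is a hedged sketch of the overload-based special casing mentioned above (hypothetical code; the two-argument signature is simplified from the real constructor):

```python
from __future__ import annotations

from collections.abc import Hashable, Sequence
from typing import Any, Generic, TypeVar, overload

import numpy as np

KeyT = TypeVar("KeyT", bound=Hashable)
DataT = TypeVar("DataT")


class DataArray(Generic[KeyT, DataT]):
    # Plain Python sequences get coerced to np.ndarray at runtime, so the
    # checker should infer DataArray[KeyT, np.ndarray] for them instead of
    # e.g. DataArray[KeyT, list[int]].
    @overload
    def __init__(
        self: DataArray[KeyT, np.ndarray], data: Sequence[Any], dims: Sequence[KeyT]
    ) -> None: ...
    # Anything that is already an array (not typed as a Sequence) is kept as-is.
    @overload
    def __init__(self, data: DataT, dims: Sequence[KeyT]) -> None: ...
    def __init__(self, data: Any, dims: Any) -> None:
        # Runtime behaviour mirroring the overloads: coerce plain sequences only.
        self.data = np.asarray(data) if isinstance(data, (list, tuple)) else data
        self.dims = tuple(dims)


da = DataArray([1, 2, 3], dims=["a"])  # inferred: DataArray[str, np.ndarray]
```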

Describe alternatives you've considered

One could even extend this and add more Generic types.

Different types for dimensions and variable names would be a first (and probably quite a nice) feature addition.
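A minimal sketch of what separate dimension and variable-name parameters could look like (hypothetical classes, assuming a third type variable is added):

```python
from collections.abc import Hashable
from typing import Generic, TypeVar

DataT = TypeVar("DataT")
DimT = TypeVar("DimT", bound=Hashable)    # type of dimension names
NameT = TypeVar("NameT", bound=Hashable)  # type of variable names


class Variable(Generic[DimT, DataT]):
    dims: tuple[DimT, ...]
    data: DataT


class Dataset(Generic[NameT, DimT, DataT]):
    _variables: dict[NameT, Variable[DimT, DataT]]
    _dims: dict[DimT, int]
```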

One could even go so far as to type the keys and values of variables and coords (for Datasets) differently; this came up e.g. in https://github.com/pydata/xarray/issues/3967. However, this would create a ridiculous amount of Generic types and is probably more confusing than helpful.

Additional context

Probably this feature should be done in consecutive PRs that each implement one Generic, otherwise this will be a giant task!

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8199/reactions",
    "total_count": 5,
    "+1": 5,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo xarray (13221727) · type issue
#7117 · Experimental mypy plugin

id 1395053809 · node_id PR_kwDOAMm_X85AEpA1 · user headtr1ck (43316012) · state open · locked 0 · comments 2 · created_at 2022-10-03T17:07:59Z · updated_at 2022-10-03T18:53:10Z · author_association COLLABORATOR · draft 1 · pull_request pydata/xarray/pulls/7117

I was playing around a bit with a mypy plugin and this was the best I could come up with. Unfortunately the mypy documentation about plugins is not very detailed...

This plugin makes mypy recognize user-defined accessors.
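For context, user-defined accessors are registered via xarray's register_dataset_accessor / register_dataarray_accessor decorators; the toy GeoAccessor below (made up for illustration) shows what plain mypy cannot see without a plugin:

```python
import xarray as xr


@xr.register_dataset_accessor("geo")
class GeoAccessor:
    """Toy accessor; `geo` and its contents are invented for illustration."""

    def __init__(self, ds: xr.Dataset) -> None:
        self._ds = ds

    @property
    def center(self) -> tuple[float, float]:
        return float(self._ds["lon"].mean()), float(self._ds["lat"].mean())


ds = xr.Dataset({"lon": ("x", [0.0, 2.0]), "lat": ("x", [1.0, 3.0])})
print(ds.geo.center)  # works at runtime; plain mypy flags `.geo` as unknown
```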

There is a quite severe bug in there (probably due to my lack of understanding of mypy internals) which makes it work only on the first run: when you change a line in your code and run mypy again, it crashes... (you can delete the cache to make it work one more time :)

Any chance that a mypy expert can figure this out? haha
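For reference, a mypy plugin's general shape looks roughly like this (a hedged sketch using mypy's documented hook API; the actual code in this PR differs, and the hard-coded accessor lookup is made up):

```python
from typing import Callable, Optional

from mypy.plugin import AttributeContext, Plugin
from mypy.types import Type


class AccessorPlugin(Plugin):
    def get_attribute_hook(
        self, fullname: str
    ) -> Optional[Callable[[AttributeContext], Type]]:
        # A real implementation would map registered accessor names to
        # their accessor classes instead of this hard-coded example.
        if fullname == "xarray.core.dataset.Dataset.geo":
            return _resolve_accessor_type
        return None


def _resolve_accessor_type(ctx: AttributeContext) -> Type:
    # Placeholder: return whatever mypy inferred by default.
    return ctx.default_attr_type


def plugin(version: str) -> type[Plugin]:
    # Entry point that mypy resolves via `plugins = ...` in its config.
    return AccessorPlugin
```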

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7117/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 1,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo xarray (13221727) · type pull


CREATE TABLE [issues] (
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [number] INTEGER,
   [title] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [state] TEXT,
   [locked] INTEGER,
   [assignee] INTEGER REFERENCES [users]([id]),
   [milestone] INTEGER REFERENCES [milestones]([id]),
   [comments] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [closed_at] TEXT,
   [author_association] TEXT,
   [active_lock_reason] TEXT,
   [draft] INTEGER,
   [pull_request] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [state_reason] TEXT,
   [repo] INTEGER REFERENCES [repos]([id]),
   [type] TEXT
);
CREATE INDEX [idx_issues_repo]
    ON [issues] ([repo]);
CREATE INDEX [idx_issues_milestone]
    ON [issues] ([milestone]);
CREATE INDEX [idx_issues_assignee]
    ON [issues] ([assignee]);
CREATE INDEX [idx_issues_user]
    ON [issues] ([user]);