issue_comments: 490821558

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/2292#issuecomment-490821558	https://api.github.com/repos/pydata/xarray/issues/2292	490821558	MDEyOklzc3VlQ29tbWVudDQ5MDgyMTU1OA==	47244312	2019-05-09T09:04:21Z	2019-05-09T09:05:48Z	CONTRIBUTOR	There are problems with typing. I already mentioned them in #2929 but I'll summarize here. The vast majority of xarray functions/methods allow for "string or sequence of strings, optional". When you move to "hashable or sequence of hashables, optional", however, you want to specifically avoid tuples, which are both Sequence and Hashable instances. Most functions currently look like this: `if isinstance(x, str): x = [x] elif x is None: x = [DEFAULT] for xi in x: ...` After the change they would become: `if x is None: x = [DEFAULT] elif isinstance(x, Hashable) and not isinstance(x, tuple): x = [x] for xi in x: ...` Or: `if x is None: x = [DEFAULT] elif isinstance(x, str) or not isinstance(x, Sequence): x = [x] for xi in x: ...` Note how I moved the test for None above. This matters, because `isinstance(None, Hashable)` returns True. This is very error-prone and expensive to maintain, and it will very easily lead beginner contributors to introduce bugs. Every test that currently runs three use cases, one for None, one for str and another for a sequence of str, will now be forced to be expanded to SIX test cases: (1) str; (2) tuple (hashable sequence) of str; (3) list (non-hashable sequence) of str; (4) enum (non-str, non-sequence hashable); (5) sequence of non-sortable hashables; (6) None. One way to mitigate it would be to have a helper function, which would be invoked everywhere around the codebase, and then religiously make sure that the helper function is always used. 
`_no_default = [object()] def ensure_sequence(name: str, x: Union[Hashable, Sequence[Hashable]], default: Sequence[Hashable] = _no_default) -> Sequence[Hashable]: if x is None: if default is _no_default: raise ValueError(name + ' must be explicitly defined') return default if isinstance(x, Sequence) and not isinstance(x, str): return x if isinstance(x, Hashable): return [x] raise TypeError(name + ' must be a Hashable or Sequence of Hashable')` You would still be forced to implement the test for non-sortable hashables, though. A completely separate problem with typing is that I expect a huge number of xarray users to just assume variable names and dims are always strings. They'll have things like `for k, v in ds.data_vars.items(): if k.startswith('foo'): ...` or `[dim for dim in da.dims if "foo" in dim]` The above will fill the mypy output with errors as soon as xarray becomes integrated in mypy (#2929), and the user will have to go through a lot of explicitly forcing dims and variable names to str, even if in their project all dims and variable names are always str. The final problem is that integers are Hashables, and there's a wealth of cases in xarray where there is special logic that dynamically treats ints as positional indices.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		341643235
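
The helper proposed in the comment above is flattened onto one line by this export, so here is a minimal runnable sketch of the same idea. It is not xarray code: the sentinel is renamed `_NO_DEFAULT` and made a plain `object()`, `Optional` is added to the annotation, and the `Dim` enum is a hypothetical example type used only to exercise the non-str, non-sequence hashable case.

```python
from collections.abc import Hashable, Sequence
from enum import Enum
from typing import Any, Optional, Union

# Sentinel meaning "caller supplied no default"; only ever compared by identity.
_NO_DEFAULT: Any = object()


def ensure_sequence(
    name: str,
    x: Optional[Union[Hashable, Sequence[Hashable]]],
    default: Sequence[Hashable] = _NO_DEFAULT,
) -> Sequence[Hashable]:
    """Normalise "hashable or sequence of hashables, optional" to a sequence."""
    # None must be tested first, because isinstance(None, Hashable) is True.
    if x is None:
        if default is _NO_DEFAULT:
            raise ValueError(name + " must be explicitly defined")
        return default
    # Any non-str sequence (tuples included) is taken as a collection of names.
    if isinstance(x, Sequence) and not isinstance(x, str):
        return x
    # Everything else that is hashable (str, enum member, int, ...) is one name.
    if isinstance(x, Hashable):
        return [x]
    raise TypeError(name + " must be a Hashable or Sequence of Hashable")


class Dim(Enum):
    # Hypothetical enum standing in for a non-str hashable dimension name.
    X = "x"


single = ensure_sequence("dim", "x")               # a str is wrapped
from_tuple = ensure_sequence("dim", ("x", "y"))    # a tuple is passed through
from_list = ensure_sequence("dim", ["x", "y"])     # a list is passed through
from_enum = ensure_sequence("dim", Dim.X)          # an enum member is wrapped
from_none = ensure_sequence("dim", None, default=["x"])
```

Note the design consequence: because a tuple is treated as a sequence of names, a tuple can never be used as a single name through this helper, which is exactly the ambiguity the comment warns about.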