id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 1094725752,I_kwDOAMm_X85BQDB4,6142,dimensions: type as `str | Iterable[Hashable]`?,10194086,open,0,,,14,2022-01-05T20:39:00Z,2022-06-26T11:57:40Z,,MEMBER,,,,"### What happened? We generally type dimensions as: ```python dims: Hashable | Iterable[Hashable] ``` However, this is in conflict with passing a tuple of independent dimensions to a method - e.g. `da.mean((""x"", ""y""))` because a tuple is also hashable. Also mypy requires an `isinstance(dims, Hashable)` check when typing a function. We use an `isinstance(dims, str)` check in many places to wrap a single dimension in a list. Changing this to `isinstance(dims, Hashable)` will change the behavior for tuples. ### What did you expect to happen? In the community call today we discussed to change this to ```python dims: str | Iterable[Hashable] ``` i.e. if a single dim is passed it has to be a string and wrapping it in a list is a convenience function. Special use cases with `Hashable` types should be wrapped in a `Iterable` by the user. This probably best reflects the current state of the repo (`dims = [dims] if isinstance(dims, str) else dims`). The disadvantage could be that it is a bit more difficult to explain in the docstrings? @shoyer - did I get this right from the discussion? --- Other options 1. Require `str` as dimension names. This could be too restrictive. @keewis mentioned that tuple dimension names are already used somwehere in the xarray repo. Also we discussed in another issue or PR (which I cannot find right know) that we want to keep allowing `Hashable`. 2. Disallow passing tuples (only allow tuples if a dimension is a tuple), require lists to pass several dimensions. This is too restrictive in the other direction and will probably lead to a lot of downstream troubles. Naming a single dimension with a tuple will be a very rare case, in contrast to passing several dimension names as a tuple. 3. Special case tuples. We could potentially check if `dims` is a tuple and if there are any dimension names consisting of a tuple. Seems more complicated and potentially brittle for probably small gains (IMO). ### Minimal Complete Verifiable Example _No response_ ### Relevant log output _No response_ ### Anything else we need to know? * We need to check carefully where general `Hashable` are really allowed. E.g. `dims` of a `DataArray` are typed as https://github.com/pydata/xarray/blob/e056cacdca55cc9d9118c830ca622ea965ebcdef/xarray/core/dataarray.py#L380 but tuples are not actually allowed: ```python import xarray as xr xr.DataArray([1], dims=(""x"", ""y"")) # ValueError: different number of dimensions on data and dims: 1 vs 2 xr.DataArray([1], dims=[(""x"", ""y"")]) # TypeError: dimension ('x', 'y') is not a string ``` * We need to be careful typing functions where only one dim is allowed, e.g. `xr.concat`, which should probably set `dim: Hashable` (and make sure it works). * Do you have examples for other real-world hashable types except for `str` and `tuple`? (Would be good for testing purposes). ### Environment N/A","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6142/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue