Comments on https://github.com/pydata/xarray/issues/7721

---

https://github.com/pydata/xarray/issues/7721#issuecomment-1516494141 · user 98330 (NONE) · 2023-04-20T15:04:17Z

> So really, my question is: how do we support python scalars for libraries that only implement `__array_namespace__`, given that dropping that support would be a major breaking change?

I was considering this question for SciPy (xref [scipy#18286](https://github.com/scipy/scipy/issues/18286)) this week, and I think I'm happy with this strategy:

1. Cast all "array-like" inputs, like Python scalars, lists/sequences, and generators, to `numpy.ndarray`.
2. Require "same array type" input; forbid mixing numpy-cupy, numpy-pytorch, cupy-pytorch, etc. This will raise an exception.
3. As a result, cupy-pyscalar and pytorch-pyscalar will *also* raise an exception.

What that results in is an API that's backwards-compatible for numpy and array-like usage, and much stricter when using other array libraries. That strictness to me is a good thing, because:

- that's what CuPy, PyTorch & co themselves do, and it works well there;
- it avoids the complexity of arbitrary mixing, which leads to questions like the one raised in this issue;
- if you do need to use a scalar from within a function inside your own library, just convert it explicitly to the desired array type with `xp.asarray(a_scalar)`, which gives you a 0-D array of the correct type (add `dtype=x.dtype` to make sure the dtypes match, if that matters).

---

https://github.com/pydata/xarray/issues/7721#issuecomment-1516251985 · user 14808389 (MEMBER) · 2023-04-20T12:35:30Z (edited 12:36:55Z)

There are two things that happen in `as_shared_dtype` (which may not be good design; we should probably consider splitting it into `as_shared_dtype` and `as_compatible_arrays` or something): first we cast everything to an array, then we decide on a common dtype and cast everything to that. The second step could easily be done using `numpy` scalars, which as far as I can tell are supported by most array libraries, including `cupy`. However, the reason we *need* to cast to arrays is that the array API (i.e. `__array_namespace__`) does not allow scalars of any type in functions like `np.array_api.where` (this is important for libraries that don't implement `__array_ufunc__` / `__array_function__`). To clarify, what we're trying to support is something like

```python
import numpy.array_api as np
np.where(cond, cupy_array, python_scalar)
```

which (intentionally?) does not work.

At the moment, `as_shared_dtype` (or, really, the hypothetical `as_compatible_arrays`) correctly casts `python_scalar` to a 0-d `cupy.array` for the example above, but if we were to replace `cupy_array` with `chunked_cupy_array` or `chunked_cupy_array_with_units`, the special-casing for `cupy` stops working and scalars are cast to 0-d `numpy.array` instead. Conceptually, I tend to think of 0-d arrays as equivalent to scalars, hence the suggestion to have `cupy` treat `numpy` scalars and 0-d `numpy.array` the same way (I don't follow the array API closely enough to know whether that was already discussed and rejected).
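For illustration, a minimal sketch of what that hypothetical `as_compatible_arrays` could look like (the helper names and the duck-array check are assumptions made for this example, not xarray's actual implementation):

```python
import numpy as np


def _is_duck_array(value):
    # crude check, for illustration only; real code would also have to
    # recognize libraries that only implement __array_ufunc__ / __array_function__
    return isinstance(value, np.ndarray) or hasattr(value, "__array_namespace__")


def as_compatible_arrays(*values):
    # use the first proper array argument to decide the target array type
    # (assumes at least one argument already is an array)
    reference = next(value for value in values if _is_duck_array(value))
    if hasattr(reference, "__array_namespace__"):
        xp = reference.__array_namespace__()
    else:
        xp = np  # plain numpy arrays don't expose __array_namespace__
    # wrap scalars as 0-d arrays of the reference's type; pass
    # dtype=reference.dtype to xp.asarray if the dtypes have to match
    return tuple(
        value if _is_duck_array(value) else xp.asarray(value) for value in values
    )
```

With `cupy_array` as the reference this wraps `python_scalar` as a 0-d `cupy` array; the hard part, as described above, is that the reference may instead be a nested `pint(dask(cupy))` array whose outer layer says nothing about `cupy`.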
So really, my question is: how do we support python scalars for libraries that only implement `__array_namespace__`, given that dropping that support would be a major breaking change? Of course, I would prefer removing the special-casing for specific libraries, but I wouldn't be opposed to keeping the existing one. I guess as a short-term fix we could just pull `_meta` out of duck dask arrays and determine the common array type from that (the downside is that we'd add another special case for `dask`, which we're actually trying to remove in another PR). As a long-term fix, I guess we'd need to revive the stalled nested duck array discussion.

---

https://github.com/pydata/xarray/issues/7721#issuecomment-1515600072 · user 5534781 (NONE) · 2023-04-20T01:50:58Z

Thanks, Justus, for expanding on this. It sounds to me like the question is "how do we cast dtypes when multiple array libraries are participating in the same computation?", and I am not sure I am knowledgeable enough to comment on that. From the array API point of view, long ago [we decided](https://github.com/data-apis/array-api/issues/399) that this is UB (undefined behavior), meaning it's completely up to each library to decide what to do. You can raise, or come up with a special rule that you can make sense of. It sounds like Xarray has some machinery to deal with this situation, but you'd rather not keep special-casing certain array libraries? Am I understanding it right?

---

https://github.com/pydata/xarray/issues/7721#issuecomment-1510169910 · user 14808389 (MEMBER) · 2023-04-16T08:09:09Z · 1 👍

The issue is that in https://github.com/pydata/xarray/blob/d4db16699f30ad1dc3e6861601247abf4ac96567/xarray/core/duck_array_ops.py#L193-L206 we try to convert everything to the same dtype, casting numpy and python scalars to arrays. The latter is important because e.g. `numpy.array_api.where` only accepts arrays as input. However, detecting `cupy` beneath (multiple) layers of duck arrays is not easy, which means that, for example, passing a `pint(dask(cupy))` array together with scalars will currently cast the scalars to 0-d `numpy` arrays, while passing a bare `cupy` array instead will result in 0-d `cupy` arrays.
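To make the nesting problem concrete, here is a sketch of the kind of unwrapping this would take (`_meta` on dask arrays and `magnitude` on pint quantities are real attributes, but treating them as a general unwrapping protocol is an assumption of this example):

```python
def innermost_array_type(obj):
    """Return the array type at the bottom of a nested duck array.

    Illustrative only: every wrapper library needs its own special case,
    which is exactly the problem described above.
    """
    if hasattr(obj, "magnitude"):  # pint quantities wrap their data here
        return innermost_array_type(obj.magnitude)
    if hasattr(obj, "_meta"):  # dask arrays carry an empty meta array of the chunk type
        return innermost_array_type(obj._meta)
    return type(obj)
```

For a `pint(dask(cupy))` array this would return `cupy.ndarray`, which could then be used to decide what the scalars should be cast to.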
My naive suggestion was to treat `np.int64(0)` and `np.array(0, dtype="int64")` the same, whereas at the moment the latter fails for the same reason as `np.array([0], dtype="int64")` (a sketch of that behavior is at the end of this thread).

---

https://github.com/pydata/xarray/issues/7721#issuecomment-1510016072 · user 5534781 (NONE) · 2023-04-16T01:25:53Z

Sorry that I missed the ping, Jacob, but I'd need more context to make any suggestions or answers 😅 Is the question about why CuPy wouldn't return scalars?

---

https://github.com/pydata/xarray/issues/7721#issuecomment-1498773949 · user 1610850 (CONTRIBUTOR) · 2023-04-06T09:39:43Z

Ping @leofang in case you have thoughts?
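As a footnote to the `np.int64(0)` vs. `np.array(0, dtype="int64")` suggestion above, the `cupy` behavior being described looks roughly like this (a sketch, assuming a CUDA-capable environment with `cupy` installed):

```python
import cupy
import numpy as np

x = cupy.arange(3)

x + np.int64(1)                   # works: numpy scalars are accepted as operands
x + np.array(1, dtype="int64")    # TypeError: implicit numpy -> cupy conversion is refused
x + np.array([1], dtype="int64")  # TypeError, for the same reason
```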