Comment https://github.com/pydata/xarray/issues/7065#issuecomment-1255073449 (MEMBER, 2022-09-22T14:04:22Z):

Actually there's another conversion when you reuse an xarray dimension coordinate in array-like computations:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(coords={"x": np.array([1.2, 1.3, 1.4], dtype=np.float16)})

# coordinate data is a wrapper around a pandas.Index object
# (it keeps track of the original array dtype)
ds.variables["x"]._data
# PandasIndexingAdapter(array=Float64Index([1.2001953125, 1.2998046875, 1.400390625], dtype='float64', name='x'), dtype=dtype('float16'))

# this coerces the pandas.Index back to a numpy array
np.asarray(ds.x)
# array([1.2, 1.3, 1.4], dtype=float16)

# which is equivalent to
ds.variables["x"]._data.__array__()
# array([1.2, 1.3, 1.4], dtype=float16)
```

The round-trip conversion preserves the original dtype, so different execution times may be expected. I can't say much about why the results differ (how large is the difference?), but I wouldn't be surprised if it's caused by rounding errors accumulating through the computation of a complex formula like the haversine distance.

Comment https://github.com/pydata/xarray/issues/7065#issuecomment-1255014363 (MEMBER, 2022-09-22T13:19:23Z):

> As my latitude and longitude arrays in both datasets have a resolution of 0.1 degrees, wouldn't it make sense to use np.float16 for both arrays?

I don't think so (at least not currently). The numpy arrays are by default converted to `pandas.Index` objects for each dimension coordinate, and for floats there's only `pandas.Float64Index`. It looks like it will be deprecated in favor of `pandas.NumericIndex`, which supports more dtypes, but still [I don't see support for 16-bit floats](https://github.com/pandas-dev/pandas/blob/main/pandas/core/indexes/numeric.py#L95-L108).

Regarding your nearest lat/lon point data selection problem, this is something that could probably be better solved using more specific (custom) indexes like the ones available in [xoak](https://xoak.readthedocs.io/en/latest/). Xoak only supports point-wise selection at the moment, though.

Comment https://github.com/pydata/xarray/issues/7065#issuecomment-1254983291 (MEMBER, 2022-09-22T12:54:43Z):

> The problem is that I tried to merge with join='override' but it was still taking a long time. Probably I wasn't using the right order.

Not 100% sure, but maybe `xr.merge` loads all the data from your datasets and performs some equality checks. Perhaps you could see how much time it takes after loading all the data, or try different `xr.merge(compat=...)` values?

> Before closing, just a curiosity: in this corner case shouldn't xarray automatically cast the lat/lon coordinate arrays to the same dtype, or is that a dangerous assumption?

We already do this for label indexers that are passed to `.sel()`.
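For example (a minimal sketch, not from the original thread; the variable names and values are made up), selecting with a plain Python float (float64) on a float32 coordinate works because the label is cast to the coordinate's dtype before the index lookup:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"t2m": ("lat", np.array([280.0, 281.0, 282.0]))},
    coords={"lat": np.array([10.1, 10.2, 10.3], dtype=np.float32)},
)

# 10.2 is a float64 label; xarray casts it to the float32 dtype of the "lat"
# coordinate before the lookup, so the selection matches exactly instead of
# failing because of float32/float64 rounding differences.
ds.sel(lat=10.2)
```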
However, for alignment I think it would require re-building an index for every cast coordinate, which may be expensive and is probably not ideal if done automatically.

Comment https://github.com/pydata/xarray/issues/7065#issuecomment-1254862548 (MEMBER, 2022-09-22T10:58:10Z):

Hi @guidocioni.

I see that the longitude and latitude coordinates both have a different `dtype` in the two input datasets, which likely explains why you have many NaNs and larger sizes (almost 2x) for the `lat` and `lon` dimensions in the resulting dataset.

Here's a small reproducible example:

```python
import numpy as np
import xarray as xr

lat = np.random.uniform(0, 40, size=100)
lon = np.random.uniform(0, 180, size=100)

ds1 = xr.Dataset(
    coords={"lon": lon.astype(np.float32), "lat": lat.astype(np.float32)}
)
ds2 = xr.Dataset(
    coords={"lon": lon, "lat": lat}
)

ds1.indexes["lat"].equals(ds2.indexes["lat"])
# False

xr.merge([ds1, ds2], join="exact")
# ValueError: cannot align objects with join='exact' where index/labels/sizes
# are not equal along these coordinates (dimensions): 'lon' ('lon',)
```

If the coordinate labels differ only by their encoding, you could use `xr.merge([ds1, ds2], join="override")`, which will take the coordinates from the first object.
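A quick sketch of that workaround, continuing the `ds1`/`ds2` objects from the snippet above (the expected output is an assumption based on the description of `join="override"` taking coordinates from the first object):

```python
# join="override" skips label-based alignment and reuses the (lon, lat)
# coordinates from the first object, so the merge succeeds and the dimensions
# are not padded with NaNs.
merged = xr.merge([ds1, ds2], join="override")

merged.sizes["lon"], merged.sizes["lat"]
# (100, 100)  -- no doubled dimensions

merged["lon"].dtype
# dtype('float32')  -- coordinate labels taken from ds1, the first object
```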