# Interpolation always returns floats

**Issue #4770** · pydata/xarray · state: open · opened 2021-01-06 by a MEMBER · 1 comment

**What happened**:

When interpolating datasets, integer arrays are forced to floats.

**What you expected to happen**:

The same dtype is retained after interpolation.

**Minimal Complete Verifiable Example**:

```python
import numpy as np
import dask.array as da
import xarray as xr  # this import was missing from the original example

a = np.arange(0, 2)
b = np.core.defchararray.add("long_variable_name", a.astype(str))
coords = dict(time=da.array([0, 1]))
data_vars = dict()
for v in b:
    data_vars[v] = xr.DataArray(
        name=v,
        data=da.array([0, 1], dtype=int),
        dims=["time"],
        coords=coords,
    )
ds1 = xr.Dataset(data_vars)

# Interpolate:
ds1 = ds1.interp(
    time=da.array([0, 0.5, 1, 2]),
    assume_sorted=True,
    method="linear",
    kwargs=dict(fill_value="extrapolate"),
)
print(ds1)
# Dimensions:              (time: 4)
# Coordinates:
#   * time                 (time) float64 0.0 0.5 1.0 2.0
# Data variables:
#     long_variable_name0  (time) int32 dask.array
#     long_variable_name1  (time) int32 dask.array

# The dask array thinks it's an integer array:
print(ds1.long_variable_name0)
# dask.array
# Coordinates:
#   * time     (time) float64 0.0 0.5 1.0 2.0

# But once computed, it turns out to be a float:
print(ds1.long_variable_name0.compute())
# array([0. , 0.5, 1. , 2. ])
# Coordinates:
#   * time     (time) float64 0.0 0.5 1.0 2.0
```

**Anything else we need to know?**:

An easy first step is to also force `np.float_` in `da.blockwise` in `missing.interp_func`, so that the reported dtype at least matches the computed one. The more difficult fix is to somehow cast the DataArrays back to the old dtype without affecting performance. In a quick test, simply adding `.astype()` to the value returned by `missing.interp` doubled the calculation time.
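The upcast can be reproduced with plain NumPy, independent of dask and xarray, since `np.interp` always computes and returns floating point. A minimal sketch of the "cast back afterwards" idea, with a hypothetical helper name (`interp_keep_dtype` is not an xarray API), illustrates why such interpolations are "unsafe": fractional results are truncated by the cast.

```python
import numpy as np

# np.interp always returns float64, even for integer input:
y = np.array([0, 10], dtype=np.int32)
out = np.interp([0.0, 0.5, 1.0], [0, 1], y)
print(out.dtype)  # float64

def interp_keep_dtype(xnew, xp, yp):
    # Hypothetical helper: interpolate, then cast back to the original
    # integer dtype. The cast truncates fractional results, which is why
    # dtype-preserving interpolation is "unsafe" in general.
    res = np.interp(xnew, xp, yp)
    if np.issubdtype(yp.dtype, np.integer):
        res = res.astype(yp.dtype)
    return res

print(interp_keep_dtype([0.0, 0.5, 1.0], [0, 1], y))  # [ 0  5 10]
```

In eager NumPy the extra `astype` is a cheap copy; the performance cost reported above comes from adding the cast inside xarray's dask-backed `missing.interp` path.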
I was thinking the conversion to floats in scipy could be avoided altogether by adding a (non-)public option to ignore any dtype checks and just let the user handle the "unsafe" interpolations. Related: https://github.com/scipy/scipy/issues/11093

**Environment**:
Output of `xr.show_versions()`:

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.8.5 (default, Sep 3 2020, 21:29:08) [MSC v.1916 64 bit (AMD64)]
python-bits: 64
OS: Windows
libhdf5: 1.10.4
libnetcdf: None

xarray: 0.16.2
pandas: 1.1.5
numpy: 1.17.5
scipy: 1.4.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2020.12.0
distributed: 2020.12.0
matplotlib: 3.3.2
cartopy: None
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: 51.0.0.post20201207
pip: 20.3.3
conda: 4.9.2
pytest: 6.2.1
IPython: 7.19.0
sphinx: 3.4.0
```
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4770/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue