issues: 343659822


id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at	closed_at	author_association	active_lock_reason	draft	pull_request	body	reactions	performed_via_github_app	state_reason	repo	type
343659822	MDU6SXNzdWUzNDM2NTk4MjI=	2304	float32 instead of float64 when decoding int16 with scale_factor netcdf var using xarray	971382	closed	0			32	2018-07-23T14:35:12Z	2024-03-15T16:31:06Z	2024-03-15T16:31:06Z	NONE				Code Sample : Considering a netcdf file with the following variable: `short agc_40hz(time, meas_ind) ; agc_40hz:_FillValue = 32767s ; agc_40hz:units = "dB" ; agc_40hz:scale_factor = 0.01 ;` Code: ```python from netCDF4 import Dataset import xarray as xr d = Dataset("test.nc") a = d.variables['agc_40hz'][:].flatten()[69] ## 21.940000000000001 'numpy.float64' x = xr.open_dataset("test.nc") b = x['agc_40hz'].values.flatten()[69] ## 21.939998626708984 'numpy.float32' abs(a - b) # 0.000001373291017 ``` Problem description : xarray behaves differently from netCDF4 Dataset. When reading the dataset with xarray, we found that the decoded type was numpy.float32 instead of numpy.float64. This netcdf variable has an int16 dtype; when the variable is read with the netCDF4 library directly, it is automatically converted to numpy.float64. In our case we lose precision when using xarray. We found two solutions for this: First solution : This solution aims to prevent auto_maskandscale ```python d = Dataset("test.nc") a = d.variables['agc_40hz'][:].flatten()[69] ## 21.940000000000001 'numpy.float64' x = xr.open_dataset("test.nc", mask_and_scale=False, decode_times=False) b = x['agc_40hz'].values.flatten()[69] ## 21.940000000000001 'numpy.float64' abs(a - b) # 0.000000000000000 ``` Modification in xarray/backends/netCDF4_.py line 241 ```python def _disable_auto_decode_variable(var): """Disable automatic decoding on a netCDF4.Variable. We handle these types of decoding ourselves. """ pass # var.set_auto_maskandscale(False) # # only added in netCDF4-python v1.2.8 # with suppress(AttributeError): # var.set_auto_chartostring(False) ``` Second solution : This solution uses numpy.float64 whatever integer type is provided. 
```python d = Dataset("test.nc") a = d.variables['agc_40hz'][:].flatten()[69] ## 21.940000000000001 'numpy.float64' x = xr.open_dataset("test.nc") b = x['agc_40hz'].values.flatten()[69] ## 21.940000000000001 'numpy.float64' abs(a - b) # 0.000000000000000 ``` Modification in xarray/core/dtypes.py line 85 ```python def maybe_promote(dtype): """Simpler equivalent of pandas.core.common._maybe_promote Parameters ---------- dtype : np.dtype Returns ------- dtype : Promoted dtype that can hold missing values. fill_value : Valid missing value for the promoted dtype. """ # N.B. these casting rules should match pandas if np.issubdtype(dtype, np.floating): fill_value = np.nan elif np.issubdtype(dtype, np.integer): ######################### #OLD CODE BEGIN ######################### # if dtype.itemsize <= 2: # dtype = np.float32 # else: # dtype = np.float64 ######################### #OLD CODE END ######################### ######################### #NEW CODE BEGIN ######################### dtype = np.float64 # whether it's int16 or int32 we use float64 ######################### #NEW CODE END ######################### fill_value = np.nan elif np.issubdtype(dtype, np.complexfloating): fill_value = np.nan + np.nan * 1j elif np.issubdtype(dtype, np.datetime64): fill_value = np.datetime64('NaT') elif np.issubdtype(dtype, np.timedelta64): fill_value = np.timedelta64('NaT') else: dtype = object fill_value = np.nan return np.dtype(dtype), fill_value ``` Solution number 2 would be great for us. At this point we don't know whether this modification would introduce side effects. Is there another way to avoid this problem? Expected Output : In our case we expect the variable to be decoded as numpy.float64, as netCDF4 does. 
Output of `xr.show_versions()` INSTALLED VERSIONS ------------------ commit: None python: 3.5.2.final.0 python-bits: 64 OS: Linux OS-release: 4.15.0-23-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 xarray: 0.10.8 pandas: 0.23.3 numpy: 1.15.0rc2 scipy: 1.1.0 netCDF4: 1.4.0 h5netcdf: None h5py: None Nio: None zarr: None bottleneck: None cyordereddict: None dask: 0.18.1 distributed: 1.22.0 matplotlib: 2.2.2 cartopy: None seaborn: None setuptools: 40.0.0 pip: 10.0.1 conda: None pytest: 3.6.3 IPython: 6.4.0 sphinx: None	{ "url": "https://api.github.com/repos/pydata/xarray/issues/2304/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		completed	13221727	issue
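
The precision gap reported in the issue body can be reproduced without a netCDF file. The following is a minimal sketch, assuming a single packed value of 2194 (a hypothetical raw int16 chosen so that it decodes to roughly 21.94 with the issue's `scale_factor = 0.01`); the `decode` helper is illustrative only and is not xarray's or netCDF4's actual decoding routine.

```python
import numpy as np

# Hypothetical packed value: 2194 stored as int16, with scale_factor = 0.01
# as in the variable definition quoted in the issue.
raw = np.array([2194], dtype=np.int16)
scale_factor = 0.01


def decode(raw_values, scale, dtype):
    """Illustrative helper: cast both operands to `dtype`, then scale."""
    return raw_values.astype(dtype) * dtype(scale)


as_f32 = decode(raw, scale_factor, np.float32)
as_f64 = decode(raw, scale_factor, np.float64)

print(as_f32.dtype, repr(float(as_f32[0])))
print(as_f64.dtype, repr(float(as_f64[0])))
print("difference:", abs(float(as_f32[0]) - float(as_f64[0])))
```

Every int16 value is exactly representable in float32, so the cast itself loses nothing; the discrepancy comes from rounding the scale factor and the product to float32, whose spacing near 21.94 is about 2e-6.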
