id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
757998857,MDU6SXNzdWU3NTc5OTg4NTc=,4655,ComplexWarning: Casting complex values to real discards the imaginary part,12728415,closed,0,,,2,2020-12-06T19:09:26Z,2023-03-29T16:02:01Z,2023-03-29T16:01:54Z,NONE,,,,"xarray version 0.16.2 / Python 3.8.5

When reading a dataset containing complex variables with `xarray.open_zarr`, the following warning appears:

_/home/.../python3.8/site-packages/xarray/coding/variables.py:218: ComplexWarning: Casting complex values to real discards the imaginary part_

The imaginary part is indeed discarded, which is not what I expected.

After digging a little deeper, I came across this function (xarray/coding/variables.py:226):

```python
def _choose_float_dtype(dtype, has_offset):
    """"""Return a float dtype that can losslessly represent `dtype` values.""""""
    # Keep float32 as-is. Upcast half-precision to single-precision,
    # because float16 is ""intended for storage but not computation""
    if dtype.itemsize <= 4 and np.issubdtype(dtype, np.floating):
        return np.float32
    # float32 can exactly represent all integers up to 24 bits
    if dtype.itemsize <= 2 and np.issubdtype(dtype, np.integer):
        # A scale factor is entirely safe (vanishing into the mantissa),
        # but a large integer offset could lead to loss of precision.
        # Sensitivity analysis can be tricky, so we just use a float64
        # if there's any offset at all - better unoptimised than wrong!
        if not has_offset:
            return np.float32
    # For all other types and circumstances, we just use float64.
    # (safe because eg. complex numbers are not supported in NetCDF)
    return np.float64
```

This behavior seems strange to me: I would find it more natural to keep the stored dtype rather than systematically converting to a float.
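
A minimal sketch of the effect described above, using plain NumPy rather than xarray itself. `choose_decode_dtype` is a hypothetical name, not an xarray function: it copies the logic of `_choose_float_dtype` quoted above and adds one assumed branch that lets complex dtypes pass through unchanged. The sample array is made up for illustration.

```python
import warnings

import numpy as np


def choose_decode_dtype(dtype, has_offset):
    # Hypothetical variant of _choose_float_dtype (not part of xarray).
    dtype = np.dtype(dtype)
    # Assumed extra branch: keep complex dtypes instead of forcing float64.
    if np.issubdtype(dtype, np.complexfloating):
        return dtype.type
    # Same rules as the quoted function for floats and small integers.
    if dtype.itemsize <= 4 and np.issubdtype(dtype, np.floating):
        return np.float32
    if dtype.itemsize <= 2 and np.issubdtype(dtype, np.integer) and not has_offset:
        return np.float32
    return np.float64


# Illustrative complex data (made up for this sketch).
data = np.array([1 + 2j, 3 - 4j], dtype=np.complex128)

# Casting to float64, as the current decoding path does, drops the
# imaginary part and emits a ComplexWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    as_float = data.astype(np.float64)

warning_names = [w.category.__name__ for w in caught]

# With the hypothetical variant the complex values survive the cast.
kept = data.astype(choose_decode_dtype(data.dtype, has_offset=False))
```

Here `as_float` holds only the real parts, while `kept` is still `complex128`. Whether a pass-through branch like this is appropriate for every backend is an open question; the sketch only illustrates where the imaginary part is lost.
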

To test this, I modified the decode method (xarray/coding/variables.py:265):

```python
def decode(self, variable, name=None):
    dims, data, attrs, encoding = unpack_for_decoding(variable)

    if ""scale_factor"" in attrs or ""add_offset"" in attrs:
        scale_factor = pop_to(attrs, encoding, ""scale_factor"", name=name)
        add_offset = pop_to(attrs, encoding, ""add_offset"", name=name)
        # my change
        # dtype = _choose_float_dtype(data.dtype, ""add_offset"" in attrs)
        dtype = data.dtype
        if np.ndim(scale_factor) > 0:
            scale_factor = scale_factor.item()
        if np.ndim(add_offset) > 0:
            add_offset = add_offset.item()
        transform = partial(
            _scale_offset_decoding,
            scale_factor=scale_factor,
            add_offset=add_offset,
            dtype=dtype,
        )
        data = lazy_elemwise_func(data, transform, dtype)

    return Variable(dims, data, attrs, encoding)
```

and it works as I expected.

If there is a good reason to keep things as they are, could you explain how to deal with complex data without creating a new variable?

Thank you for your great work, xarray is awesome.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4655/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue