html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/2304#issuecomment-852069023,https://api.github.com/repos/pydata/xarray/issues/2304,852069023,MDEyOklzc3VlQ29tbWVudDg1MjA2OTAyMw==,18679148,2021-06-01T12:03:55Z,2021-06-07T20:48:00Z,NONE,"Dear all and thank you for your work on Xarray,
Following up on @magau's comment: I have a netCDF file with variables stored in different formats (float, short, byte).
When opening with open_mfdataset, the 'short' and 'byte' variables are converted to 'float64' (no scaling is applied, only some masking for the float data).
This is not a major issue for me, but it wastes a lot of memory for nothing.
Below is an example of the three formats (from ncdump -h):
```
short total_nobs(time, lat, lon) ;
total_nobs:long_name = ""Number of SSS in the time interval"" ;
total_nobs:valid_min = 0s ;
total_nobs:valid_max = 10000s ;
float pct_var(time, lat, lon) ;
pct_var:_FillValue = NaNf ;
pct_var:long_name = ""Percentage of SSS_variability that is expected to be not explained by the products"" ;
pct_var:units = ""%"" ;
pct_var:valid_min = 0. ;
pct_var:valid_max = 100. ;
byte sss_qc(time, lat, lon) ;
sss_qc:long_name = ""Sea Surface Salinity Quality, 0=Good; 1=Bad"" ;
sss_qc:valid_min = 0b ;
sss_qc:valid_max = 1b ;
```
And here is how they appear after opening as an xarray Dataset with open_mfdataset:
```
total_nobs (time, lat, lon) float64 dask.array
pct_var (time, lat, lon) float32 dask.array
sss_qc (time, lat, lon) float64 dask.array
```
> To clarify: why is it a problem for you?

It is a problem in my case because I would like to perform slicing operations on a dataset using longitude values from another dataset. This operation raises a ""KeyError: not all values found in index 'longitude'"", either because one dataset's longitude is float32 and the other's is float64, or because the float32 approximations are not exactly the same value in each dataset. I can work around this by assigning new float64 coords after reading, and it works, though it is a hassle considering I have to do this thousands of times. This situation also creates a problem when concatenating multiple netCDF files together (along the time dim in my case): the discrepancies between the float32 approximations, or the float32 vs float64 mismatch, will add new dimension values where they shouldn't exist.
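For reference, a minimal sketch of the kind of workaround I mean (file names and the 'longitude' coordinate name are placeholders for my actual data):
```
import numpy as np
import xarray as xr

# hypothetical file names, just for illustration
ds_a = xr.open_dataset('dataset_a.nc')
ds_b = xr.open_dataset('dataset_b.nc')

# cast both longitude coordinates to float64 so the values compare exactly
ds_a = ds_a.assign_coords(longitude=ds_a['longitude'].astype(np.float64))
ds_b = ds_b.assign_coords(longitude=ds_b['longitude'].astype(np.float64))

# selecting with the other dataset's longitude values no longer raises KeyError
subset = ds_a.sel(longitude=ds_b['longitude'].values)
```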
The second part of my problem comes with writing/reading netCDF files (maybe more related to @daoudjahdou's problem). I tried to change the data type to float64 for all my files, save them, and then perform what I need to do, but for some reason, even though the dtype is float64 for all my dimensions when writing the files (using default args), it is sometimes float32 and sometimes float64 when reading back the files (with default arg values) previously saved with float64 dtype. If using the default args, shouldn't the decoding make the dimension dtypes the same for all files I read?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,343659822
https://github.com/pydata/xarray/issues/2304#issuecomment-462592638,https://api.github.com/repos/pydata/xarray/issues/2304,462592638,MDEyOklzc3VlQ29tbWVudDQ2MjU5MjYzOA==,791145,2019-02-12T02:48:00Z,2019-02-12T02:48:00Z,NONE,"Hi everyone,
I've started using xarray recently, so I apologize if I'm saying something wrong...
I've also faced the issue reported here, so I tried to find some answers.
Unpacking netCDF files with respect to the NUG attributes (**scale_factor** and **add_offset**) is addressed by the CF Conventions, which are clear about which data type should be applied to the unpacked data: [cf-conventions-1.7/packed-data](http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/build/ch08.html#packed-data)
In this chapter you can read that: ""_If the scale_factor and add_offset attributes are of the same data type as the associated variable, the unpacked data is assumed to be of the same data type as the packed data. However, if the scale_factor and add_offset attributes are of a different data type from the variable (containing the packed data) then the unpacked data should match the type of these attributes_"".
In my opinion this should be the default behavior of the [xarray.decode_cf](http://xarray.pydata.org/en/stable/generated/xarray.decode_cf.html) function, which doesn't invalidate the idea of also allowing the unpacked data dtype to be forced.
However, neither the **CFScaleOffsetCoder** nor the **CFMaskCoder** de/encoder class seems to follow these CF directives, since the former doesn't look at the **scale_factor** or **add_offset** dtypes, and the latter also changes the unpacked data dtype (maybe because _nan_ values are used to replace the fill values).
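Just to illustrate how I read the convention, here is a rough sketch (file and variable names are hypothetical) of manual unpacking that keeps the dtype of the packing attributes:
```
import numpy as np
import xarray as xr

# open without automatic mask/scale so the packed dtype is preserved
ds = xr.open_dataset('packed_file.nc', mask_and_scale=False)
var = ds['packed_var']  # hypothetical packed variable

scale = var.attrs.get('scale_factor', 1)
offset = var.attrs.get('add_offset', 0)

# per CF: if the attributes differ in dtype from the packed variable,
# the unpacked data should take the dtype of the attributes
target_dtype = np.result_type(np.asarray(scale).dtype, np.asarray(offset).dtype)
unpacked = var.astype(target_dtype) * scale + offset
```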
Sorry for such an extensive comment without proposing any solutions...
Regards! :+1: ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,343659822
https://github.com/pydata/xarray/issues/2304#issuecomment-451984471,https://api.github.com/repos/pydata/xarray/issues/2304,451984471,MDEyOklzc3VlQ29tbWVudDQ1MTk4NDQ3MQ==,971382,2019-01-07T16:04:11Z,2019-01-07T16:04:11Z,NONE,"Hi,
thank you for your efforts in making xarray a great library.
As mentioned in the issue, the discussion moved to a PR aimed at making xr.open_dataset parameterizable.
This post is to ask for your recommendations regarding our PR.
The first approach would add a boolean parameter to the open_dataset function called ""force_promote"", False by default and thus not mandatory.
That parameter would then be passed down to the function maybe_promote in dtypes.py,
where we would do the following:
if dtype.itemsize <= 2 and not force_promote:
    dtype = np.float32
else:
    dtype = np.float64
The downside is that we somewhat pollute the code with a parameter that is only used in a specific case.
The second approach would check the value of an environment variable called ""XARRAY_FORCE_PROMOTE"" which, if it exists and is set to true, would force promoting the type to float64.
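To make the second approach concrete, here is a rough sketch (names are ours, not actual xarray code) of what the check could look like:
```
import os
import numpy as np

def maybe_promote_dtype(dtype):
    # hypothetical helper mirroring the check we propose for dtypes.maybe_promote
    force_promote = os.environ.get('XARRAY_FORCE_PROMOTE', '').lower() in ('1', 'true')
    if dtype.itemsize <= 2 and not force_promote:
        return np.dtype(np.float32)
    return np.dtype(np.float64)

# example: maybe_promote_dtype(np.dtype('int16')) -> float32 unless XARRAY_FORCE_PROMOTE is set
```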
Please tell us which approach best suits your vision of xarray.
Regards.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,343659822
https://github.com/pydata/xarray/issues/2304#issuecomment-412492776,https://api.github.com/repos/pydata/xarray/issues/2304,412492776,MDEyOklzc3VlQ29tbWVudDQxMjQ5Mjc3Ng==,971382,2018-08-13T11:51:15Z,2018-08-13T11:51:15Z,NONE,Any updates on this?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,343659822
https://github.com/pydata/xarray/issues/2304#issuecomment-410678021,https://api.github.com/repos/pydata/xarray/issues/2304,410678021,MDEyOklzc3VlQ29tbWVudDQxMDY3ODAyMQ==,971382,2018-08-06T11:31:00Z,2018-08-06T11:31:00Z,NONE,"As mentioned in the original issue, the modification is straightforward.
Any idea whether this could be integrated into xarray anytime soon?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,343659822
https://github.com/pydata/xarray/issues/2304#issuecomment-407092265,https://api.github.com/repos/pydata/xarray/issues/2304,407092265,MDEyOklzc3VlQ29tbWVudDQwNzA5MjI2NQ==,971382,2018-07-23T15:10:13Z,2018-07-23T15:10:13Z,NONE,"Thank you for your quick answer.
In our case we may compute standard deviations or sums of squares over long lists of values, and the accumulation of those small float32 rounding errors could create considerable differences.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,343659822