## Issue #7186: netCDF4: support byte strings as attribute values

Opened 2022-10-19, state: open, 2 comments.

### What is your issue?

When I have a string attribute containing special characters such as '°' or German umlauts (Ä, Ü, etc.), it is written to the file as type NC_STRING. Other string attributes without special characters are saved as NC_CHAR. This leads to problems when I subsequently want to open the file with NetCDF-Fortran, because it does not fully support NC_STRING. So my question is: is there a way to force xarray to write the string attribute as NC_CHAR?

**Example**

```python
import numpy as np
import xarray as xr

data = np.ones([12, 10])
ds = xr.Dataset(
    {"data": (["x", "y"], data)},
    coords={"x": np.arange(12), "y": np.arange(10)},
)
ds["x"].attrs["first_str"] = "foo"
ds["x"].attrs["second_str"] = "bar°"
ds["x"].attrs["third_str"] = "hää"
ds.to_netcdf("testds.nc")
```

The output of `ncdump -h` shows the different data type of the second and third attribute:

![grafik](https://user-images.githubusercontent.com/64479100/196659189-02d20f79-f3a3-4bd4-8609-faf49dad4236.png)
---

## Issue #7406: Grid mapping not saved in attributes when extra encoding is specified

Opened 2023-01-02, state: open, 1 comment.

### What is your issue?

When `decode_coords="all"` is used to load a NetCDF file, references to grid mapping variables are stored in the data variables' encoding. When the dataset is saved to disk again, the grid mapping name should be added back to the data variables' attributes to remain CF-compliant. This does not seem to happen when `to_netcdf()` is called with extra encoding info for the respective data variable. I will use the following NetCDF file to illustrate the issue:

```
netcdf TL_2021_in {
dimensions:
    x = 654 ;
    y = 866 ;
    time = 24 ;
variables:
    float transverse_mercator ;
        transverse_mercator:_FillValue = NaNf ;
        string transverse_mercator:crs_wkt = "PROJCRS[\"DHDN / 3-degree Gauss-Kruger zone 3\",BASEGEOGCRS[\"DHDN\",DATUM[\"Deutsches Hauptdreiecksnetz\",ELLIPSOID[\"Bessel 1841\",6377397.155,299.1528128,LENGTHUNIT[\"metre\",1]]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"degree\",0.0174532925199433]],ID[\"EPSG\",4314]],CONVERSION[\"3-degree Gauss-Kruger zone 3\",METHOD[\"Transverse Mercator\",ID[\"EPSG\",9807]],PARAMETER[\"Latitude of natural origin\",0,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8801]],PARAMETER[\"Longitude of natural origin\",9,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8802]],PARAMETER[\"Scale factor at natural origin\",1,SCALEUNIT[\"unity\",1],ID[\"EPSG\",8805]],PARAMETER[\"False easting\",3500000,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8806]],PARAMETER[\"False northing\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8807]]],CS[Cartesian,2],AXIS[\"northing (X)\",north,ORDER[1],LENGTHUNIT[\"metre\",1]],AXIS[\"easting (Y)\",east,ORDER[2],LENGTHUNIT[\"metre\",1]],USAGE[SCOPE[\"unknown\"],AREA[\"Germany - West-Germany - 7.5°E to 10.5°E\"],BBOX[47.27,7.5,55.09,10.51]],ID[\"EPSG\",31467]]" ;
        transverse_mercator:semi_major_axis = 6377397.f ;
        transverse_mercator:semi_minor_axis = 6356079.f ;
        transverse_mercator:inverse_flattening = 299.1528f ;
        transverse_mercator:reference_ellipsoid_name = "Bessel 1841" ;
        transverse_mercator:longitude_of_prime_meridian = 0.f ;
        transverse_mercator:prime_meridian_name = "Greenwich" ;
        transverse_mercator:geographic_crs_name = "DHDN" ;
        transverse_mercator:horizontal_datum_name = "Deutsches Hauptdreiecksnetz" ;
        transverse_mercator:projected_crs_name = "DHDN / 3-degree Gauss-Kruger zone 3" ;
        transverse_mercator:grid_mapping_name = "transverse_mercator" ;
        transverse_mercator:latitude_of_projection_origin = 0.f ;
        transverse_mercator:longitude_of_central_meridian = 9.f ;
        transverse_mercator:false_easting = 3500000.f ;
        transverse_mercator:false_northing = 0.f ;
        transverse_mercator:scale_factor_at_central_meridian = 1.f ;
        transverse_mercator:spatial_ref = "PROJCS[\"DHDN / 3-degree Gauss-Kruger zone 3\",GEOGCS[\"DHDN\",DATUM[\"Deutsches_Hauptdreiecksnetz\",SPHEROID[\"Bessel 1841\",6377397.155,299.1528128,AUTHORITY[\"EPSG\",\"7004\"]],AUTHORITY[\"EPSG\",\"6314\"]],PRIMEM[\"Greenwich\",0,AUTHORITY[\"EPSG\",\"8901\"]],UNIT[\"degree\",0.0174532925199433,AUTHORITY[\"EPSG\",\"9122\"]],AUTHORITY[\"EPSG\",\"4314\"]],PROJECTION[\"Transverse_Mercator\"],PARAMETER[\"latitude_of_origin\",0],PARAMETER[\"central_meridian\",9],PARAMETER[\"scale_factor\",1],PARAMETER[\"false_easting\",3500000],PARAMETER[\"false_northing\",0],UNIT[\"metre\",1,AUTHORITY[\"EPSG\",\"9001\"]],AUTHORITY[\"EPSG\",\"31467\"]]" ;
    int x(x) ;
        x:standard_name = "projection_x_coordinate" ;
        x:long_name = "x coordinate of projection" ;
        x:units = "m" ;
    int y(y) ;
        y:standard_name = "projection_y_coordinate" ;
        y:long_name = "y coordinate of projection" ;
        y:units = "m" ;
    int time(time) ;
        time:timezone = "UTC" ;
        time:units = "hours since 2020-12-31T23:00:00" ;
        time:calendar = "proleptic_gregorian" ;
    short tas(time, y, x) ;
        tas:_FillValue = -9999s ;
        tas:standard_name = "air_temperature" ;
        tas:long_name = "Near-Surface Air Temperature" ;
        tas:units = "degree_C" ;
        tas:cell_methods = "time: mean" ;
        tas:scale_factor = 0.1f ;
        tas:grid_mapping = "transverse_mercator" ;
}
```

With the code

```python
import xarray as xr

ds = xr.open_dataset("TL_2021_in.nc", decode_coords="all")
ds.to_netcdf("TL_2021_out.nc")
```

the file is written correctly to disk; the output of `ncdump` is identical for `TL_2021_in.nc` and `TL_2021_out.nc`. But when I save with extra encoding information for the data variable `tas`, e.g. with

```python
ds.to_netcdf("TL_2021_out.nc", encoding={"tas": {"dtype": "float32"}})
```

the `grid_mapping` attribute is missing from the file for this variable.

Edit: `scale_factor` is also missing, and `_FillValue` is now `NaNf` instead of `-9999s`. I suspect this is because the previous encoding might become invalid when the dataset is changed or encoded differently? But this makes it quite hard not to lose the grid mapping info, which should normally not be affected by the data type, fill value, or similar.

I use Python 3.10.5 and xarray 2022.3.0.