id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1880544087,I_kwDOAMm_X85wFtNX,8144,Add support for netcdf4 enum,16700639,closed,0,,,10,2023-09-04T15:51:45Z,2024-01-17T07:19:33Z,2024-01-17T07:19:33Z,CONTRIBUTOR,,,,"### Is your feature request related to a problem?

When a netcdf file contains [netcdf4 enums](https://unidata.github.io/netcdf4-python/#enum-data-type), xarray ignores the underlying enum type. The association between the values of the variable and their actual meaning is then lost.

MRE:

```py
import netCDF4 as nc
import xarray as xr

# -- Create dataset with an enum using the netcdf4 lib
ds = nc.Dataset(""mre.nc"", ""w"", format=""NETCDF4"")
cloud_type_enum = ds.createEnumType(int,""cloud_type"",{""clear"":0, ""cloudy"":1})
print(ds.enumtypes)
# {'cloud_type': : name = 'cloud_type', numpy dtype = int64, fields/values ={'clear': 0, 'cloudy': 1}}
ds.createVariable(""cloud"", cloud_type_enum)
ds[""cloud""][0] = 1
ds.close()

# -- Open dataset with xarray
xr_ds = xr.open_dataset(""./mre.nc"")
print(xr_ds.cloud)
# \n [1 values with dtype=int64]
# --> We get no metadata about the cloud_type enum that we created above
xr_ds.to_netcdf(""mre_xr.nc"")

# -- Open xarray outputted dataset with netCDF4 lib
# (enumtypes is a dict attribute, not a method)
print(nc.Dataset(""mre_xr.nc"", ""r"", format=""NETCDF4"").enumtypes)
# {}
# --> Empty dictionary: the enum we created is lost
```

If you know CF, enums could replace `flag_meanings` and `flag_values`, see [CF](http://cfconventions.org/cf-conventions/cf-conventions.html#flags). Enums are [not yet](https://github.com/cf-convention/discuss/issues/238) part of CF though.

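For comparison, here is a minimal sketch of the CF flag encoding mentioned above, in plain Python. The attribute names `flag_values` and `flag_meanings` come from CF, where `flag_meanings` is a single blank-separated string. `flags_to_enum` and `decode` are hypothetical helpers written for illustration only; they are not part of xarray or netCDF4.

```py
# Hypothetical helpers for illustration, not xarray or netCDF4 API.
def flags_to_enum(attrs):
    # CF stores flag_meanings as one blank-separated string and
    # flag_values as a sequence with one value per meaning.
    meanings = attrs['flag_meanings'].split()
    values = list(attrs['flag_values'])
    if len(meanings) != len(values):
        raise ValueError('flag_meanings and flag_values differ in length')
    # Same shape as the dict passed to createEnumType: name -> value.
    return dict(zip(meanings, values))


def decode(data, enum_dict):
    # Invert the name -> value mapping to look up names by value.
    names = {value: name for name, value in enum_dict.items()}
    return [names[v] for v in data]


attrs = {'flag_values': [0, 1], 'flag_meanings': 'clear cloudy'}
cloud_type = flags_to_enum(attrs)
decoded = decode([1, 0, 1], cloud_type)
```

The resulting `cloud_type` dict has the same name-to-value layout as the one given to `createEnumType` in the MRE, which is why the two encodings could in principle be converted into each other.
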
### Describe the solution you'd like

As far as I understand, to describe the enum we only need a dictionary that maps numbers (enum key) to strings (enum value) and a way to reference this dictionary in variables that are ""typed"" to this enum. Bear in mind that the dtype of the variable would still be a number; the enum type would be secondary metadata.

### Describe alternatives you've considered

Most people who produce data could get away with using `flag_meanings` and `flag_values` to describe their data in a way that is both CF-compliant and properly handled by xarray. For me, the only workaround at the moment is to use the netCDF4 library directly.

### Additional context

```py
nc.__version__ # 1.6.2
xr.__version__ # 2023.2.0
```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8144/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue