id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1194993450,I_kwDOAMm_X85HOicq,6448,Writing GDAL ZARR _CRS attribute not possible,15717873,closed,0,,,12,2022-04-06T18:37:34Z,2022-09-04T14:21:13Z,2022-06-03T18:48:47Z,NONE,,,,"### What is your issue?

Related to https://github.com/pydata/xarray/issues/6374

Writing a Zarr store that is compatible with [GDAL conventions](https://gdal.org/drivers/raster/zarr.html) using `xarray.Dataset.to_zarr` requires every data variable to have a `_CRS` attribute that contains the [Spatial Reference System encoding (SRS)](https://gdal.org/drivers/raster/zarr.html#srs-encoding). This `_CRS` attribute is itself a `dict` in which the SRS is encoded under at least one of these keys: `wkt`, `url`, `projjson`.

**Because attribute values can't be dictionaries during serialization, it does not seem possible to write GDAL-compatible Zarr stores using xarray.**

Example: let's assume we have a Dataset `ds` like this:

```
Dimensions: (Y: 180, X: 360)
Coordinates:
  * X (X) float64 -179.5 -178.5 -177.5 -176.5 ... 176.5 177.5 178.5 179.5
  * Y (Y) float64 89.5 88.5 87.5 86.5 85.5 ... -86.5 -87.5 -88.5 -89.5
Data variables:
    Band1 (Y, X) uint16 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0
    Band2 (Y, X) uint16 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0
    Band3 (Y, X) uint16 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
0 0 0 0 0 0 0 0 0 0 0 0
```

Let's also assume we want to encode the `_CRS` as `wkt`, like so:

``` python
wkt = 'GEOGCS[""WGS 84"",DATUM[""WGS_1984"",SPHEROID[""WGS 84"",6378137,298.257223563,AUTHORITY[""EPSG"",""7030""]],AUTHORITY[""EPSG"",""6326""]],PRIMEM[""Greenwich"",0,AUTHORITY[""EPSG"",""8901""]],UNIT[""degree"",0.0174532925199433,AUTHORITY[""EPSG"",""9122""]],AXIS[""Latitude"",NORTH],AXIS[""Longitude"",EAST],AUTHORITY[""EPSG"",""4326""]]'
```

(Encoding the `_CRS` in either of the other two formats results in the same problem in the end.)

Setting the attributes of each data variable:

``` python
attributes = {
    ""_ARRAY_DIMENSIONS"": ['Y', 'X'],
    ""_CRS"": {""wkt"": wkt},
    ""AREA_OR_POINT"": 'Area',
}
for data_var in ds.data_vars:
    ds[data_var].attrs = attributes
```

No problem so far; `ds.Band1.attrs` results in:

``` python
{
    ""_ARRAY_DIMENSIONS"": [""Y"", ""X""],
    ""_CRS"": {
        ""wkt"": 'GEOGCS[""WGS 84"",DATUM[""WGS_1984"",SPHEROID[""WGS 84"",6378137,298.257223563,AUTHORITY[""EPSG"",""7030""]],AUTHORITY[""EPSG"",""6326""]],PRIMEM[""Greenwich"",0,AUTHORITY[""EPSG"",""8901""]],UNIT[""degree"",0.0174532925199433,AUTHORITY[""EPSG"",""9122""]],AXIS[""Latitude"",NORTH],AXIS[""Longitude"",EAST],AUTHORITY[""EPSG"",""4326""]]'
    },
    ""AREA_OR_POINT"": ""Area"",
}
```

The problem occurs when writing the dataset using:

``` python
ds.to_zarr(""test.zarr"", consolidated=True)
```

```
TypeError: Invalid value for attr '_CRS': {'wkt': 'GEOGCS[""WGS 84"",DATUM[""WGS_1984"",SPHEROID[""WGS 84"",6378137,298.257223563,AUTHORITY[""EPSG"",""7030""]],AUTHORITY[""EPSG"",""6326""]],PRIMEM[""Greenwich"",0,AUTHORITY[""EPSG"",""8901""]],UNIT[""degree"",0.0174532925199433,AUTHORITY[""EPSG"",""9122""]],AXIS[""Latitude"",NORTH],AXIS[""Longitude"",EAST],AUTHORITY[""EPSG"",""4326""]]'}.
For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6448/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue