html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/6310#issuecomment-1069092987,https://api.github.com/repos/pydata/xarray/issues/6310,1069092987,IC_kwDOAMm_X84_uRB7,22566757,2022-03-16T12:50:50Z,2022-03-16T12:50:50Z,CONTRIBUTOR,That could work. Are you set up to check that? Either a full repository checkout or an XArray installation you can edit would do.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1154014066
https://github.com/pydata/xarray/issues/6310#issuecomment-1069084130,https://api.github.com/repos/pydata/xarray/issues/6310,1069084130,IC_kwDOAMm_X84_uO3i,22566757,2022-03-16T12:40:20Z,2022-03-16T12:40:20Z,CONTRIBUTOR,"Given this:
https://github.com/pydata/xarray/blob/613a8fda4f07181fbc41d6ff2296fec3726fd351/xarray/conventions.py#L782-L783
I think that should be working. This:
https://github.com/pydata/xarray/blob/613a8fda4f07181fbc41d6ff2296fec3726fd351/xarray/conventions.py#L770-L779
explicitly says it should, and is probably the part where things go wrong, but it should be going wrong the same way for `encoding` and `attrs`.
I think
https://github.com/pydata/xarray/blob/613a8fda4f07181fbc41d6ff2296fec3726fd351/xarray/conventions.py#L758-L768
may need to be split into two conditionals, one for `attrs` and one for `encoding`. I'm not sure how to get the `continue` behavior while allowing the code to work for both `attrs` and `encoding` without code duplication.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1154014066
https://github.com/pydata/xarray/issues/6310#issuecomment-1069064616,https://api.github.com/repos/pydata/xarray/issues/6310,1069064616,IC_kwDOAMm_X84_uKGo,22566757,2022-03-16T12:17:37Z,2022-03-16T12:17:37Z,CONTRIBUTOR,"I tried to find what the CF conventions say about including dimension coordinates (I'm using [the name from scitools-iris](https://scitools-iris.readthedocs.io/en/stable/userguide/iris_cubes.html#coordinates) rather than ""coordinate variable"" as used in the CF conventions to keep myself from getting confused) in the `coordinates` attribute. From what I can tell, the whole document is consistent with usually excluding dimension coordinates from the `coordinates` attribute. Most of the [Discrete Sampling Geometry examples in appendix H](https://cfconventions.org/cf-conventions/cf-conventions.html#appendix-examples-discrete-geometries) seem to include the dimension coordinates in the `coordinates` attributes, [though at least one example](https://cfconventions.org/cf-conventions/cf-conventions.html#_orthogonal_multidimensional_array_representation_of_time_series) leaves the dimension coordinates implied rather than explicit.
From what I remember, XArray is based on the netCDF data model, rather than the CF data model, so initializing `variable_coordinates[var_name] = set(variable.dims)` will do the wrong thing if the dataset doesn't set one or more of its dimension coordinates ([example H.2](https://cfconventions.org/cf-conventions/cf-conventions.html#_orthogonal_multidimensional_array_representation_of_time_series) has variables with dimensions `(""station"", ""time"")`, but no variable named `station`. [Section 4.5](https://cfconventions.org/cf-conventions/cf-conventions.html#discrete-axis) makes this practice explicit). You could work around this by leaving the initialization as it stands but dropping the `if coordinate_name not in variable.dims` condition on including `coordinate_name` as part of the `coordinates` attribute.
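As a rough illustration of that failure mode, here is a minimal sketch in plain Python. It is not XArray's code: the `variables` mapping and the `dims_without_variable` helper are made up for this comment, and the layout is only loosely modelled on example H.2.

```python
# Hypothetical stand-in for a dataset: variable name -> tuple of dimension names.
# There is a 'station' dimension but no variable named 'station'.
variables = {
    'time': ('time',),
    'lon': ('station',),
    'lat': ('station',),
    'humidity': ('station', 'time'),
}


def dims_without_variable(variables):
    '''Map each variable to the dimensions that have no same-named variable.'''
    missing = {}
    for name, dims in variables.items():
        absent = {dim for dim in dims if dim not in variables}
        if absent:
            missing[name] = absent
    return missing


missing = dims_without_variable(variables)
print(missing)
```

Every variable that uses the `station` dimension gets flagged, which is the case where seeding the set from `variable.dims` would list a name with no matching variable.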
> 1. Stick to the current logic which might be non-conformal with the CF conventions in case of ""Discrete Sampling Geometries"". However, users can manually fix this by setting the coordinates in encoding.
Based on this, I think using solution one from the previous post when writing a dataset will always be consistent with CF, but assuming that every netCDF file XArray reads into a dataset follows this pattern would be a problem. I suspect there are already tests for reading netCDF files with dimension coordinates included in `coordinates` attributes, but I haven't checked.
> 3. Implement a logic to recognize cases where a dataset is a ""Discrete Sampling Geometry"" and only then list the non-auxiliary coordinates in the variable attribute. This is a bit tricky, and I don't have the time to implement this, I'm afraid.
If you want to try solution three, almost all Discrete Sampling Geometry files [must have a global attribute called `featureType`](https://cfconventions.org/cf-conventions/cf-conventions.html#featureType). Since that attribute is recommended for all Discrete Sampling Geometry files, you could declare that the presence of that attribute defines a Discrete Sampling Geometry file for XArray. However, I don't see any place that says including dimension coordinates in the `coordinates` attribute is required, even for Discrete Sampling Geometry files, and I do see a few places that explicitly say dimension coordinates can be omitted from the `coordinates` attribute, even for Discrete Sampling Geometry files.
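If it helps, the check itself could be as small as the sketch below. This assumes the global attributes are available as a plain mapping (for an XArray dataset that would be `ds.attrs`); `looks_like_dsg` is a name I made up, not an existing XArray function, and the attribute values are invented examples.

```python
def looks_like_dsg(global_attrs):
    # Heuristic only: treat the presence of a global featureType attribute
    # as the marker of a Discrete Sampling Geometry file.
    return 'featureType' in global_attrs


# Hypothetical global attributes for two files.
dsg_attrs = {'Conventions': 'CF-1.8', 'featureType': 'timeSeries'}
grid_attrs = {'Conventions': 'CF-1.8'}

is_dsg = looks_like_dsg(dsg_attrs)
is_grid_dsg = looks_like_dsg(grid_attrs)
```

A Discrete Sampling Geometry file that omits the attribute would slip past this check, so it is a heuristic rather than a guarantee.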
The references from CF on whether dimension coordinates can be included in the `coordinates` attribute:
The fifth paragraph of [CF section five](https://cfconventions.org/cf-conventions/cf-conventions.html#coordinate-system) says:
> If the longitude, latitude, vertical or time coordinate is multi-valued, varies in only one dimension, and varies independently of other spatiotemporal coordinates, it is not permitted to store it as an auxiliary coordinate variable.
I *think* this is saying that if you can represent a coordinate using just one dimension, you shouldn't use two (that is, avoid using `np.tile(np.arange(10), (3, 1))` as a longitude coordinate). The other interpretation is that dimension coordinates must not be included in the `coordinates` attribute, which seems unlikely given that three lines later it says:
> Note that it is permissible, but optional, to list coordinate variables as well as auxiliary coordinate variables in the coordinates attribute.
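
To make the first reading concrete, here is a small NumPy sketch (the variable names are just for illustration). The tiled array repeats the same row for every value of the other dimension, so a one-dimensional coordinate carries the same information.

```python
import numpy as np

# Two-dimensional longitude built by repeating one row three times.
lon_2d = np.tile(np.arange(10), (3, 1))

# The first row alone is enough to describe it.
lon_1d = lon_2d[0]

# Broadcasting compares every row of lon_2d against lon_1d.
redundant = bool((lon_2d == lon_1d).all())
```

Here `redundant` comes out true, which is the situation where I read the paragraph as asking for the one-dimensional form.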
The first paragraph of the [section on Discrete sampling geometries](https://cfconventions.org/cf-conventions/cf-conventions.html#discrete-sampling-geometries):
> Every element of every feature must be unambiguously associated with its space and time coordinates and with the feature that contains it. The coordinates attribute must be attached to every data variable to indicate the spatiotemporal coordinate variables that are needed to geo-locate the data.
I think dimension coordinates are explicit enough to count as ""unambiguously associated"", even without inclusion in the `coordinates` attribute, since they share a name with one of the dimensions of the Discrete Sampling Geometry data variables. This seems to be made explicit in the fourth paragraph:
> Auxiliary coordinate variables containing the nominal and the precise positions should be listed in the relevant coordinates attributes of data variables. In orthogonal representations the nominal positions could be coordinate variables, which do not need to be listed in the coordinates attribute, rather than auxiliary coordinate variables.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,1154014066