issue_comments

3 rows where author_association = "CONTRIBUTOR", issue = 1154014066 and user = 22566757 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1069092987 https://github.com/pydata/xarray/issues/6310#issuecomment-1069092987 https://api.github.com/repos/pydata/xarray/issues/6310 IC_kwDOAMm_X84_uRB7 DWesl 22566757 2022-03-16T12:50:50Z 2022-03-16T12:50:50Z CONTRIBUTOR

That could work. Are you set up to check that? Either a full repository checkout or an XArray installation you can edit would do.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Only auxiliary coordinates are listed in nc variable attribute 1154014066
1069084130 https://github.com/pydata/xarray/issues/6310#issuecomment-1069084130 https://api.github.com/repos/pydata/xarray/issues/6310 IC_kwDOAMm_X84_uO3i DWesl 22566757 2022-03-16T12:40:20Z 2022-03-16T12:40:20Z CONTRIBUTOR

Given this: https://github.com/pydata/xarray/blob/613a8fda4f07181fbc41d6ff2296fec3726fd351/xarray/conventions.py#L782-L783 I think that should be working. This: https://github.com/pydata/xarray/blob/613a8fda4f07181fbc41d6ff2296fec3726fd351/xarray/conventions.py#L770-L779 explicitly says it should, and is probably where things go wrong, but it should be going wrong the same way for encoding and attrs.

I think https://github.com/pydata/xarray/blob/613a8fda4f07181fbc41d6ff2296fec3726fd351/xarray/conventions.py#L758-L768 may need to be split into two conditionals, one for attrs and one for encoding. I'm not sure how to get the continue behavior while allowing the code to work for both attrs and encoding without code duplication.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Only auxiliary coordinates are listed in nc variable attribute 1154014066
1069064616 https://github.com/pydata/xarray/issues/6310#issuecomment-1069064616 https://api.github.com/repos/pydata/xarray/issues/6310 IC_kwDOAMm_X84_uKGo DWesl 22566757 2022-03-16T12:17:37Z 2022-03-16T12:17:37Z CONTRIBUTOR

I tried to find what the CF conventions say about including dimension coordinates (I'm using the name from scitools-iris rather than "coordinate variable" as used in the CF conventions to keep myself from getting confused) in the coordinates attribute. From what I can tell, the whole document is consistent with usually excluding dimension coordinates from the coordinates attribute. Most of the Discrete Sampling Geometry examples in appendix H seem to include the dimension coordinates in the coordinates attributes, though at least one example leaves the dimension coordinates implied rather than explicit.

From what I remember, XArray is based on the netCDF data model rather than the CF data model, so initializing variable_coordinates[var_name] = set(variable.dims) will do the wrong thing if the dataset lacks a variable for one or more of its dimensions (example H.2 has variables with dimensions ("station", "time"), but no variable named station; section 4.5 makes this practice explicit). You could work around this by leaving the initialization as it stands but dropping the if coordinate_name not in variable.dims condition on including coordinate_name in the coordinates attribute.
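
The difference between these options can be sketched in plain Python. This is only an illustration under stated assumptions: coordinates_attr, its strategy argument, and the toy variable layout are hypothetical and are not XArray's implementation; only the names variable.dims and coordinate_name in the comments come from the discussion above.

```python
def coordinates_attr(var_dims, coord_dims, strategy):
    """Build a space-separated coordinates attribute for one data variable.

    Hypothetical helper for illustration; not xarray's implementation.
    var_dims:   tuple of the data variable's dimension names
    coord_dims: mapping of coordinate variable name -> its dimension names
    strategy:   "current", "seed_with_dims", or "drop_condition"
    """
    # Only "seed_with_dims" starts from the dimension names themselves.
    listed = set(var_dims) if strategy == "seed_with_dims" else set()
    for coord_name, dims in coord_dims.items():
        # A coordinate applies when all of its dimensions belong to the variable.
        if not set(dims) <= set(var_dims):
            continue
        # "drop_condition" removes this check, so dimension coordinates get listed.
        if strategy != "drop_condition" and coord_name in var_dims:
            continue
        listed.add(coord_name)
    return " ".join(sorted(listed))


# Toy layout: a data variable on ("station", "time") with no variable named "station".
humidity_dims = ("station", "time")
coords = {"time": ("time",), "lat": ("station",), "lon": ("station",)}

current = coordinates_attr(humidity_dims, coords, "current")
seeded = coordinates_attr(humidity_dims, coords, "seed_with_dims")
dropped = coordinates_attr(humidity_dims, coords, "drop_condition")
```

In this toy layout, seeding from the dimension names puts "station" in the attribute even though no such variable exists, while dropping the condition lists only names that correspond to real coordinate variables.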

  1. Stick to the current logic which might be non-conformal with the CF conventions in case of "Discrete Sampling Geometries". However, users can manually fix this by setting the coordinates in encoding.

Based on this, I think following solution one from the previous post when writing a dataset will always be consistent with CF, but assuming that the netCDF files XArray reads into datasets always follow this pattern would be a problem. I suspect there are already tests for reading netCDF files with dimension coordinates included in coordinates attributes, but I haven't checked.

  3. Implement a logic to recognize cases where a dataset is a "Discrete Sampling Geometry" and only then list the non-auxiliary coordinates in the variable attribute. This is a bit tricky, and I don't have the time to implement this, I'm afraid.

If you want to try solution three, almost all Discrete Sampling Geometry files must have a global attribute called featureType. Since that attribute is recommended for all Discrete Sampling Geometry files, you could declare that the presence of that attribute defines a Discrete Sampling Geometry file for XArray. However, I don't see any place that says including dimension coordinates in the coordinates attribute is required, even for Discrete Sampling Geometry files, and a few places that explicitly say dimension coordinates can be omitted from the coordinates attribute, even for Discrete Sampling Geometry files.
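
A minimal sketch of the featureType rule, assuming the dataset's global attributes are available as a plain mapping. The function name looks_like_dsg is hypothetical and is not an XArray API; a Discrete Sampling Geometry file that omits the attribute would not be recognized by this check.

```python
def looks_like_dsg(global_attrs):
    """Return True when the global attributes contain featureType.

    Hypothetical rule for illustration: the presence of the attribute
    is taken to define a Discrete Sampling Geometry file.
    """
    return "featureType" in global_attrs


dsg_result = looks_like_dsg({"featureType": "timeSeries", "title": "stations"})
grid_result = looks_like_dsg({"title": "gridded field"})
```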

The references from CF on whether dimension coordinates can be included in the coordinates attribute:

The fifth paragraph of CF section five says:

If the longitude, latitude, vertical or time coordinate is multi-valued, varies in only one dimension, and varies independently of other spatiotemporal coordinates, it is not permitted to store it as an auxiliary coordinate variable.

I think this is saying that if you can represent a coordinate using just one dimension, you shouldn't use two (that is, avoid using np.tile(np.arange(10), (3, 1)) as a longitude coordinate). The other interpretation is that dimension coordinates must not be included in the coordinates attribute, which seems unlikely given that three lines later it says:

Note that it is permissible, but optional, to list coordinate variables as well as auxiliary coordinate variables in the coordinates attribute.
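
To make the np.tile example above concrete, here is a small sketch of why the two-dimensional form adds nothing. The variable names are illustrative only.

```python
import numpy as np

# One-dimensional longitude: one value per point along a single dimension.
lon_1d = np.arange(10)

# The same values repeated along a second dimension, giving shape (3, 10).
lon_2d = np.tile(lon_1d, (3, 1))

# Every row is a copy of the first, so the extra dimension carries no information.
redundant = bool((lon_2d == lon_2d[0]).all())
```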

The first paragraph of the section on Discrete sampling geometries:

Every element of every feature must be unambiguously associated with its space and time coordinates and with the feature that contains it. The coordinates attribute must be attached to every data variable to indicate the spatiotemporal coordinate variables that are needed to geo-locate the data.

I think dimension coordinates are explicit enough to count as "unambiguously associated", even without inclusion in the coordinates attribute, since they share a name with one of the dimensions of the Discrete Sampling Geometry data variables. This seems to be made explicit in the fourth paragraph:

Auxiliary coordinate variables containing the nominal and the precise positions should be listed in the relevant coordinates attributes of data variables. In orthogonal representations the nominal positions could be coordinate variables, which do not need to be listed in the coordinates attribute, rather than auxiliary coordinate variables.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Only auxiliary coordinates are listed in nc variable attribute 1154014066

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);