issues: 1825198736

id: 1825198736
node_id: I_kwDOAMm_X85sylKQ
number: 8026
title: non-deterministic ordering within "coordinates" attribute written by .to_netcdf
user: 3383837
state: closed
locked: 0
comments: 2
created_at: 2023-07-27T20:50:20Z
updated_at: 2023-08-03T16:27:29Z
closed_at: 2023-08-03T16:27:29Z
author_association: CONTRIBUTOR
assignee, milestone, active_lock_reason, draft, pull_request: (empty)

What is your issue?

Under the assumption that deterministic output is preferred whenever feasible, I'd like to point out that the variable names written into "coordinates" attributes with .to_netcdf are not ordered deterministically. For pipelines that depend on file hashes to validate dependencies, this can be a real headache.

Consider the dataset xarray.Dataset({"x": ((), 0)}, coords={"a": 0, "b": 0}). The NetCDF file that xarray writes will include either

    variables:
        int64 x ;
            x:coordinates = "a b" ;
        int64 a ;
        int64 b ;

or

    variables:
        int64 x ;
            x:coordinates = "b a" ;
        int64 a ;
        int64 b ;
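The run-to-run variation can be illustrated without xarray. The sketch below is not xarray's code: it assumes the names end up in a plain Python set of strings and joins that set in several fresh interpreters, each with a different PYTHONHASHSEED. The snippet string and the helper name joined_under_seed are made up for illustration.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for "join a set of coordinate names into one string".
SNIPPET = 'print(" ".join({"a", "b"}))'


def joined_under_seed(seed):
    """Run SNIPPET in a fresh interpreter with a fixed string-hash seed."""
    env = {**os.environ, "PYTHONHASHSEED": str(seed)}
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Collect the distinct orderings observed across a handful of seeds.
orders = {joined_under_seed(seed) for seed in range(8)}
print(orders)
```

Which orderings show up depends on the interpreter and the seeds tried. Any run that yields both "a b" and "b a" shows how the attribute can differ between otherwise identical writes.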

My review of _encode_coordinates leads me to think the behavior results from collecting names in a set, whose iteration order is not guaranteed. I'd be happy to offer a PR to make the coordinates attribute deterministic. I am not aware of a CF convention that prescribes an ordering, but I would research this and follow one if it exists. If not, I would probably sort at L701 and L722.
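A minimal sketch of the proposed fix, assuming the attribute value is built by joining a set of names with spaces. The helper coordinates_attribute is hypothetical and is not part of xarray; the actual change would apply the same sorting inside _encode_coordinates.

```python
def coordinates_attribute(names):
    """Hypothetical helper: join coordinate names in sorted order.

    Sorting before joining removes the dependence on set iteration order,
    so the same names always produce the same attribute string.
    """
    return " ".join(sorted(str(name) for name in names))


# The input is a set, so its iteration order may vary between runs,
# but the sorted result does not.
print(coordinates_attribute({"b", "a"}))
```

Lexicographic sorting is only one possible choice. Any stable rule, such as the order in which the coordinates appear in the dataset, would also make the file hash reproducible.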

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8026/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
