
issue_comments: 1546939363


html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/pull/7840#issuecomment-1546939363 https://api.github.com/repos/pydata/xarray/issues/7840 1546939363 IC_kwDOAMm_X85cNGvj 43316012 2023-05-14T16:29:13Z 2023-05-14T16:29:13Z COLLABORATOR

I agree that there is a lot of terminology that is somewhat similar and/or overlapping.

In math you have the basis of a vector space, which you sometimes call coordinates or axes when plotting in a coordinate system. You also often use the phrase "the x-coordinate of point p is ...", so I understand why naming an axis "coordinate" might sound reasonable.

However, in xarray (or numerics in general) you often deal with data that is not aligned to any axis of a given coordinate system. Consider the following example: you have a time series of points that might be scattered randomly in your coordinate system. You can assign each point an x and a y coordinate. In xarray you would call the dimension of the data e.g. "time". You then need two additional time series of data points, one for the x coordinates and one for the y coordinates.

In xarray this would look something like this:

```python
import xarray as xr

da = xr.DataArray(
    [5, 6, 7],
    dims="time",
    coords={
        "x": ("time", [1, 2, 3]),
        "y": ("time", [9, 8, 7]),
        "time": [0, 1, 2],
    },
)
```

You can see that in this example we cannot easily define an x-y coordinate system, since the data points are not on a grid. Still, we can assign each data point an x and a y coordinate.

Finding a naming convention that fits both randomly scattered data points and lattices that are aligned with the coordinate system is not trivial.

That's why we chose to go with the following:

- dimension: the name of an axis of an nd-array. This might be a "real" axis that has a real-world equivalent, or something as trivial as "the order in which the data was acquired". In the example this was "time".
- coordinate: auxiliary data that can be used to identify the data values. In the example this was "x" and "y". (Basically it is a short name for "coordinate variable".)
- dimension coordinate: a coordinate that assigns values to the dimension directly, if possible. This is the case when the name of a coordinate is the same as the name of a dimension. In the example this was "time" with the timestamps 0, 1 and 2 (these should probably be real timestamps).
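To make the three terms concrete, here is a minimal sketch that rebuilds the example array and inspects it. The variable names `dim_coords`, `non_dim_coords` and `swapped` are only illustrative, and the exact behaviour may differ slightly between xarray versions.

```python
import xarray as xr

# Same example as above: three values along a single dimension "time".
da = xr.DataArray(
    [5, 6, 7],
    dims="time",
    coords={
        "x": ("time", [1, 2, 3]),
        "y": ("time", [9, 8, 7]),
        "time": [0, 1, 2],
    },
)

# dimension: the name of the array's only axis
print(da.dims)

# coordinate: every coordinate variable attached to the array
print(sorted(da.coords))

# dimension coordinate: a coordinate whose name matches a dimension name
dim_coords = sorted(name for name in da.coords if name in da.dims)
non_dim_coords = sorted(name for name in da.coords if name not in da.dims)
print(dim_coords, non_dim_coords)

# A dimension coordinate is backed by an index, so label-based selection works.
print(da.sel(time=1).item())

# Swapping the dimension for "x" turns "x" into the dimension coordinate;
# "time" stays attached as an ordinary (non-dimension) coordinate.
swapped = da.swap_dims({"time": "x"})
print(swapped.dims, swapped.sel(x=2).item())
```

Here "x" and "y" only describe the points along "time"; they do not define axes of the array until one of them is explicitly made a dimension, for example with `swap_dims`.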

Feel free to propose changes to the documentation so that newcomers find it easier to understand the terminology.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  1707774178