issues: 39264845

id node_id number title user state locked assignee milestone comments created_at updated_at closed_at author_association active_lock_reason draft pull_request body reactions performed_via_github_app state_reason repo type
39264845 MDU6SXNzdWUzOTI2NDg0NQ== 197 We need some way to identify non-index coordinates 1217238 closed 0   740776 3 2014-08-01T06:36:13Z 2014-12-19T07:16:14Z 2014-09-10T06:07:15Z MEMBER      

I am currently working with station data. In order to keep around latitude and longitude (I use station_id as the coordinate variable), I need to resort to some ridiculous contortions:

```python
residuals = results['y'] - observations['y']
residuals.dataset.update(results.select_vars('longitude', 'latitude'))
```

There has got to be an easier way to handle this.


I don't want to revert to some primitive guessing strategy (e.g., looking at attrs['coordinates']) to figure out which extra variables can be safely kept after mathematical operations.

Another approach would be to try to preserve everything in the dataset linked to a DataArray when doing math. But I don't really like this option, either, because it would lead to serious propagation of "linked dataset variables", which are rather surprising and can have unexpected performance consequences (though at least they appear in repr as of #128).


This leads me to a final alternative: restructuring xray's internals to provide first-class support for coordinates that are not indexes. For example, this would mean promoting ds.coordinates to an actual dictionary stored on a dataset, and allowing it to hold objects that aren't an xray.Coordinate.

Making this change transparent to users would likely require changing the Dataset signature to something like Dataset(variables, coords, attrs). We might (yet again) want to rename Coordinate, to something like IndexVar, to emphasize the notion of "index" and "non-index" coordinates. And we could get rid of the terrible "linked dataset variable".
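To make the proposed layout a bit more concrete, here is a minimal sketch of what a Dataset(variables, coords, attrs) container could look like. This is not xray's API: `SketchDataset`, `index_coords` and `non_index_coords` are hypothetical names, variables are modeled as plain `(dims, values)` tuples, and the rule "a coordinate is an index when it is 1D and named after its own dimension" is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

# Stand-in for an array type: (dimension names, values).
Variable = Tuple[Tuple[str, ...], list]


@dataclass
class SketchDataset:
    """Hypothetical container; not xray's actual Dataset."""

    variables: Dict[str, Variable]
    coords: Dict[str, Variable] = field(default_factory=dict)
    attrs: Dict[str, Any] = field(default_factory=dict)

    def index_coords(self) -> Dict[str, Variable]:
        # Assumption: an index coordinate is 1D and shares its dimension's name.
        return {k: v for k, v in self.coords.items() if v[0] == (k,)}

    def non_index_coords(self) -> Dict[str, Variable]:
        # Everything else in coords is carried along but is not an index.
        return {k: v for k, v in self.coords.items() if v[0] != (k,)}


ds = SketchDataset(
    variables={"y": (("station_id",), [1.5, 2.0, 0.5])},
    coords={
        "station_id": (("station_id",), ["a", "b", "c"]),
        "longitude": (("station_id",), [-122.4, -74.0, 2.35]),
        "latitude": (("station_id",), [37.8, 40.7, 48.9]),
    },
)

print(sorted(ds.index_coords()))      # ['station_id']
print(sorted(ds.non_index_coords()))  # ['latitude', 'longitude']
```

With coords stored explicitly, longitude and latitude are known to be coordinates without guessing from attrs, even though station_id is the only index.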

Once we have non-index coordinates, we need a policy for what to do when adding two DataArrays for which they differ. I think my preferred approach is to not enforce that they be found on both arrays, but to raise an exception if there are any conflicting values -- unless they are scalar valued, in which case they could be dropped, turned into a tuple, or given different names. (Otherwise there would be cases where you couldn't calculate x[1] - x[0].)
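A rough sketch of that policy, using plain dicts in place of real coordinate objects. The function name `merge_non_index_coords` and the helper `is_scalar` are hypothetical, and of the three options for conflicting scalars this sketch arbitrarily picks "drop":

```python
from numbers import Number


def is_scalar(value):
    # Assumption: numbers and strings count as scalar coordinate values.
    return isinstance(value, (Number, str))


def merge_non_index_coords(left, right):
    """Combine non-index coordinates from the two operands of a binary op."""
    merged = {}
    for name in sorted(set(left) | set(right)):
        if name not in right:
            merged[name] = left[name]   # only on the left operand: keep
        elif name not in left:
            merged[name] = right[name]  # only on the right operand: keep
        elif left[name] == right[name]:
            merged[name] = left[name]   # present on both and identical: keep
        elif is_scalar(left[name]) and is_scalar(right[name]):
            continue                    # conflicting scalars: drop
        else:
            raise ValueError(f"conflicting values for coordinate {name!r}")
    return merged


# x[1] - x[0]: each operand carries a different scalar station_id.
x0 = {"station_id": "a", "elevation": 10.0}
x1 = {"station_id": "b", "elevation": 10.0}
print(merge_non_index_coords(x1, x0))  # {'elevation': 10.0}
```

Array-valued coordinates that disagree, such as two different longitude lists, would raise ValueError under this sketch rather than being silently dropped.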

We might even be able to keep around multi-dimensional coordinates this way (e.g., 2D lat/lon arrays for projected data)... I'll need to think about that one some more.

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/197/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  completed 13221727 issue
