issue_comments: 326776669


html_url: https://github.com/pydata/xarray/pull/1473#issuecomment-326776669
issue_url: https://api.github.com/repos/pydata/xarray/issues/1473
id: 326776669
node_id: MDEyOklzc3VlQ29tbWVudDMyNjc3NjY2OQ==
user: 1217238
created_at: 2017-09-03T00:27:05Z
updated_at: 2017-09-03T00:27:05Z
author_association: MEMBER
body:

> API question. Should .sel(x=[0.0, 1.0], method='nearest', tolerance=0.1) work exactly the same as .reindex(x=[0.0, 1.0], method='nearest', tolerance=0.1)?

There are two key differences between sel and reindex:
- reindex inserts NaN when there is not a match, whereas sel raises an error
- for inexact indexing (e.g., method='nearest'), the result of reindex copies the index from the indexers, whereas the result of sel copies the index from the object being indexed
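A minimal sketch of these two differences, assuming a reasonably recent xarray release; the array `da` and the label values are made up for illustration:

```python
import numpy as np
import xarray as xr

# Hypothetical example data: three values indexed by x = 0.0, 1.0, 2.0.
da = xr.DataArray([10.0, 20.0, 30.0], dims="x", coords={"x": [0.0, 1.0, 2.0]})

# Inexact labels that each fall within the tolerance of an existing label.
near_sel = da.sel(x=[0.04, 0.96], method="nearest", tolerance=0.1)
near_reindex = da.reindex(x=[0.04, 0.96], method="nearest", tolerance=0.1)

# sel keeps the index of the object being indexed,
# reindex takes its index from the indexer.
print(near_sel["x"].values)
print(near_reindex["x"].values)

# A label with no match within the tolerance: reindex fills in NaN ...
filled = da.reindex(x=[0.0, 5.0], method="nearest", tolerance=0.1)

# ... whereas sel raises a KeyError.
sel_raised = False
try:
    da.sel(x=[0.0, 5.0], method="nearest", tolerance=0.1)
except KeyError:
    sel_raised = True
```

Both calls return the same data values for the matching labels; only the resulting x coordinate and the handling of the unmatched label differ.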

> My preference is to make sel work as reindex currently does and to gradually deprecate the reindex method, because the difference between these two methods is now very small.

I'm not sure this is desirable, because it's nice to have a way to do indexing that is guaranteed not to introduce missing values.

Currently, reindex only supports indexing with 1D arguments, and the values of those arguments are taken to be the new index coordinates. I don't know quite what it would mean to reindex with a multi-dimensional indexer -- I guess the result would gain multi-dimensional coordinate indexes? Also, when reindexing like ds.reindex(x=indexer), which coordinates take precedence on the result for x -- indexer.coords['x'] or indexer.values?

I do think there is a valid concern about consistency between sel() and reindex(). Right now, coordinates and dimensions on arguments to reindex are entirely ignored. If we are ever going to allow reindexing with multi-dimensional arguments (and broadcasting), we should consider raising an error or warning now when passed indexers with inconsistent dimensions/coordinates.

From a practical perspective, writing a version of vectorized indexing that fills in NaN could be non-trivial. To enable this under the hood, I think we would need a version of ndarray.__getitem__ that treats a sentinel index (e.g., -1) as a request to fill in NaN instead of indexing. I guess this could probably be done with a combination of NumPy's advanced indexing plus a mask.
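One way the "advanced indexing plus a mask" idea could look is sketched below. The helper name `take_with_fill` and its parameters are hypothetical, not part of NumPy or xarray, and the sketch assumes a non-empty array indexed along its first axis with a floating-point result:

```python
import numpy as np


def take_with_fill(values, indexer, fill_value=np.nan, sentinel=-1):
    """Index `values` along axis 0, filling positions marked by `sentinel`.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values)
    indexer = np.asarray(indexer)

    # Mask of positions that should be filled rather than indexed.
    missing = indexer == sentinel

    # Swap the sentinel for a valid position (0) so that plain advanced
    # indexing succeeds; those entries are overwritten below.
    safe_indexer = np.where(missing, 0, indexer)

    # Cast to float so the result can hold NaN even for integer input.
    result = values[safe_indexer].astype(float)
    result[missing] = fill_value
    return result


# The indexer may be multi-dimensional; the result takes its shape.
out = take_with_fill(np.array([10.0, 20.0, 30.0]), np.array([[0, -1], [2, 1]]))
print(out)
```

Because -1 is also a valid negative index in NumPy, a real implementation would have to decide whether negative positions are allowed in the indexer at all.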

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 241578773