issue_comments: 1324489293

html_url: https://github.com/pydata/xarray/issues/7045#issuecomment-1324489293
issue_url: https://api.github.com/repos/pydata/xarray/issues/7045
id: 1324489293
node_id: IC_kwDOAMm_X85O8hpN
user: 23484003
created_at: 2022-11-23T03:05:50Z
updated_at: 2022-11-23T03:06:57Z
author_association: NONE
body:

> IMO nearly all the complication and confusion emerge from the mixed concept of a dimension coordinate in the Xarray data model.

My take: the main confusion comes from trying to support a relational-database-like data model (where inner/outer joins make sense because values are discrete/categorical) AND a multi-dimensional array model for physical sciences (where values are typically floating-point, exact alignment is required, and interpolation is used when alignment is inexact). As a physical sciences guy, I basically never use the database-like behavior. It only serves to silence alignment errors so that the fallout happens downstream (NaNs from outer joins, empty arrays from inner joins), which makes it harder to debug. TIL I can just call `xarray.set_options(arithmetic_join='exact')` and get what I wanted all along.
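A minimal sketch of the difference between the join modes, assuming two small made-up arrays whose `x` coordinates only partially overlap (the values and coordinates are illustrative, not from the issue):

```python
import numpy as np
import xarray as xr

# Two arrays sampled on x grids that overlap only partially (made-up data).
a = xr.DataArray([1.0, 2.0, 3.0], dims=["x"], coords={"x": [0.0, 1.0, 2.0]})
b = xr.DataArray([10.0, 20.0, 30.0], dims=["x"], coords={"x": [1.0, 2.0, 3.0]})

# Default arithmetic_join="inner": the result is silently cut down to the overlap.
inner = a + b

# "outer" keeps the union of labels and fills the gaps with NaN.
with xr.set_options(arithmetic_join="outer"):
    outer = a + b

# "exact" refuses to align mismatched coordinates and raises instead.
with xr.set_options(arithmetic_join="exact"):
    try:
        a + b
        raised = False
    except ValueError:
        raised = True

print(inner.x.values, outer.values, raised)
```

With `'exact'`, the mismatch surfaces at the arithmetic operation itself rather than later as a shorter or NaN-padded result.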

> Why can't we use loc/sel with a non-dimension (non-index) coord?

What happens if I have Cartesian x/y dimensions plus r/theta cylindrical coordinates defined on the x/y grid, and I select some range in r? At that point it's not slicing an array anymore; it's more like a relational database query. The thing you get back isn't an array, because not all i,j combinations are valid.
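To make the point concrete, here is a rough NumPy sketch, assuming a hypothetical 5 x 5 Cartesian grid and a placeholder data field. Selecting a range in r picks out a ring of points, so the result is either a flat list of samples or the original grid padded with NaN:

```python
import numpy as np

# Rectangular Cartesian grid: 5 x 5 samples on [-2, 2] (hypothetical).
x = np.linspace(-2.0, 2.0, 5)
y = np.linspace(-2.0, 2.0, 5)
X, Y = np.meshgrid(x, y, indexing="ij")

# r is a 2-D, non-dimension coordinate defined on (x, y).
R = np.hypot(X, Y)
data = X + 10.0 * Y  # placeholder field values

# "Select some range in r": the matching points form a ring, not a rectangle.
mask = (R >= 1.0) & (R <= 1.5)

# Boolean selection returns a flat 1-D list of matching samples...
selected = data[mask]

# ...whereas keeping the grid means padding the rejected cells with NaN.
masked = np.where(mask, data, np.nan)

print(selected.shape, masked.shape, int(np.isnan(masked).sum()))
```

Neither result is a dense rectangular array over a subset of x and y, which is what an ordinary slice along a dimension would give.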

> confusion emerge[s] from the mixed concept of a dimension coordinate

From my perspective, the dimensions are special coordinates: the ones along which the arrays happen to be sampled on a rectangular grid. It's not confusing to me, but maybe that's because of my physical sciences background and use cases. I suppose one could in principle have an array whose coordinates don't align with any particular axis, but it seems improbable.

> What do you think of making the default FloatIndex use a reasonable (hard to define!) rtol for comparisons?

IMO this is asking for weird bugs. In my work I either expect exact alignment, or I want to interpolate. I never want to ignore a mismatch because it's basically just sweeping an error under the rug. In fact, I'd really just like to test that all the dimension coordinates are the same objects, although Python's semantics don't really work with that.
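One way such bugs could arise, offered as an illustration rather than as the specific failure the comment has in mind: tolerant equality is not transitive. The sketch below uses only the standard library; the `rtol` value and the three sample coordinates are hypothetical.

```python
import math

# Exact comparison is strict: rounding alone is enough to break it.
exact_match = (0.1 + 0.2 == 0.3)

rtol = 1e-3  # hypothetical tolerance, chosen only for illustration
a, b, c = 1.0, 1.0008, 1.0016  # hypothetical coordinate values

# Pairwise tolerant comparisons between neighbouring values succeed...
ab = math.isclose(a, b, rel_tol=rtol)
bc = math.isclose(b, c, rel_tol=rtol)

# ...but the two ends of the chain are not "equal" to each other.
ac = math.isclose(a, c, rel_tol=rtol)

print(exact_match, ab, bc, ac)
```

So whether three arrays "align" under a tolerance can depend on which pairs are compared, whereas exact comparison or explicit interpolation has no such ambiguity.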

> imagine cases where a coordinate is defined in separate units.

Getting this right would be really powerful.

reactions:
{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: 1376109308