home / github / issue_comments

Menu
  • Search all tables
  • GraphQL API

issue_comments: 1042664227

This data as json

html_url issue_url id node_id user created_at updated_at author_association body reactions performed_via_github_app issue
https://github.com/pydata/xarray/issues/4118#issuecomment-1042664227 https://api.github.com/repos/pydata/xarray/issues/4118 1042664227 IC_kwDOAMm_X84-Jcsj 226037 2022-02-17T07:52:17Z 2022-02-17T07:53:13Z MEMBER

@TomNicholas I also have a few comments on the comparison:

  • Option (1) - Each group is a Dataset

  • Model maps more directly onto netCDF (though still not exactly, because netCDF has dimensions as separate objects)

This is only true for flat netCDF files, once you introduce groups in a netCDF AND accept CF conventions the DataGroup approach can map 100% of the files, while the DataTree approach fails on a (admittedly small) class of them.

  • Enforcing consistency between variables guarantees certain operations are always well-defined (in particular selection via an integer index like in .isel).
  • Guarantees that all valid operations on a Dataset are also valid operations on a single group of a DataTree - so API can be essentially identical to Dataset.

Both points are only true for the DataArray in a single group, once you broadcast any operation to subgroups the two implementations would share the same limitations (dimensions in subgroups can be inconsistent in both cases).

In my opinion the advantage for the DataTree is minimal.

  • Metadata (i.e. .attrs) are arguably most useful when set at this level

The two approach are identical in this respect, group attributes are mapped in the same way to DataTree and DataGroup

I share your views on all other points.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  628719058
Powered by Datasette · Queries took 0.85ms · About: xarray-datasette