home / github

Menu
  • GraphQL API
  • Search all tables

issue_comments

Table actions
  • GraphQL API for issue_comments

3 rows where issue = 628719058 and user = 226037 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • alexamici · 3 ✖

issue 1

  • Feature Request: Hierarchical storage and processing in xarray · 3 ✖

author_association 1

  • MEMBER 3
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1042753800 https://github.com/pydata/xarray/issues/4118#issuecomment-1042753800 https://api.github.com/repos/pydata/xarray/issues/4118 IC_kwDOAMm_X84-JykI alexamici 226037 2022-02-17T09:41:29Z 2022-02-17T09:53:55Z MEMBER

@kmuehlbauer in the representation I use the fully qualified name for the dimension / coordinate, but the corresponding DataArray will use the basename, e.g. both array will have lat as a coordinate. Sorry for the confusion, I need to add more context to the README.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature Request: Hierarchical storage and processing in xarray 628719058
1042656377 https://github.com/pydata/xarray/issues/4118#issuecomment-1042656377 https://api.github.com/repos/pydata/xarray/issues/4118 IC_kwDOAMm_X84-Jax5 alexamici 226037 2022-02-17T07:39:15Z 2022-02-17T08:17:51Z MEMBER

@TomNicholas (cc @mraspaud)

Do you have use cases which one of these designs could handle but the other couldn't?

The two main classes of on-disk formats that, I know of, which cannot be always represented in the "group is a Dataset" approach are: - in netCDF following the CF conventions for groups, it is legal for an array to refer to a dimension or a coordinate in a different group and so arrays in the same group may have dimensions with the same name, but different size / coordinate values, (this was the orginal motivation to explore the DataGroup approach) - the current spec for the Next-generation file formats (NGFF) for bio-imaging has all scales of the same 5D data in the same group. (cc @joshmoore)

I don't have an example at hand, but my impression is that satellite products that use HDF5 file format also place arrays with inconsistent dimensions / coordinates in the same group.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature Request: Hierarchical storage and processing in xarray 628719058
1042664227 https://github.com/pydata/xarray/issues/4118#issuecomment-1042664227 https://api.github.com/repos/pydata/xarray/issues/4118 IC_kwDOAMm_X84-Jcsj alexamici 226037 2022-02-17T07:52:17Z 2022-02-17T07:53:13Z MEMBER

@TomNicholas I also have a few comments on the comparison:

  • Option (1) - Each group is a Dataset

  • Model maps more directly onto netCDF (though still not exactly, because netCDF has dimensions as separate objects)

This is only true for flat netCDF files, once you introduce groups in a netCDF AND accept CF conventions the DataGroup approach can map 100% of the files, while the DataTree approach fails on a (admittedly small) class of them.

  • Enforcing consistency between variables guarantees certain operations are always well-defined (in particular selection via an integer index like in .isel).
  • Guarantees that all valid operations on a Dataset are also valid operations on a single group of a DataTree - so API can be essentially identical to Dataset.

Both points are only true for the DataArray in a single group, once you broadcast any operation to subgroups the two implementations would share the same limitations (dimensions in subgroups can be inconsistent in both cases).

In my opinion the advantage for the DataTree is minimal.

  • Metadata (i.e. .attrs) are arguably most useful when set at this level

The two approach are identical in this respect, group attributes are mapped in the same way to DataTree and DataGroup

I share your views on all other points.

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  Feature Request: Hierarchical storage and processing in xarray 628719058

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 1044.534ms · About: xarray-datasette