issue_comments


10 rows where issue = 302077805 sorted by updated_at descending


user 3

  • benbovy 6
  • shoyer 3
  • stale[bot] 1

author_association 2

  • MEMBER 9
  • NONE 1

issue 1

  • Extend xarray with custom "coordinate wrappers" · 10 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1250738238 https://github.com/pydata/xarray/issues/1961#issuecomment-1250738238 https://api.github.com/repos/pydata/xarray/issues/1961 IC_kwDOAMm_X85KjMA- benbovy 4160723 2022-09-19T08:47:44Z 2022-09-19T08:47:44Z MEMBER

I think we can close this issue. The flexible index refactor now provides a nice framework for the suggestions made here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extend xarray with custom "coordinate wrappers" 302077805
581190919 https://github.com/pydata/xarray/issues/1961#issuecomment-581190919 https://api.github.com/repos/pydata/xarray/issues/1961 MDEyOklzc3VlQ29tbWVudDU4MTE5MDkxOQ== stale[bot] 26384082 2020-02-02T23:41:53Z 2020-02-02T23:41:53Z NONE

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extend xarray with custom "coordinate wrappers" 302077805
370274074 https://github.com/pydata/xarray/issues/1961#issuecomment-370274074 https://api.github.com/repos/pydata/xarray/issues/1961 MDEyOklzc3VlQ29tbWVudDM3MDI3NDA3NA== benbovy 4160723 2018-03-04T23:20:55Z 2018-03-04T23:20:55Z MEMBER

It is just that the name "Index" feels a bit wrong to me in this case, and also that xgcm.Axis (and potentially other wrappers) can do things very differently from Index classes, which may be confusing.

That said, as real indexes cover most of the use cases, I'd be fine if we keep calling these indexes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extend xarray with custom "coordinate wrappers" 302077805
370273853 https://github.com/pydata/xarray/issues/1961#issuecomment-370273853 https://api.github.com/repos/pydata/xarray/issues/1961 MDEyOklzc3VlQ29tbWVudDM3MDI3Mzg1Mw== benbovy 4160723 2018-03-04T23:17:56Z 2018-03-04T23:17:56Z MEMBER

> Letting third-party libraries add their own repr categories seems like possibly going too far.

Yes you're probably right.

I can imagine in the example above that Dataset.xgcm.grid_axes returns a subset of a flat collection, for convenience.
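A minimal sketch of that idea, using only the standard library: a single flat collection holds every wrapper, and an accessor-style helper returns the subset of one kind. The classes `MultiIndexWrapper` and `GridAxis` and the helper `select` are hypothetical stand-ins, not xarray or xgcm APIs.

```python
from collections import OrderedDict


class MultiIndexWrapper:
    """Hypothetical stand-in for a real index (e.g. a pandas.MultiIndex)."""

    def __init__(self, *coords):
        self.coords = coords


class GridAxis:
    """Hypothetical stand-in for something like xgcm.Axis."""

    def __init__(self, *coords):
        self.coords = coords


def select(wrappers, kind):
    """Return the wrappers of one kind, preserving insertion order."""
    return OrderedDict(
        (name, w) for name, w in wrappers.items() if isinstance(w, kind)
    )


# One flat collection attached to the data object.
flat = OrderedDict(
    exp_time=MultiIndexWrapper("experiment", "time"),
    X=GridAxis("x_c", "x_g"),
)

# What an accessor property such as grid_axes could return.
grid_axes = select(flat, GridAxis)
print(list(grid_axes))
```

With this layout the data model only needs one container, and each library decides which subset it exposes.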

It is just that the name "Index" feels a bit wrong to me in this case, and also that xgcm.Axis (and potentially other wrappers) can do things very differently from Index classes, which may be confusing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extend xarray with custom "coordinate wrappers" 302077805
370273091 https://github.com/pydata/xarray/issues/1961#issuecomment-370273091 https://api.github.com/repos/pydata/xarray/issues/1961 MDEyOklzc3VlQ29tbWVudDM3MDI3MzA5MQ== shoyer 1217238 2018-03-04T23:06:59Z 2018-03-04T23:06:59Z MEMBER

> Except here where, instead of a flat collection of coordinate wrappers, I was rather thinking about a 1-level nested collection that separates them depending on what they implement. Indexes would represent one of these sub-collections.

This seems messier to me. I would rather stick with adding a single OrderedDict to the data model for Dataset and DataArray.

Would it be that confusing to see an xgcm grid or xarray-simlab clock listed in the repr as an "Index"? Letting third-party libraries add their own repr categories seems like possibly going too far.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extend xarray with custom "coordinate wrappers" 302077805
370271596 https://github.com/pydata/xarray/issues/1961#issuecomment-370271596 https://api.github.com/repos/pydata/xarray/issues/1961 MDEyOklzc3VlQ29tbWVudDM3MDI3MTU5Ng== shoyer 1217238 2018-03-04T22:47:22Z 2018-03-04T23:02:52Z MEMBER

I guess the common pattern for "coordinate wrappers"/"indexes" looks like:

  • They are derived from/associated with one or more coordinate variables.
  • Operations that preserve associated coordinates should also preserve coordinate wrappers. Conversely, operations that drop any associated coordinates should drop coordinate wrappers.
  • If associated coordinates are subset, coordinate wrappers can be lazily updated (in the worst case from scratch).
  • Serialization to disk netCDF entails losing coordinate wrappers, which will need to be recreated.
  • Coordinate wrappers may implement indexing for one or more coordinates.
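The preserve/drop rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `CoordinateWrapper` and `propagate` are hypothetical names, and nothing here reflects how xarray actually implements propagation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoordinateWrapper:
    """Hypothetical wrapper derived from one or more coordinate names."""

    name: str
    coords: tuple


def propagate(wrappers, kept_coords):
    """Keep a wrapper only if all of its associated coordinates survive."""
    kept = set(kept_coords)
    return {
        key: w for key, w in wrappers.items() if set(w.coords) <= kept
    }


wrappers = {
    "exp_time": CoordinateWrapper("exp_time", ("experiment", "time")),
    "X": CoordinateWrapper("X", ("x_c", "x_g")),
}

# An operation that drops x_g also drops the wrapper that depends on it.
after_drop = propagate(wrappers, ["experiment", "time", "x_c"])
print(sorted(after_drop))

# An operation that keeps every coordinate keeps every wrapper.
unchanged = propagate(wrappers, ["experiment", "time", "x_c", "x_g"])
```

The lazy update for subset coordinates is left out; in the worst case the wrapper would simply be rebuilt from the remaining coordinate values.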

Possible future features for coordinate wrappers:

  • A protocol for saving metadata to netCDF files to allow them to be automatically recreated when loading a file from disk.
  • Implementations for other indexing based operations, e.g., resampling or interpolation.

I'm open to other names, but my inclination would be to still call all of these indexes, even if they don't actually implement indexing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extend xarray with custom "coordinate wrappers" 302077805
370272586 https://github.com/pydata/xarray/issues/1961#issuecomment-370272586 https://api.github.com/repos/pydata/xarray/issues/1961 MDEyOklzc3VlQ29tbWVudDM3MDI3MjU4Ng== benbovy 4160723 2018-03-04T23:00:16Z 2018-03-04T23:00:16Z MEMBER

Agreed with all your points @shoyer.

> I'm open to other names, but my inclination would be to still call all of these indexes, even if they don't actually implement indexing.

Except here where, instead of a flat collection of coordinate wrappers, I was rather thinking about a 1-level nested collection that separates them depending on what they implement. Indexes would represent one of these sub-collections.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extend xarray with custom "coordinate wrappers" 302077805
370271642 https://github.com/pydata/xarray/issues/1961#issuecomment-370271642 https://api.github.com/repos/pydata/xarray/issues/1961 MDEyOklzc3VlQ29tbWVudDM3MDI3MTY0Mg== benbovy 4160723 2018-03-04T22:47:54Z 2018-03-04T22:47:54Z MEMBER

I don't have a full idea yet of what the interface would be, but taking the repr() in your comment and mixing it with a simplified version of an example of repr(xgcm.Grid) found in the docs, this could look like

    <xarray.Dataset (exp_time: 5, x_c: 9, x_g: 9)>
    Coordinates:
      * experiment  (exp_time) int64 0 0 0 1 1
      * time        (exp_time) float64 0.0 0.1 0.2 0.0 0.15
      * x_g         (x_g) float64 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5
      * x_c         (x_c) int64 1 2 3 4 5 6 7 8 9
    Indexes:
        exp_time: pandas.MultiIndex[experiment, time]
    Grid axes:
        X: xgcm.Axis[x_c, x_g]

Just as Dataset.indexes returns all Index objects, Dataset.xgcm.grid_axes would return all xgcm.Axis objects.

Just as Dataset.sel or Dataset.set_index use/act on indexes, Dataset.xgcm.interp or Dataset.xgcm.generate_grid would use/act on grid axes.

3rd-party coordinate wrappers thus make sense only if there are accessors to handle them.

If we add an indexes argument in Dataset and DataArray constructors, we might even think of adding **kwargs as well in the constructors for, e.g., grid_axes. But I can see it is something that we probably don't want :-).

I use xgcm here because I think it is a nice example of application. This might co-exist with other pairs of custom coordinate wrappers / accessors.

More generally, on the xarray side we would need

  • a container (e.g., a dictionary) attached to Dataset or DataArray objects so that we can bind coordinate wrappers to them.
  • ensure that these are propagated correctly to new data objects.
  • maybe an AbstractCoordinateWrapper class that would provide a unified interface for dealing with issues of serialization, etc.
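
One way to picture the container and the unified interface listed above is the sketch below. It is an assumption-laden illustration: `AbstractCoordinateWrapper` is only a name proposed in this thread, and the methods `to_metadata` / `from_metadata` and the toy `GridAxis` class are invented here for the example.

```python
from abc import ABC, abstractmethod
from collections import OrderedDict


class AbstractCoordinateWrapper(ABC):
    """Hypothetical base class; not an existing xarray class."""

    def __init__(self, coord_names):
        self.coord_names = tuple(coord_names)

    @abstractmethod
    def to_metadata(self):
        """Return a plain dict from which the wrapper could be recreated."""

    @classmethod
    @abstractmethod
    def from_metadata(cls, metadata):
        """Rebuild the wrapper from the dict produced by to_metadata."""


class GridAxis(AbstractCoordinateWrapper):
    """Toy stand-in for something like xgcm.Axis."""

    def __init__(self, name, coord_names):
        super().__init__(coord_names)
        self.name = name

    def to_metadata(self):
        return {"name": self.name, "coord_names": list(self.coord_names)}

    @classmethod
    def from_metadata(cls, metadata):
        return cls(metadata["name"], metadata["coord_names"])


# The container attached to a data object: wrapper name -> wrapper.
wrappers = OrderedDict(X=GridAxis("X", ["x_c", "x_g"]))

# Round trip through plain metadata, as a serialization protocol might do.
saved = {key: w.to_metadata() for key, w in wrappers.items()}
restored = OrderedDict(
    (key, GridAxis.from_metadata(meta)) for key, meta in saved.items()
)
```

Keeping the serialized form a plain dict means it could in principle be stored as file attributes, but how that maps to netCDF is left open here.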
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extend xarray with custom "coordinate wrappers" 302077805
370248564 https://github.com/pydata/xarray/issues/1961#issuecomment-370248564 https://api.github.com/repos/pydata/xarray/issues/1961 MDEyOklzc3VlQ29tbWVudDM3MDI0ODU2NA== shoyer 1217238 2018-03-04T17:48:29Z 2018-03-04T17:48:29Z MEMBER

This has some similarity to what we would need for a KDTreeIndex (e.g., as discussed in https://github.com/pydata/xarray/issues/1603). If we can use the same interface for both, then it would be natural to support other "derived indexes", too.

What would the proposed interface be here?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extend xarray with custom "coordinate wrappers" 302077805
370221802 https://github.com/pydata/xarray/issues/1961#issuecomment-370221802 https://api.github.com/repos/pydata/xarray/issues/1961 MDEyOklzc3VlQ29tbWVudDM3MDIyMTgwMg== benbovy 4160723 2018-03-04T11:32:23Z 2018-03-04T14:12:06Z MEMBER

As an example, in xgcm we would have something like

```python
ds = ds_original.xgcm.generate(...)
ds.xgcm.interp('var', axis='X')
```

instead of

```python
import xgcm

ds = xgcm.generate_grid_ds(ds_original, ...)
grid = xgcm.Grid(ds)
grid.interp(ds['var'], axis='X')
```

The advantage of the first example is that the information on the grid's physical axes is bound to a Dataset object (as coordinate wrappers). We therefore don't need an instance of another class (i.e., Grid in the second example) to perform grid operations such as interpolation along a given axis; these can instead be implemented in a Dataset accessor (i.e., Dataset.xgcm in the first example).

@rabernat I don't have much experience with xgcm so maybe this isn't a good example?

I guess we could just use Dataset attributes and/or private instance attributes in the Dataset accessor class for that, but

  • coordinate attributes are not really made for storing complex information
  • attributes in the accessor class are lost when creating a new Dataset
  • important information like grid axes should be exposed to the user
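
The second point can be illustrated without xarray itself. The sketch below uses a minimal stand-in `Dataset` class and a hypothetical `grid` accessor, both invented for this example; it mimics the fact that an accessor instance is cached per object, so state stored on it does not follow the data into a new object.

```python
class GridAccessor:
    """Hypothetical accessor keeping grid axes as private state."""

    def __init__(self, ds):
        self._ds = ds
        self.axes = {}  # lives only on this accessor instance


class Dataset:
    """Minimal stand-in for xarray.Dataset, for illustration only."""

    def __init__(self, coords):
        self.coords = dict(coords)
        self._accessor = None

    @property
    def grid(self):
        # The accessor is created lazily and cached on this one object.
        if self._accessor is None:
            self._accessor = GridAccessor(self)
        return self._accessor

    def drop(self, name):
        # Like most operations, this returns a new object.
        return Dataset({k: v for k, v in self.coords.items() if k != name})


ds = Dataset({"time": [0.0, 0.1], "x_c": [1, 2], "x_g": [0.5, 1.5]})
ds.grid.axes["X"] = ("x_c", "x_g")

# The new object gets a fresh accessor, so the axis information is gone.
smaller = ds.drop("time")
print(ds.grid.axes, smaller.grid.axes)
```

Binding the axes to the data object itself, rather than to the accessor, is what would let them survive such operations.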
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Extend xarray with custom "coordinate wrappers" 302077805


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);