html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1961#issuecomment-1250738238,https://api.github.com/repos/pydata/xarray/issues/1961,1250738238,IC_kwDOAMm_X85KjMA-,4160723,2022-09-19T08:47:44Z,2022-09-19T08:47:44Z,MEMBER,I think we can close this issue. The flexible index refactor now provides a nice framework for the suggestions made here.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302077805
https://github.com/pydata/xarray/issues/1961#issuecomment-370274074,https://api.github.com/repos/pydata/xarray/issues/1961,370274074,MDEyOklzc3VlQ29tbWVudDM3MDI3NDA3NA==,4160723,2018-03-04T23:20:55Z,2018-03-04T23:20:55Z,MEMBER,"> It is just that the name ""Index"" feels a bit wrong to me in this case, and also that xgcm.Axis (and potentially other wrappers) can do things very different than Index classes, which may be confusing.

That said, as real indexes cover most of the use cases, I'd be fine if we keep calling these `indexes`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302077805
https://github.com/pydata/xarray/issues/1961#issuecomment-370273853,https://api.github.com/repos/pydata/xarray/issues/1961,370273853,MDEyOklzc3VlQ29tbWVudDM3MDI3Mzg1Mw==,4160723,2018-03-04T23:17:56Z,2018-03-04T23:17:56Z,MEMBER,"> Letting third-party libraries add their own repr categories seems like possibly going too far.

Yes you're probably right. I can imagine in the example above that `Dataset.xgcm.grid_axes` returns a subset of a flat collection, for convenience. 
It is just that the name ""Index"" feels a bit wrong to me in this case, and also that `xgcm.Axis` (and potentially other wrappers) can do things very different than Index classes, which may be confusing.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302077805
https://github.com/pydata/xarray/issues/1961#issuecomment-370273091,https://api.github.com/repos/pydata/xarray/issues/1961,370273091,MDEyOklzc3VlQ29tbWVudDM3MDI3MzA5MQ==,1217238,2018-03-04T23:06:59Z,2018-03-04T23:06:59Z,MEMBER,"> Except here where, instead of a flat collection of coordinate wrappers, I was rather thinking about a 1-level nested collection that separates them depending on what they implement. Indexes would represent one of these sub-collections.

This seems messier to me. I would rather stick with adding a single OrderedDict to the data model for `Dataset` and `DataArray`. Would it be that confusing to see an xgcm grid or xarray-simlab clock listed in the repr as an ""Index""? Letting third-party libraries add their own repr categories seems like possibly going too far.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302077805
https://github.com/pydata/xarray/issues/1961#issuecomment-370271596,https://api.github.com/repos/pydata/xarray/issues/1961,370271596,MDEyOklzc3VlQ29tbWVudDM3MDI3MTU5Ng==,1217238,2018-03-04T22:47:22Z,2018-03-04T23:02:52Z,MEMBER,"I guess the common pattern for ""coordinate wrappers""/""indexes"" looks like:
- They are derived from/associated with one or more coordinate variables.
- Operations that preserve associated coordinates should also preserve coordinate wrappers. Conversely, operations that drop any associated coordinates should drop coordinate wrappers.
- If associated coordinates are subset, coordinate wrappers can be lazily updated (in the worst case from scratch). 
- Serialization to netCDF on disk entails losing coordinate wrappers, which will need to be recreated.
- Coordinate wrappers *may* implement indexing for one or more coordinates.
Possible future features for coordinate wrappers:
- A protocol for saving metadata to netCDF files to allow them to be automatically recreated when loading a file from disk.
- Implementations for other indexing-based operations, e.g., resampling or interpolation.
I'm open to other names, but my inclination would be to still call all of these `indexes`, even if they don't actually implement indexing.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302077805
https://github.com/pydata/xarray/issues/1961#issuecomment-370272586,https://api.github.com/repos/pydata/xarray/issues/1961,370272586,MDEyOklzc3VlQ29tbWVudDM3MDI3MjU4Ng==,4160723,2018-03-04T23:00:16Z,2018-03-04T23:00:16Z,MEMBER,"Agreed with all your points @shoyer.

> I'm open to other names, but my inclination would be to still call all of these indexes, even if they don't actually implement indexing.

Except here where, instead of a flat collection of coordinate wrappers, I was rather thinking about a 1-level nested collection that separates them depending on what they implement. 
Indexes would represent one of these sub-collections.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302077805
https://github.com/pydata/xarray/issues/1961#issuecomment-370271642,https://api.github.com/repos/pydata/xarray/issues/1961,370271642,MDEyOklzc3VlQ29tbWVudDM3MDI3MTY0Mg==,4160723,2018-03-04T22:47:54Z,2018-03-04T22:47:54Z,MEMBER,"I don't yet have a full idea of what the interface would be, but taking the `repr()` in [your comment](https://github.com/pydata/xarray/issues/1603#issuecomment-334041813) and mixing it with a simplified version of an example of `repr(xgcm.Grid)` found in the docs, this could look like
```
Coordinates:
  * experiment  (exp_time) int64 0 0 0 1 1
  * time        (exp_time) float64 0.0 0.1 0.2 0.0 0.15
  * x_g         (x_g) float64 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5
  * x_c         (x_c) int64 1 2 3 4 5 6 7 8 9
Indexes:
    exp_time: pandas.MultiIndex[experiment, time]
Grid axes:
    X: xgcm.Axis[x_c, x_g]
```
Just as `Dataset.indexes` returns all `Index` objects, `Dataset.xgcm.grid_axes` would return all `xgcm.Axis` objects. Just as `Dataset.sel` or `Dataset.set_index` use/act on indexes, `Dataset.xgcm.interp` or `Dataset.xgcm.generate_grid` would use/act on grid axes. 3rd-party coordinate wrappers thus make sense only if there are accessors to handle them. If we add an `indexes` argument in Dataset and DataArray constructors, we might even think of adding `**kwargs` as well in the constructors for, e.g., `grid_axes`. But I can see it is something that we probably don't want :-). I use `xgcm` here because I think it is a nice example of application. This might co-exist with other pairs of custom coordinate wrappers / accessors. More generally, on the xarray side we would need
- a container (e.g., a dictionary) attached to `Dataset` or `DataArray` objects so that we can bind coordinate wrappers to them.
- a way to ensure that these are propagated correctly to new data objects. 
- maybe an `AbstractCoordinateWrapper` class that would provide a unified interface for dealing with issues of serialization, etc. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302077805
https://github.com/pydata/xarray/issues/1961#issuecomment-370248564,https://api.github.com/repos/pydata/xarray/issues/1961,370248564,MDEyOklzc3VlQ29tbWVudDM3MDI0ODU2NA==,1217238,2018-03-04T17:48:29Z,2018-03-04T17:48:29Z,MEMBER,"This has some similarity to what we would need for a `KDTreeIndex` (e.g., as discussed in https://github.com/pydata/xarray/issues/1603). If we can use the same interface for both, then it would be natural to support other ""derived indexes"", too. What would the proposed interface be here?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302077805
https://github.com/pydata/xarray/issues/1961#issuecomment-370221802,https://api.github.com/repos/pydata/xarray/issues/1961,370221802,MDEyOklzc3VlQ29tbWVudDM3MDIyMTgwMg==,4160723,2018-03-04T11:32:23Z,2018-03-04T14:12:06Z,MEMBER,"As an example, in `xgcm` we would have something like
```python
>>> ds = ds_original.xgcm.generate(...)
>>> ds.xgcm.interp('var', axis='X')
```
instead of
```python
>>> ds = xgcm.generate_grid_ds(ds_original, ...)
>>> grid = xgcm.Grid(ds)
>>> grid.interp(ds.var, axis='X')
```
The advantage in the first example is that the information on the grid’s physical axes is bound to a `Dataset` object (as coordinate wrappers), so we don’t need to deal with any instance of another class (i.e., `Grid` in the second example) to perform grid operations like interpolation on a given axis, which can instead be implemented in a Dataset accessor (i.e., `Dataset.xgcm` in the first example). @rabernat I don't have much experience with `xgcm` so maybe this isn't a good example? 
I guess we could just use Dataset attributes and/or private instance attributes in the Dataset accessor class for that, but
- coordinate attributes are not really made for storing complex information
- attributes in the accessor class are lost when creating a new Dataset
- important information like grid axes should be exposed to the user ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,302077805