html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1475#issuecomment-457951491,https://api.github.com/repos/pydata/xarray/issues/1475,457951491,MDEyOklzc3VlQ29tbWVudDQ1Nzk1MTQ5MQ==,1217238,2019-01-27T20:30:49Z,2019-01-27T20:30:49Z,MEMBER,"> What matters is how it will interact with the indexes, i.e. can we easily select data based on cell bounds?

Either way, we will need to write our own index classes for this (but this is totally doable). This will either be something xarray-specific or possibly based on `pandas.Index`. `pandas.IntervalIndex` is similar, but is much more complex because it handles overlapping cells. We would prefer a `CellIndex` that does not allow for overlap.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,242181620
https://github.com/pydata/xarray/issues/1475#issuecomment-457907278,https://api.github.com/repos/pydata/xarray/issues/1475,457907278,MDEyOklzc3VlQ29tbWVudDQ1NzkwNzI3OA==,1197350,2019-01-27T10:49:07Z,2019-01-27T10:49:19Z,MEMBER,"> I'm not sure I understand (N,M) sized coordinates for unstructured meshes -- what is M here? The total number of cells? Some arbitrary constant indicating the maximum number of sides for a single cell?

N is the number of cells. M is the number of points required to specify the cell vertices, e.g. 4 for 2D quadmesh, 3 for 2D trimesh, 8 for 3D quadmesh, etc.

Regarding your options 1 or 2, I guess I'm agnostic as to how it is implemented. I recognize 2 introduces lots of complications. What matters is how it will interact with the indexes, i.e. can we easily select data based on cell bounds? I will have to take some time to think about what you wrote, as it is hard for my brain... 
🙃 ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,242181620
https://github.com/pydata/xarray/issues/1475#issuecomment-457874348,https://api.github.com/repos/pydata/xarray/issues/1475,457874348,MDEyOklzc3VlQ29tbWVudDQ1Nzg3NDM0OA==,1217238,2019-01-26T23:14:18Z,2019-01-26T23:14:18Z,MEMBER,"> Currently we distinguish between ""dimension coordinates,"" which are converted to indexes, and ""non-dimension coordinates.""

The long term plan in https://github.com/pydata/xarray/issues/1603 (""Explicit indexes"") is to eliminate this distinction -- we'll simply have variables, which can be in the form of data variables or coordinates, and indexes, for look-up along any coordinate.

> What if we added a new type of coordinate called ""cell coordinates""? We could accommodate either (N+1) sized coordinates for quad-mesh geometries or (N,M) sized coordinates for unstructured meshes.

I understand (N+1) sized coordinates for quad-mesh geometries, where N is the number of physical dimensions. I'm not sure I understand (N,M) sized coordinates for unstructured meshes -- what is M here? The total number of cells? Some arbitrary constant indicating the maximum number of sides for a single cell?

I do. Logically I see two approaches here:

1. Putting cell bounds into structured dtypes, and adding sugar to make these easier to use (as discussed in https://github.com/pydata/xarray/issues/1475#issuecomment-314844258).
2. Putting cell bounds directly into xarray's data model in some form, so we can deviate from our current rule that ""coordinates dimensions must be a subset of DataArray dimensions.""

(1) feels like the safe approach (from xarray's perspective). Maybe structured dtypes are too annoying to use on a routine basis, but there are also other use cases for them that would benefit from some attention. 
I worry that solutions in the style of (2) would bake domain-specific logic deep into xarray's data model and make the whole library more complex, though I do appreciate that cell bounds are a pretty ubiquitous concept for modeling physical phenomena.

One way of solving (2) would be to allow something like ""isolated"" or ""non-aligned"" dimensions, which aren't shared across a Dataset/DataArray and are allowed to deviate on a per-variable basis. `Dataset.dims` would be a dynamic (rather than computed) part of xarray's data model, and dimensions not found in `dims` would not be required to be aligned/consistent between variables. This is intriguing but is also a much bigger change:

- By default (i.e., `dims=None`), `dims` would get filled in from all the variables in a Dataset. But the aligned dimensions in `dims` could also be set explicitly.
- If a dimension isn't found in `dims`, you can't index or align along it and it's allowed to vary between variables.
- DataArray objects would also need some way to distinguish between ""aligned"" and ""non-aligned"" dimensions. It's less clear what this would be.
- Only aligned dimensions on coordinates of a DataArray are required to be found on the DataArray variable.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,242181620
https://github.com/pydata/xarray/issues/1475#issuecomment-457246424,https://api.github.com/repos/pydata/xarray/issues/1475,457246424,MDEyOklzc3VlQ29tbWVudDQ1NzI0NjQyNA==,1197350,2019-01-24T15:50:24Z,2019-01-24T15:50:24Z,MEMBER,"I'm just pinging this issue again to keep it fresh. I am becoming more and more convinced that we need to allow for cell bounds in xarray's data model. Contrary to my comments above, I no longer think this is a problem to be solved with xgcm or some outside package. 
CF conventions, which we partially support in other parts of xarray, have a clearly defined concept of [cell geometry](http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#_data_representative_of_cells). When present, such coordinates could be decoded and used for indexing and plotting. Currently we distinguish between ""dimension coordinates,"" which are converted to indexes, and ""non-dimension coordinates."" What if we added a new type of coordinate called ""cell coordinates""? We could accommodate either (N+1) sized coordinates for quad-mesh geometries or (N,M) sized coordinates for unstructured meshes. What is a concrete first step we could take towards this goal? Try to work out a design document?","{""total_count"": 6, ""+1"": 6, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,242181620
https://github.com/pydata/xarray/issues/1475#issuecomment-417172168,https://api.github.com/repos/pydata/xarray/issues/1475,417172168,MDEyOklzc3VlQ29tbWVudDQxNzE3MjE2OA==,1197350,2018-08-30T02:49:12Z,2018-08-30T02:49:12Z,MEMBER,"cc @adcroft, who expressed interest in this topic.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,242181620
https://github.com/pydata/xarray/issues/1475#issuecomment-314898489,https://api.github.com/repos/pydata/xarray/issues/1475,314898489,MDEyOklzc3VlQ29tbWVudDMxNDg5ODQ4OQ==,1197350,2017-07-12T21:12:09Z,2017-07-12T21:12:24Z,MEMBER,"These are precisely the sort of issues we are trying to solve with [xgcm](https://github.com/xgcm/xgcm). I am about to make a big new release. Using the xgcm concept of an `Axis` object (not yet in the online docs until the new release), it should be pretty easy to add this sort of plotting support in an arbitrary number of dimensions.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,242181620
https://github.com/pydata/xarray/issues/1475#issuecomment-314844258,https://api.github.com/repos/pydata/xarray/issues/1475,314844258,MDEyOklzc3VlQ29tbWVudDMxNDg0NDI1OA==,1217238,2017-07-12T17:44:28Z,2017-07-12T17:44:28Z,MEMBER,"I don't think we need a full `NDIntervalIndex` unless we also want indexing, which is nice but not essential for just storing data.

We do need a way to represent interval data in 1D arrays, though. Probably the simplest option is to use structured dtypes, which should already work with the existing version of xarray, e.g.,

```
import numpy as np
import xarray

# Each element holds the lower and upper bound of one cell.
interval_dtype = np.dtype([('start', float), ('stop', float)])
coords = {'x': 0.5 + np.arange(3),
          'x_bounds': ('x', np.array([(0, 1), (1, 2), (2, 3)], dtype=interval_dtype))}
da = xarray.DataArray(range(3), coords=coords, dims='x')
```

```
>>> da
array([0, 1, 2])
Coordinates:
  * x         (x) float64 0.5 1.5 2.5
    x_bounds  (x) [('start', '<f8'), ('stop', '<f8')] ...
>>> da.x_bounds
array([(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)],
      dtype=[('start', '<f8'), ('stop', '<f8')])
>>> da.x_bounds.data['start'], da.x_bounds.data['stop']
(array([ 0.,  1.,  2.]), array([ 1.,  2.,  3.]))
```

We could probably do a few things to make these easier to use:

1. Support indexing like `da.x_bounds['start']` to return `da.x_bounds.data['start']` wrapped in an `xarray.DataArray`.
2. Automatically create them as part of netCDF IO.

Conceptually, this is pretty similar to a MultiIndex (see https://github.com/pydata/xarray/pull/1426 for discussion).","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,242181620
https://github.com/pydata/xarray/issues/1475#issuecomment-314579832,https://api.github.com/repos/pydata/xarray/issues/1475,314579832,MDEyOklzc3VlQ29tbWVudDMxNDU3OTgzMg==,10050469,2017-07-11T21:38:01Z,2017-07-11T21:38:01Z,MEMBER,See also https://github.com/pydata/xarray/pull/1079 and https://github.com/pydata/xarray/pull/1079#issuecomment-258456887,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,242181620