issue_comments

13 rows where issue = 242181620 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1370563894 https://github.com/pydata/xarray/issues/1475#issuecomment-1370563894 https://api.github.com/repos/pydata/xarray/issues/1475 IC_kwDOAMm_X85RsSU2 SimonHeybrock 12912489 2023-01-04T07:20:36Z 2023-01-04T07:20:36Z NONE

Recently I experimented with an (incomplete) duck-array prototype, wrapping an array of length N+1 in a duck array of length N (such that you can use it as a coordinate for a DataArray of length/shape N). It mostly worked (even though there may be some issues when you want to use it as an xarray index).
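
To make the idea concrete, here is a minimal sketch of a wrapper that stores N+1 edges but reports length N. It is not the scippx implementation: the class name `BinEdges` and all of its attributes are hypothetical, and it omits the NumPy dispatch protocols that a real duck array would need.

```python
import numpy as np


class BinEdges:
    """Toy wrapper (hypothetical): stores N+1 edges but reports length/shape N."""

    def __init__(self, edges):
        edges = np.asarray(edges, dtype=float)
        if edges.ndim != 1 or edges.size < 2:
            raise ValueError("need a 1-D array with at least two edges")
        self.edges = edges

    @property
    def shape(self):
        # One fewer than the number of stored edges.
        return (self.edges.size - 1,)

    def __len__(self):
        return self.shape[0]

    @property
    def left(self):
        return self.edges[:-1]

    @property
    def right(self):
        return self.edges[1:]

    @property
    def centers(self):
        return 0.5 * (self.left + self.right)

    def __getitem__(self, key):
        # Only contiguous slices keep the N+1 edge layout meaningful.
        if not isinstance(key, slice) or key.step not in (None, 1):
            raise TypeError("only contiguous slices are supported in this sketch")
        start, stop, _ = key.indices(len(self))
        # Selecting cells start..stop-1 needs edges start..stop (inclusive).
        return BinEdges(self.edges[start:stop + 1])


x = BinEdges([0, 1, 2, 3])
print(len(x), x.centers)
```

Slicing cells `1:3` keeps three edges, so the result again has one more edge than cells. An empty slice raises `ValueError` here, because this sketch requires at least one cell.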

See https://github.com/scipp/scippx/blob/main/src/scippx/bin_edge_array.py (there is a bunch of unrelated stuff in the repo, you can mostly ignore that).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
1328123153 https://github.com/pydata/xarray/issues/1475#issuecomment-1328123153 https://api.github.com/repos/pydata/xarray/issues/1475 IC_kwDOAMm_X85PKY0R rogvidarge 74651486 2022-11-26T22:18:59Z 2022-11-26T22:18:59Z NONE

Has there been any progress on this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
1190842459 https://github.com/pydata/xarray/issues/1475#issuecomment-1190842459 https://api.github.com/repos/pydata/xarray/issues/1475 IC_kwDOAMm_X85G-tBb lukelbd 19657652 2022-07-20T22:46:36Z 2022-07-20T22:46:36Z NONE

Not sure where this stands but another advantage might be the ability to call xr.open_dataarray on netcdf files containing individual variables plus coordinate bounds (data from CMIP5/6 are commonly stored this way).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
457951491 https://github.com/pydata/xarray/issues/1475#issuecomment-457951491 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDQ1Nzk1MTQ5MQ== shoyer 1217238 2019-01-27T20:30:49Z 2019-01-27T20:30:49Z MEMBER

> What matters is how it will interact with the indexes, i.e. can we easily select data based on cell bounds?

Either way, we will need to write our own index classes for this (but this is totally doable). This will either be something xarray specific or possibly based on pandas.Index.

pandas.IntervalIndex is similar, but is much more complex because it handles overlapping cells. We would prefer a CellIndex that does not allow for overlap.
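
As a rough illustration of why non-overlapping cells simplify lookup, the sketch below finds the cell containing each value with a single binary search over N+1 sorted edges. The function name `find_cell` and the half-open `[left, right)` convention are assumptions for this example, not part of any proposed `CellIndex` API.

```python
import numpy as np


def find_cell(edges, values):
    """Return the index of the cell containing each value, or -1 if none does.

    Assumes `edges` is sorted and cells are contiguous and half-open:
    cell i covers [edges[i], edges[i + 1]).
    """
    edges = np.asarray(edges, dtype=float)
    values = np.asarray(values, dtype=float)
    # side="right" makes a value equal to a left edge fall into that cell.
    idx = np.searchsorted(edges, values, side="right") - 1
    outside = (values < edges[0]) | (values >= edges[-1]) | np.isnan(values)
    return np.where(outside, -1, idx)


print(find_cell([0, 1, 2, 3], [0.0, 0.5, 1.0, 2.9, 3.0, -0.1]))
```

Because the cells cannot overlap, each value maps to at most one cell, which is what lets a plain `np.searchsorted` stand in for the more general interval machinery.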

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
457907278 https://github.com/pydata/xarray/issues/1475#issuecomment-457907278 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDQ1NzkwNzI3OA== rabernat 1197350 2019-01-27T10:49:07Z 2019-01-27T10:49:19Z MEMBER

> I'm not sure I understand (N,M) sized coordinates for unstructured meshes -- what is M here? The total number of cells? Some arbitrary constant indicating the maximum number of sides for a single cell?

N is the number of cells. M is the number of points required to specify the cell vertices, e.g. 4 for 2D quadmesh, 3 for 2D trimesh, 8 for 3D quadmesh, etc.
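
A small sketch of what an (N, M) cell coordinate looks like for a 2D triangular mesh, where N = 2 cells and M = 3 vertices per cell. The node coordinates and the `connectivity` array are made-up illustration data.

```python
import numpy as np

# Four mesh nodes (hypothetical coordinates) at the corners of a unit square.
node_x = np.array([0.0, 1.0, 1.0, 0.0])
node_y = np.array([0.0, 0.0, 1.0, 1.0])

# Two triangles, each listing the indices of its three nodes.
connectivity = np.array([[0, 1, 2],
                         [0, 2, 3]])

# Integer-array indexing expands node coordinates to per-cell vertices
# with shape (N, M) = (2, 3).
cell_x = node_x[connectivity]
cell_y = node_y[connectivity]
print(cell_x.shape, cell_y.shape)
```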

Regarding your options 1 or 2, I guess I'm agnostic as to how it is implemented. I recognize 2 introduces lots of complications. What matters is how it will interact with the indexes, i.e. can we easily select data based on cell bounds?

I will have to take some time to think about what you wrote, as it is hard for my brain... 🙃

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
457874348 https://github.com/pydata/xarray/issues/1475#issuecomment-457874348 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDQ1Nzg3NDM0OA== shoyer 1217238 2019-01-26T23:14:18Z 2019-01-26T23:14:18Z MEMBER

> Currently we distinguish between "dimension coordinates," which are converted to indexes, and "non-dimension coordinates."

The long term plan in https://github.com/pydata/xarray/issues/1603 ("Explicit indexes") is to eliminate this distinction -- we'll simply have variables, which can be in the form of data variables or coordinates, and indexes, for look-up along any coordinate.

> What if we added a new type of coordinate called "cell coordinates"? We could accommodate either (N+1) sized coordinates for quad-mesh geometries or (N,M) sized coordinates for unstructured meshes.

I understand (N+1) sized coordinates for quad-mesh geometries, where N is the number of physical dimensions.

I'm not sure I understand (N,M) sized coordinates for unstructured meshes -- what is M here? The total number of cells? Some arbitrary constant indicating the maximum number of sides for a single cell?

I do.

Logically I see two approaches here:

1. Putting cell bounds into structured dtypes, and adding sugar to make these easier to use (as discussed in https://github.com/pydata/xarray/issues/1475#issuecomment-314844258).
2. Putting cell bounds directly into xarray's data model in some form, so we can deviate from our current rule that "coordinates dimensions must be a subset of DataArray dimensions."

(1) feels like the safe approach (from xarray's perspective). Maybe structured dtypes are too annoying to use on a routine basis, but there are also other use cases for them that would benefit from some attention. I worry that solutions in the style of (2) would bake domain-specific logic deep into xarray's data model and make the whole library more complex, though I do appreciate that cell bounds are a pretty ubiquitous concept for modeling physical phenomena.

One way of solving (2) would be to allow something like "isolated" or "non-aligned" dimensions, which aren't shared across a Dataset/DataArray and are allowed to deviate on a per-variable basis. Dataset.dims would be a dynamic (rather than computed) part of xarray's data model, and dimensions not found in dims would not be required to be aligned/consistent between variables. This is intriguing but is also a much bigger change:

- By default (i.e., dims=None), dims would get filled in from all the variables in a Dataset. But the aligned dimensions in dims could also be set explicitly.
- If a dimension isn't found in dims, you can't index or align along it and it's allowed to vary between variables.
- DataArray objects would also need some way to distinguish between "aligned" and "non-aligned" dimensions. It's less clear what this would be.
- Only aligned dimensions on coordinates of a DataArray are required to be found on the DataArray variable.
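
The alignment rule described above can be sketched in plain Python. This is a toy model, not xarray code: `check_alignment` and its arguments are hypothetical, and each variable is reduced to a mapping from dimension name to size.

```python
def check_alignment(variables, aligned_dims=None):
    """Toy model of the proposed rule (hypothetical helper, not xarray API).

    variables: dict mapping variable name -> {dimension name: size}.
    aligned_dims: set of dimension names that must agree across variables;
        None means "every dimension found on any variable".
    Returns the sizes of the aligned dimensions.
    """
    if aligned_dims is None:
        # Default: every dimension seen on any variable must be aligned.
        aligned_dims = {d for sizes in variables.values() for d in sizes}
    dims = {}
    for name, sizes in variables.items():
        for dim, size in sizes.items():
            if dim not in aligned_dims:
                continue  # non-aligned: free to differ between variables
            if dims.setdefault(dim, size) != size:
                raise ValueError(
                    f"conflicting sizes for aligned dimension {dim!r} "
                    f"on {name!r}: {dims[dim]} vs {size}"
                )
    return dims


variables = {"temp": {"time": 10, "x": 3}, "x_edges": {"x": 4}}
print(check_alignment(variables, aligned_dims={"time"}))
```

With the default `aligned_dims=None` the same input is rejected, because `x` has size 3 on one variable and 4 on the other. Declaring only `time` as aligned lets the edge variable keep its extra element.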

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
457246424 https://github.com/pydata/xarray/issues/1475#issuecomment-457246424 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDQ1NzI0NjQyNA== rabernat 1197350 2019-01-24T15:50:24Z 2019-01-24T15:50:24Z MEMBER

I'm just pinging this issue again to keep it fresh.

I am becoming more and more convinced that we need to allow for cell bounds in xarray's data model. Contrary to my comments above, I no longer think this is a problem to be solved with xgcm or some outside package.

CF conventions, which we partially support in other parts of xarray, have a clearly defined concept of cell geometry. When present, such coordinates could be decoded and used for indexing and plotting.
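
For the 1D case, CF stores cell boundaries as a bounds variable of shape (N, 2). The sketch below shows one way such bounds could be collapsed to N+1 edges and expanded back. The function names are hypothetical, and the conversion assumes contiguous cells, which CF does not require.

```python
import numpy as np


def bounds_to_edges(bounds):
    """Collapse CF-style (N, 2) bounds into N+1 edges (contiguous cells only)."""
    bounds = np.asarray(bounds, dtype=float)
    if bounds.ndim != 2 or bounds.shape[1] != 2 or bounds.shape[0] == 0:
        raise ValueError("expected a non-empty array of shape (N, 2)")
    # Each cell must start where the previous one ends.
    if not np.allclose(bounds[1:, 0], bounds[:-1, 1]):
        raise ValueError("cells are not contiguous; cannot reduce to N+1 edges")
    return np.append(bounds[:, 0], bounds[-1, 1])


def edges_to_bounds(edges):
    """Expand N+1 edges into CF-style (N, 2) bounds."""
    edges = np.asarray(edges, dtype=float)
    return np.stack([edges[:-1], edges[1:]], axis=1)


edges = bounds_to_edges([[0, 1], [1, 2], [2, 3]])
print(edges)
print(edges_to_bounds(edges))
```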

Currently we distinguish between "dimension coordinates," which are converted to indexes, and "non-dimension coordinates." What if we added a new type of coordinate called "cell coordinates"? We could accommodate either (N+1) sized coordinates for quad-mesh geometries or (N,M) sized coordinates for unstructured meshes.

What is a concrete first step we could take towards this goal? Try to work out a design document?

{
    "total_count": 6,
    "+1": 6,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
417172168 https://github.com/pydata/xarray/issues/1475#issuecomment-417172168 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDQxNzE3MjE2OA== rabernat 1197350 2018-08-30T02:49:12Z 2018-08-30T02:49:12Z MEMBER

cc @adcroft, who expressed interest in this topic.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
314862336 https://github.com/pydata/xarray/issues/1475#issuecomment-314862336 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDMxNDg2MjMzNg== JiaweiZhuang 25473287 2017-07-12T18:50:02Z 2017-07-13T01:48:32Z NONE

> Probably the simplest option is to use structured dtypes, which should already work with the existing version of xarray, e.g.,

Thanks, that's a nice trick! Supporting da.x_bounds['start'] will definitely be helpful!

However, I am still concerned about 2D boundaries. Using the structured data type, 2D bounds will be an array of size (Nx,Ny,4) instead of (Nx+1,Ny+1). Although this matches the CF convention, it takes 4x memory and needs to be converted back to (Nx+1,Ny+1) for pcolormesh(). Not a big problem though. I will be happy to go this way if (Nx+1,Ny+1)-sized bounds cannot be implemented.
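
The conversion between the two layouts can be sketched with NumPy as follows. The helper names are hypothetical, and the vertex order (starting at corner `(i, j)` and walking `(i+1, j)`, `(i+1, j+1)`, `(i, j+1)`) is an assumption for this example. The round trip only works when neighbouring cells share their corners.

```python
import numpy as np


def corners_to_bounds(corners):
    """(Nx+1, Ny+1) corner array -> (Nx, Ny, 4) per-cell vertex array."""
    c = np.asarray(corners)
    # Assumed vertex order: (i, j), (i+1, j), (i+1, j+1), (i, j+1).
    return np.stack([c[:-1, :-1], c[1:, :-1], c[1:, 1:], c[:-1, 1:]], axis=-1)


def bounds_to_corners(bounds):
    """(Nx, Ny, 4) per-cell vertex array -> (Nx+1, Ny+1) corner array.

    Assumes adjacent cells share corners, so duplicates can be dropped.
    """
    b = np.asarray(bounds)
    nx, ny, _ = b.shape
    c = np.empty((nx + 1, ny + 1), dtype=b.dtype)
    c[:-1, :-1] = b[..., 0]      # vertex 0 of every cell
    c[-1, :-1] = b[-1, :, 1]     # last row of corners, from vertex 1
    c[-1, -1] = b[-1, -1, 2]     # far corner, from vertex 2
    c[:-1, -1] = b[:, -1, 3]     # last column of corners, from vertex 3
    return c


corners = np.arange(12.0).reshape(4, 3)   # Nx = 3, Ny = 2
bounds = corners_to_bounds(corners)
print(bounds.shape, np.array_equal(bounds_to_corners(bounds), corners))
```

The (Nx, Ny, 4) form stores each interior corner four times, which is where the roughly 4x memory cost mentioned above comes from.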

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
314898489 https://github.com/pydata/xarray/issues/1475#issuecomment-314898489 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDMxNDg5ODQ4OQ== rabernat 1197350 2017-07-12T21:12:09Z 2017-07-12T21:12:24Z MEMBER

These are precisely the sort of issues we are trying to solve with xgcm. I am about to make a big new release. Using the xgcm concept of an Axis object (not yet in the online docs until the new release), it should be pretty easy to add this sort of plotting support in an arbitrary number of dimensions.

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
314844258 https://github.com/pydata/xarray/issues/1475#issuecomment-314844258 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDMxNDg0NDI1OA== shoyer 1217238 2017-07-12T17:44:28Z 2017-07-12T17:44:28Z MEMBER

I don't think we need a full NDIntervalIndex unless we also want indexing, which is nice but not essential for just storing data. We do need a way to represent interval data in 1D arrays, though.

Probably the simplest option is to use structured dtypes, which should already work with the existing version of xarray, e.g.,

```
import numpy as np
import xarray

interval_dtype = np.dtype([('start', float), ('stop', float)])
coords = {'x': 0.5 + np.arange(3),
          'x_bounds': ('x', np.array([(0, 1), (1, 2), (2, 3)], dtype=interval_dtype))}
da = xarray.DataArray(range(3), coords=coords, dims='x')

da
<xarray.DataArray (x: 3)>
array([0, 1, 2])
Coordinates:
  * x         (x) float64 0.5 1.5 2.5
    x_bounds  (x) [('start', '<f8'), ('stop', '<f8')] (0.0, 1.0) (1.0, 2.0) ...

da.x_bounds
<xarray.DataArray 'x_bounds' (x: 3)>
array([(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)],
      dtype=[('start', '<f8'), ('stop', '<f8')])
Coordinates:
  * x         (x) float64 0.5 1.5 2.5
    x_bounds  (x) [('start', '<f8'), ('stop', '<f8')] (0.0, 1.0) (1.0, 2.0) ...

da.x_bounds.data['start'], da.x_bounds.data['stop']
(array([ 0.,  1.,  2.]), array([ 1.,  2.,  3.]))
```

We could probably do a few things to make these easier to use:

1. Support indexing like da.x_bounds['start'] to return da.x_bounds.data['start'] wrapped in an xarray.DataArray.
2. Automatically create them as part of netCDF IO.

Conceptually, this is pretty similar to a MultiIndex (see https://github.com/pydata/xarray/pull/1426 for discussion).

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
314604740 https://github.com/pydata/xarray/issues/1475#issuecomment-314604740 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDMxNDYwNDc0MA== JiaweiZhuang 25473287 2017-07-11T23:58:20Z 2017-07-11T23:58:20Z NONE

> See also #1079 and #1079 (comment)

Thanks! The idea of NDIntervalIndex mentioned at pandas-dev/pandas#7640 comment seems powerful but too complicated to implement? Could there be a simpler way to hook the boundary attribute to DataArray?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620
314579832 https://github.com/pydata/xarray/issues/1475#issuecomment-314579832 https://api.github.com/repos/pydata/xarray/issues/1475 MDEyOklzc3VlQ29tbWVudDMxNDU3OTgzMg== fmaussion 10050469 2017-07-11T21:38:01Z 2017-07-11T21:38:01Z MEMBER

See also https://github.com/pydata/xarray/pull/1079 and https://github.com/pydata/xarray/pull/1079#issuecomment-258456887

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Allow DataArray to hold cell boundaries as coordinate variables 242181620

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 14.226ms · About: xarray-datasette