issue_comments

3 rows where issue = 979316661 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
906792475 https://github.com/pydata/xarray/issues/5738#issuecomment-906792475 https://api.github.com/repos/pydata/xarray/issues/5738 IC_kwDOAMm_X842DI4b shoyer 1217238 2021-08-26T22:46:19Z 2021-08-26T22:46:19Z MEMBER

In the long term, I think we want to eliminate any requirements in Xarray's data model about which variable names are OK.

In particular:

1. We want to allow a coordinate named "foo" with different dimensions than ("foo",), in order to support reading all possible netCDF files into xarray.
2. We want to allow indexing with coordinates that are not dimensions.

With this second change in particular, it should not be a problem to have a multi-index level with the same name as a dimension.

It is true that this change will introduce an inconsistency between appropriate keys for use with sel (indexed coordinates) and isel (dimensions). But I think this is basically inevitable.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Flexible indexes: how to handle possible dimension vs. coordinate name conflicts? 979316661
906201836 https://github.com/pydata/xarray/issues/5738#issuecomment-906201836 https://api.github.com/repos/pydata/xarray/issues/5738 IC_kwDOAMm_X842A4rs benbovy 4160723 2021-08-26T08:25:47Z 2021-08-26T08:25:47Z MEMBER

My initial thought was to opt for D because it is easier to maintain (fewer special cases, less complexity; internally we could get rid of assert_unique_multiindex_level_names after dropping multi-index virtual variables) and also because an error is raised only when there is actually an issue. For example, we could do stack and never plan to do unstack later, so in that case a multi-index level vs. dimension name conflict is not really an issue?

I was thinking about this kind of behavior (even though reverting previous changes is not ideal):

```python
import xarray as xr

ds = xr.Dataset(coords={'dim0': ['a', 'b'], 'dim1': [0, 1]})
ds = ds.stack(dim_stacked=['dim0', 'dim1'])

# This works: dim0 is not a dimension in ds
ds['c'] = (('dim0',), [10, 11, 12, 13, 14, 15])

# raise a nice error message here: conflicting sizes for dimension 'dim0'
ds.unstack('dim_stacked')

# raise a nice error message here: conflicting sizes for dimension 'dim0'
ds.sel(dim1=0)
```

However, this may be confusing in the case of integer-based vs. label-based indexing:

```python
# label-based selection along dim_stacked dimension (length=4)
ds.sel(dim0='a')

# integer-based selection along dim0 dimension (length=6)
ds.isel(dim0=0)
```
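To make the two lengths concrete, here is a plain-Python sketch of the key spaces in the hypothetical dataset above. It does not call xarray; the names `stacked_labels`, `sel_positions` and `isel_value` are illustrative only and simply mimic what the proposed behavior would select.

```python
from itertools import product

# Stacking dim0 (2 labels) with dim1 (2 labels) gives 2 * 2 = 4 multi-index entries.
dim0_labels = ['a', 'b']
dim1_labels = [0, 1]
stacked_labels = list(product(dim0_labels, dim1_labels))

# Variable 'c' lives on a separate 'dim0' dimension of length 6.
c_values = [10, 11, 12, 13, 14, 15]

# Label-based: sel(dim0='a') would match positions along dim_stacked
# whose 'dim0' level equals 'a'.
sel_positions = [i for i, (d0, _) in enumerate(stacked_labels) if d0 == 'a']

# Integer-based: isel(dim0=0) would take the first element along the
# 'dim0' dimension, which has nothing to do with the multi-index level.
isel_value = c_values[0]

print(len(stacked_labels), sel_positions)
print(len(c_values), isel_value)
```

The same key, `dim0`, therefore addresses two unrelated axes depending on whether the lookup is label-based or integer-based.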

So a more general rule like option C is not that silly after all?

If a dimension name matches the name of a coordinate:

  • Ok if this coordinate is a "dimension" coordinate (coordinate name == dimension name)
  • (maybe ok if this coordinate is not indexed?)
  • Error otherwise
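
The rule above could be sketched roughly as follows. This is a standalone illustration, not xarray code: the `Coord` class, `check_name_conflicts` and the `allow_unindexed` flag are hypothetical names introduced here, and the flag stands in for the undecided "maybe ok" case.

```python
from dataclasses import dataclass


@dataclass
class Coord:
    """Hypothetical stand-in for a coordinate: its name, dims, and index status."""
    name: str
    dims: tuple
    indexed: bool = True


def check_name_conflicts(dim_names, coords, allow_unindexed=False):
    """Raise ValueError when a coordinate name clashes with a dimension name."""
    dims = set(dim_names)
    for coord in coords:
        if coord.name not in dims:
            continue  # no dimension shares this name
        if coord.dims == (coord.name,):
            continue  # "dimension" coordinate: name == its single dimension
        if allow_unindexed and not coord.indexed:
            continue  # the "maybe ok" case: coordinate without an index
        raise ValueError(
            f"coordinate {coord.name!r} with dims {coord.dims} "
            f"conflicts with dimension {coord.name!r}"
        )


# A plain dimension coordinate passes.
check_name_conflicts(['x'], [Coord('x', ('x',))])

# A multi-index level 'dim0' along 'dim_stacked' clashes with a 'dim0' dimension.
try:
    check_name_conflicts(['dim_stacked', 'dim0'], [Coord('dim0', ('dim_stacked',))])
except ValueError as err:
    print(err)
```

Under this sketch, the `ds['c'] = (('dim0',), ...)` assignment from the earlier example would fail right away instead of deferring the error to `unstack` or `sel`.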
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Flexible indexes: how to handle possible dimension vs. coordinate name conflicts? 979316661
906159715 https://github.com/pydata/xarray/issues/5738#issuecomment-906159715 https://api.github.com/repos/pydata/xarray/issues/5738 IC_kwDOAMm_X842AuZj benbovy 4160723 2021-08-26T07:21:07Z 2021-08-26T07:21:07Z MEMBER

@fujiisoup @shoyer @aseyboldt

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Flexible indexes: how to handle possible dimension vs. coordinate name conflicts? 979316661

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);