
issue_comments


6 rows where issue = 1454832041 sorted by updated_at descending


user 2

  • benbovy 3
  • yvogradyent 3

author_association 2

  • MEMBER 3
  • NONE 3

issue 1

  • stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index · 6
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1324846342 https://github.com/pydata/xarray/issues/7297#issuecomment-1324846342 https://api.github.com/repos/pydata/xarray/issues/7297 IC_kwDOAMm_X85O940G yvogradyent 96822049 2022-11-23T10:32:59Z 2022-11-23T10:35:04Z NONE

> There's no such thing as a "midx.x" dimension in Xarray.

I'm aware, just using it to communicate thoughts.

Perfectly fine with the solution proposed, as long as it works with unstack()?

I would then expect something like this for ds_stacked:

```
<xarray.Dataset>
Dimensions:  (x: 2, midx: 4)
Coordinates:
  * midx     (midx) object MultiIndex
  * x        (midx) int32 1 1 2 2
  * y        (midx) int32 3 4 3 4
Data variables:
    a        (x) int32 6 7
    b        (midx) int32 10, 20, 30, 40
    c        (x, midx) int32 16, 26, 37, 47
```

And for ds_unstacked (not sure if I'm doing the order of values right for b and c, but I hope you get the point):

```
<xarray.Dataset>
Dimensions:  (x: 2, y: 2)
Coordinates:
  * x        (x) int32 1 2
  * y        (y) int32 3 4
Data variables:
    a        (x) int32 6 7
    b        (x, y) int32 10, 20, 30, 40
    c        (x, y) int32 16, 26, 37, 47
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041
1324753837 https://github.com/pydata/xarray/issues/7297#issuecomment-1324753837 https://api.github.com/repos/pydata/xarray/issues/7297 IC_kwDOAMm_X85O9iOt benbovy 4160723 2022-11-23T09:17:33Z 2022-11-23T09:17:33Z MEMBER

> But does this still work properly with broadcasting? For example, let's say there is another data variable b (midx) and an operation is done like ds_stacked['c'] = ds_stacked.a + ds_stacked.b. Then c should have dimension (midx), with a (x) "repeated" along midx.x.

I think it would keep things much simpler if we consider "x" and "midx" as two separate dimensions in the stacked Dataset, i.e., ds_stacked['c'] would be a 2-d array with dimensions (x, midx). There's no such thing as a "midx.x" dimension in Xarray.
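To make the "two separate dimensions" reading concrete, here is a minimal sketch of name-based broadcasting. The arrays are stand-ins built from the values used in this thread, not the actual stacked Dataset, and no multi-index is attached to "midx".

```python
import xarray as xr

# Stand-in variables: "a" lives on dimension "x", "b" on dimension "midx".
# Values are taken from the examples in this thread; no multi-index is attached.
a = xr.DataArray([6, 7], dims="x")
b = xr.DataArray([10, 20, 30, 40], dims="midx")

# xarray broadcasts by dimension name, so two unrelated dimensions
# combine into a 2-d result rather than being aligned with each other.
c = a + b

print(c.dims, c.shape)
```

Because "x" and "midx" are unrelated names here, the sum is an outer combination of the two variables, which is the 2-d (x, midx) result described above.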

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041
1324742207 https://github.com/pydata/xarray/issues/7297#issuecomment-1324742207 https://api.github.com/repos/pydata/xarray/issues/7297 IC_kwDOAMm_X85O9fY_ yvogradyent 96822049 2022-11-23T09:08:09Z 2022-11-23T09:08:09Z NONE

Perfect, that's also the example I prefer the most. But does this still work properly with broadcasting? For example, let's say there is another data variable b (midx) and an operation is done like ds_stacked['c'] = ds_stacked.a + ds_stacked.b. Then c should have dimension (midx), with a (x) "repeated" along midx.x.
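The "repeated" behaviour asked about here can be approximated by hand today. The sketch below is one possible way to do it, not something stack() does automatically: it looks up a by the x level of a multi-index built with pandas. The names `x_labels` and `a_repeated` are illustrative only, and the values come from the examples in this thread.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Multi-index over the product of x = [1, 2] and y = [3, 4], as in the thread.
midx = pd.MultiIndex.from_product([[1, 2], [3, 4]], names=("x", "y"))

a = xr.DataArray([6, 7], dims="x", coords={"x": [1, 2]})
b = xr.DataArray([10, 20, 30, 40], dims="midx")

# Illustrative helper array: the x label of every stacked position.
x_labels = xr.DataArray(np.asarray(midx.get_level_values("x")), dims="midx")

# Vectorized label lookup: one value of "a" per stacked position.
a_repeated = a.sel(x=x_labels)

c = a_repeated + b
print(a_repeated.values, c.values)
```

With these inputs, c is 1-d along "midx" and holds the values 16, 26, 37, 47 used in the expected output elsewhere in this thread.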

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041
1323849354 https://github.com/pydata/xarray/issues/7297#issuecomment-1323849354 https://api.github.com/repos/pydata/xarray/issues/7297 IC_kwDOAMm_X85O6FaK benbovy 4160723 2022-11-22T15:24:53Z 2022-11-22T15:36:46Z MEMBER

The last example in your comment is probably the most meaningful one:

```
<xarray.Dataset>
Dimensions:  (x: 2, midx: 4)
Coordinates:
  * midx     (midx) object MultiIndex
  * x        (midx) int32 1 1 2 2
  * y        (midx) int32 3 4 3 4
Data variables:
    a        (x) int32 6 7
```

To avoid name conflicts, we could just discard the original dimension coordinates x and y. As in the example above, "x" then becomes a dimension without a coordinate. When unstacking, we would retrieve the "x" dimension coordinate as in the original dataset.

(note: I think it is now possible to have a dimension "x" and a coordinate "x" with different dimensions, but I haven't checked).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041
1323820922 https://github.com/pydata/xarray/issues/7297#issuecomment-1323820922 https://api.github.com/repos/pydata/xarray/issues/7297 IC_kwDOAMm_X85O5-d6 yvogradyent 96822049 2022-11-22T15:05:28Z 2022-11-22T15:06:15Z NONE

Hi @benbovy, thanks for the reply!

I made a mistake in the expected output and have edited that part.

In terms of name conflicts, I don't know what the solution should be. It would somehow need to know that the x of the multi-index refers to the same label of the original x coordinate but is, due to stacking, just a repeated version of it?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041
1323478134 https://github.com/pydata/xarray/issues/7297#issuecomment-1323478134 https://api.github.com/repos/pydata/xarray/issues/7297 IC_kwDOAMm_X85O4qx2 benbovy 4160723 2022-11-22T10:50:01Z 2022-11-22T10:50:01Z MEMBER

Interesting! I don't think that when adding stack / unstack we were thinking that variables with only a subset of the stacked dimensions would be a common use case.

I guess it would be possible to add some option to stack only the variables that have all the dimensions to be stacked, and leave the other variables unchanged? However, one problem with keeping the original dimension coordinates is that we would have name conflicts between the single index coordinates and the multi-index coordinates.

In your expected example, the "x" coordinate is part of the multi-index but it doesn't have the same dimension "midx"? I find it rather confusing.
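For readers who want to see the round trip under discussion, here is a minimal sketch. The dataset is made up for illustration, using values from this thread, and is not the reporter's original example; variable "a" depends only on "x" while "b" depends on both stacked dimensions.

```python
import xarray as xr

# Made-up dataset: "a" depends only on "x", "b" on both "x" and "y".
ds = xr.Dataset(
    {
        "a": ("x", [6, 7]),
        "b": (("x", "y"), [[10, 20], [30, 40]]),
    },
    coords={"x": [1, 2], "y": [3, 4]},
)

stacked = ds.stack(midx=("x", "y"))
roundtrip = stacked.unstack("midx")

# Compare the dimensions of "a" before stacking, after stacking,
# and after the stack/unstack round trip.
print(ds["a"].dims)
print(stacked["a"].dims)
print(roundtrip["a"].dims)
```

On the xarray versions current at the time of this thread, "a" is expanded onto "midx" when stacked, so it should come back from the round trip with both "x" and "y" dimensions instead of just "x".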

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
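The filter shown at the top of this page (rows where issue = 1454832041, sorted by updated_at descending) can be tried against this schema locally. The sketch below uses Python's built-in sqlite3 module with an in-memory database. It loads only three of the six rows, with their body and URL columns left empty, so it illustrates the query rather than reproducing the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema copied from this page. The referenced [users] and [issues] tables
# are not created; SQLite does not enforce foreign keys by default.
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
""")

# Three of the rows listed above (id, user, created_at, updated_at,
# author_association, issue), inserted out of order on purpose.
sample_rows = [
    (1323478134, 4160723, "2022-11-22T10:50:01Z", "2022-11-22T10:50:01Z", "MEMBER", 1454832041),
    (1324846342, 96822049, "2022-11-23T10:32:59Z", "2022-11-23T10:35:04Z", "NONE", 1454832041),
    (1324753837, 4160723, "2022-11-23T09:17:33Z", "2022-11-23T09:17:33Z", "MEMBER", 1454832041),
]
conn.executemany(
    "INSERT INTO [issue_comments] "
    "([id], [user], [created_at], [updated_at], [author_association], [issue]) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    sample_rows,
)

# ISO 8601 timestamps stored as TEXT sort chronologically as plain strings.
ids = [
    row[0]
    for row in conn.execute(
        "SELECT [id] FROM [issue_comments] "
        "WHERE [issue] = ? ORDER BY [updated_at] DESC",
        (1454832041,),
    )
]
print(ids)
```

The [idx_issue_comments_issue] index is what lets SQLite answer the `WHERE [issue] = ?` filter without scanning the whole table.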
Powered by Datasette · Queries took 23.012ms · About: xarray-datasette