issue_comments
6 rows where issue = 1454832041 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
1324846342 | https://github.com/pydata/xarray/issues/7297#issuecomment-1324846342 | https://api.github.com/repos/pydata/xarray/issues/7297 | IC_kwDOAMm_X85O940G | yvogradyent 96822049 | 2022-11-23T10:32:59Z | 2022-11-23T10:35:04Z | NONE |
I'm aware, just using it to communicate thoughts. Perfectly fine with the solution proposed, as long as it works with unstack()? I would then expect something like ds_stacked
```
<xarray.Dataset>
Dimensions:  (x: 2, midx: 4)
Coordinates:
  * midx     (midx) object MultiIndex
  * x        (midx) int32 1 1 2 2
  * y        (midx) int32 3 4 3 4
Data variables:
    a        (x) int32 6 7
    b        (midx) int32 10, 20, 30, 40
    c        (x, midx) int32 16, 26, 37, 47
```
ds_unstacked (not sure if I'm doing the order of values right for b and c, but I hope you get the point)
```
<xarray.Dataset>
Dimensions:  (x: 2, y: 2)
Coordinates:
  * x        (x) int32 1 2
  * y        (y) int32 3 4
Data variables:
    a        (x) int32 6 7
    b        (x, y) int32 10, 20, 30, 40
    c        (x, y) int32 16, 26, 37, 47
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041 | |
1324753837 | https://github.com/pydata/xarray/issues/7297#issuecomment-1324753837 | https://api.github.com/repos/pydata/xarray/issues/7297 | IC_kwDOAMm_X85O9iOt | benbovy 4160723 | 2022-11-23T09:17:33Z | 2022-11-23T09:17:33Z | MEMBER |
I think it would keep things much simpler if we consider "x" and "midx" as two separate dimensions in the stacked Dataset, i.e., ds_stacked['c'] would result in a 2-d array (x, midx). There's no such thing as a "midx.x" dimension in Xarray. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041 | |
1324742207 | https://github.com/pydata/xarray/issues/7297#issuecomment-1324742207 | https://api.github.com/repos/pydata/xarray/issues/7297 | IC_kwDOAMm_X85O9fY_ | yvogradyent 96822049 | 2022-11-23T09:08:09Z | 2022-11-23T09:08:09Z | NONE | Perfect, that's also the example I prefer the most.
But does this still work properly with broadcasting? For example, let's say there is another data variable |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041 | |
1323849354 | https://github.com/pydata/xarray/issues/7297#issuecomment-1323849354 | https://api.github.com/repos/pydata/xarray/issues/7297 | IC_kwDOAMm_X85O6FaK | benbovy 4160723 | 2022-11-22T15:24:53Z | 2022-11-22T15:36:46Z | MEMBER | The last example in your comment is probably the most meaningful one:
```
<xarray.Dataset>
Dimensions:  (x: 2, midx: 4)
Coordinates:
  * midx     (midx) object MultiIndex
  * x        (midx) int32 1 1 2 2
  * y        (midx) int32 3 4 3 4
Data variables:
    a        (x) int32 6 7
```
To avoid name conflicts, we could just discard the original dimension coordinates x and y. Like here above, "x" becomes a dimension without coordinate. In that example, when unstacking we would retrieve the "x" dimension coordinate like in the original dataset. (note: I think it is now possible to have a dimension "x" and a coordinate "x" with different dimensions, but I haven't checked). |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041 | |
1323820922 | https://github.com/pydata/xarray/issues/7297#issuecomment-1323820922 | https://api.github.com/repos/pydata/xarray/issues/7297 | IC_kwDOAMm_X85O5-d6 | yvogradyent 96822049 | 2022-11-22T15:05:28Z | 2022-11-22T15:06:15Z | NONE | Hi @benbovy, thanks for the reply! Made a mistake in expected, edited that part. In terms of name conflicts, I don't know what the solution should be. It would somehow need to know that the x of the multi-index refers to the same label of the original x coordinate but is, due to stacking, just a repeated version of it? |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041 | |
1323478134 | https://github.com/pydata/xarray/issues/7297#issuecomment-1323478134 | https://api.github.com/repos/pydata/xarray/issues/7297 | IC_kwDOAMm_X85O4qx2 | benbovy 4160723 | 2022-11-22T10:50:01Z | 2022-11-22T10:50:01Z | MEMBER | Interesting! I don't think that when adding stack / unstack we were thinking that variables with only a subset of the stacked dimensions would be a common use case. I guess it would be possible to add some option to stack only the variables that have all the dimensions to be stacked, and leave the other variables unchanged? However, one problem with keeping the original dimension coordinates is that we would have name conflicts between the single index coordinates and the multi-index coordinates. In your expected example, the "x" coordinate is part of the multi-index but it doesn't have the same dimension "midx"? I find it rather confusing. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
stack().unstack() not the same as original for datavars dependent on single coordinate of multi_index 1454832041 |
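
For readers who want to try the round trip discussed in the comments above, here is a minimal sketch. It is not taken from the issue itself. The dataset is rebuilt from the values quoted in the thread (a = 6 7, b = 10 20 30 40, x = 1 2, y = 3 4), and the definition of c as a + b is an assumption that merely happens to reproduce the quoted values 16 26 37 47. The sketch prints the dimensions of each variable before stacking, after stacking, and after unstacking, so the behaviour of the installed xarray version can be inspected directly.

```python
import numpy as np
import xarray as xr

# Dataset rebuilt from the values quoted in the thread (an assumption,
# the original reproduction script is not part of this page).
ds = xr.Dataset(
    data_vars={
        "a": ("x", np.array([6, 7])),                       # depends on x only
        "b": (("x", "y"), np.array([[10, 20], [30, 40]])),  # depends on x and y
    },
    coords={"x": [1, 2], "y": [3, 4]},
)

# Assumed definition of c: broadcasting a (x) against b (x, y).
ds["c"] = ds["a"] + ds["b"]

# Stack x and y into one multi-indexed dimension, then undo it.
stacked = ds.stack(midx=("x", "y"))
roundtrip = stacked.unstack("midx")

# Compare dimensions per variable: original, stacked, round-tripped.
for name in ds.data_vars:
    print(name, ds[name].dims, stacked[name].dims, roundtrip[name].dims)

# True only if the round trip reproduces the original dataset exactly.
print("identical after round trip:", ds.identical(roundtrip))
```

If the dimensions printed for a differ between the first and last column, that is the mismatch the issue title describes.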
CREATE TABLE [issue_comments] ( [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT, [user] INTEGER REFERENCES [users]([id]), [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT, [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT, [issue] INTEGER REFERENCES [issues]([id]) ); CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]); CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);