
issue_comments


5 rows where author_association = "CONTRIBUTOR" and issue = 490476815 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
604430213 https://github.com/pydata/xarray/issues/3287#issuecomment-604430213 https://api.github.com/repos/pydata/xarray/issues/3287 MDEyOklzc3VlQ29tbWVudDYwNDQzMDIxMw== spencerahill 6200806 2020-03-26T13:26:59Z 2020-03-26T13:26:59Z CONTRIBUTOR

Thanks @max-sixty. Contrary to my warning about not doing a PR, I couldn't help myself and dug in a bit. It turns out that string coordinates aren't the problem; the bug appears when the coordinate isn't in sorted order. For example, @chrisroat's original example doesn't error if the coordinate is `["G", "R"]` instead of `["R", "G"]`. A more concrete WIP test:

```python
import pandas as pd
import xarray as xr


def test_stack_groupby_unsorted_coord():
    data = [[0, 1], [2, 3]]
    data_flat = [0, 1, 2, 3]
    dims = ["y", "x"]
    y_vals = [2, 3]

    # "y" coord is in sorted order, and everything works
    arr = xr.DataArray(data, dims=dims, coords={"y": y_vals})
    actual1 = arr.stack(z=["y", "x"]).groupby("z").first()
    midx = pd.MultiIndex.from_product([[2, 3], [0, 1]], names=dims)
    expected1 = xr.DataArray(data_flat, dims=["z"], coords={"z": midx})
    xr.testing.assert_equal(actual1, expected1)

    # Now "y" coord is NOT in sorted order, and the bug appears
    arr = xr.DataArray(data, dims=dims, coords={"y": y_vals[::-1]})
    actual2 = arr.stack(z=["y", "x"]).groupby("z").first()
    midx = pd.MultiIndex.from_product([[3, 2], [0, 1]], names=dims)
    expected2 = xr.DataArray(data_flat, dims=["z"], coords={"z": midx})
    xr.testing.assert_equal(actual2, expected2)
```

`test_stack_groupby_unsorted_coord()` yields

```python
AssertionError                            Traceback (most recent call last)

[...]

AssertionError: Left and right DataArray objects are not equal

Differing values:
L
    array([2, 3, 0, 1])
R
    array([0, 1, 2, 3])
Differing coordinates:
L * z         (z) MultiIndex
  - z_leve... (z) int64 2 2 3 3
  - z_leve... (z) int64 0 1 0 1
R * z         (z) MultiIndex
  - y         (z) int64 3 3 2 2
  - x         (z) int64 0 1 0 1
```
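The reordering in the failing half of the test can be illustrated without xarray. The sketch below is library-free and only assumes that groups come back in sorted label order; it makes no claim about how xarray implements this internally. The names `labels`, `values`, and `order` are illustrative, and `itertools.product` stands in for `pd.MultiIndex.from_product`.

```python
from itertools import product

# Same layout as the failing case: "y" is [3, 2] (unsorted), "x" is [0, 1].
labels = list(product([3, 2], [0, 1]))  # (y, x) pairs in stacked order
values = [0, 1, 2, 3]                   # data_flat from the test

# Assumption: groups are emitted in sorted label order.
order = sorted(range(len(labels)), key=lambda i: labels[i])
values_sorted = [values[i] for i in order]
labels_sorted = [labels[i] for i in order]

print(values_sorted)  # [2, 3, 0, 1]
print(labels_sorted)  # [(2, 0), (2, 1), (3, 0), (3, 1)]
```

Under that assumption the values come out as `[2, 3, 0, 1]`, which matches the left-hand array in the assertion error, while the expected array keeps the original stacked order.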

I'll return to this tomorrow. In the meantime, if this triggers any thoughts about the best path forward, that would be much appreciated!

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  GroupBy of stacked dim with strings renames underlying dims 490476815
603879881 https://github.com/pydata/xarray/issues/3287#issuecomment-603879881 https://api.github.com/repos/pydata/xarray/issues/3287 MDEyOklzc3VlQ29tbWVudDYwMzg3OTg4MQ== spencerahill 6200806 2020-03-25T14:44:15Z 2020-03-25T14:46:14Z CONTRIBUTOR

Notice that the string coordinate also gets reordered alphabetically: in @chrisroat's example above, the coord goes from `['R', 'G']` to `['G', 'R']`.

@max-sixty I can't promise a PR anytime soon, but if/when I do manage, where would be a good starting point? Perhaps here where the _level_ names are introduced: https://github.com/pydata/xarray/blob/009aa66620b3437cf0de675013fa7d1ff231963c/xarray/core/dataset.py#L251-L256

Edit: actually maybe here: https://github.com/pydata/xarray/blob/9eec56c833da6dca02c3e6c593586fd201a534a0/xarray/core/variable.py#L2237-L2249

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  GroupBy of stacked dim with strings renames underlying dims 490476815
603490643 https://github.com/pydata/xarray/issues/3287#issuecomment-603490643 https://api.github.com/repos/pydata/xarray/issues/3287 MDEyOklzc3VlQ29tbWVudDYwMzQ5MDY0Mw== spencerahill 6200806 2020-03-24T20:34:58Z 2020-03-24T20:34:58Z CONTRIBUTOR

Here's a quick and dirty workaround that works at least for my use case. `arr_orig` is the original DataArray, and `arr_unstacked_bad` is what it becomes after a stack/groupby/apply/unstack chain, which yields the `_level_0` etc. dims. The stack call is assumed to have been `arr_orig.stack(**{dim_of_stack: dims_stacked})`. Likely excessively convoluted, and YMMV.

```python
def fix_unstacked_dims(arr_unstacked_bad, arr_orig, dim_of_stack, dims_stacked):
    """Workaround for xarray bug involving stacking str-based coords.

    C.f. https://github.com/pydata/xarray/issues/3287

    """
    # Dims that were never part of the stack keep their original names.
    dims_not_stacked = [dim for dim in arr_orig.dims if dim not in dims_stacked]
    # Whatever is left over are the renamed (e.g. "_level_0") dims.
    stacked_dims_after_unstack = [dim for dim in arr_unstacked_bad.dims
                                  if dim not in dims_not_stacked]
    # Map the renamed dims back to the original names, in order.
    dims_mapping = {d1: d2 for d1, d2 in zip(stacked_dims_after_unstack, dims_stacked)}
    arr_unstacked_bad = arr_unstacked_bad.rename(dims_mapping)

    # Copy the values into a template with the original dims and coords.
    arr_out = arr_orig.copy(deep=True)
    arr_out.values = arr_unstacked_bad.transpose(*arr_orig.dims).values
    return arr_out.assign_coords(arr_orig.coords)
```
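A usage sketch for the workaround, under stated assumptions: the "bad" array is built by hand to mimic the `z_level_0`/`z_level_1` naming rather than produced by the buggy chain itself, the `* 10` stands in for whatever the apply step computes, and the function is repeated in compact form only so that the block runs on its own.

```python
import numpy as np
import xarray as xr


def fix_unstacked_dims(arr_unstacked_bad, arr_orig, dim_of_stack, dims_stacked):
    """Compact copy of the workaround, repeated so this sketch runs alone."""
    dims_not_stacked = [d for d in arr_orig.dims if d not in dims_stacked]
    leftover = [d for d in arr_unstacked_bad.dims if d not in dims_not_stacked]
    renamed = arr_unstacked_bad.rename(dict(zip(leftover, dims_stacked)))
    arr_out = arr_orig.copy(deep=True)
    arr_out.values = renamed.transpose(*arr_orig.dims).values
    return arr_out.assign_coords(arr_orig.coords)


arr_orig = xr.DataArray(
    np.arange(4).reshape(2, 2),
    dims=["y", "x"],
    coords={"y": ["G", "R"], "x": [0, 1]},
)

# Hand-built stand-in (an assumption) for the result of the
# stack/groupby/apply/unstack chain: same element order, renamed dims.
arr_unstacked_bad = xr.DataArray(
    arr_orig.values * 10,
    dims=["z_level_0", "z_level_1"],
    coords={"z_level_0": ["G", "R"], "z_level_1": [0, 1]},
)

fixed = fix_unstacked_dims(arr_unstacked_bad, arr_orig, "z", ["y", "x"])
print(fixed.dims)
print(fixed.values.tolist())
```

Note that the function copies values by position and then reattaches the original coordinates, so it relies on the unstacked array having the same element order as `arr_orig`. If the chain also reorders the labels, the values would need to be realigned by label first.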

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  GroupBy of stacked dim with strings renames underlying dims 490476815
603469718 https://github.com/pydata/xarray/issues/3287#issuecomment-603469718 https://api.github.com/repos/pydata/xarray/issues/3287 MDEyOklzc3VlQ29tbWVudDYwMzQ2OTcxOA== spencerahill 6200806 2020-03-24T19:48:57Z 2020-03-24T19:48:57Z CONTRIBUTOR

Same or different problem as https://github.com/pydata/xarray/issues/1483?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  GroupBy of stacked dim with strings renames underlying dims 490476815
603468912 https://github.com/pydata/xarray/issues/3287#issuecomment-603468912 https://api.github.com/repos/pydata/xarray/issues/3287 MDEyOklzc3VlQ29tbWVudDYwMzQ2ODkxMg== spencerahill 6200806 2020-03-24T19:47:16Z 2020-03-24T19:47:16Z CONTRIBUTOR

I just bumped into this problem as well. xarray 0.15.0. Expected behavior? Bug?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  GroupBy of stacked dim with strings renames underlying dims 490476815


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
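
The filter this page describes (rows where author_association = "CONTRIBUTOR" and issue = 490476815, sorted by updated_at descending) can be tried against a cut-down copy of this schema. The following is a minimal sketch using Python's built-in `sqlite3` with an in-memory database. It keeps only a subset of the columns, drops the foreign-key references, and loads three of the ids and timestamps from the rows above plus one made-up non-matching row; it is not the query Datasette itself generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [author_association] TEXT,
       [issue] INTEGER
    )
    """
)
conn.execute(
    "CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue])"
)

rows = [
    (603468912, 6200806, "2020-03-24T19:47:16Z", "CONTRIBUTOR", 490476815),
    (604430213, 6200806, "2020-03-26T13:26:59Z", "CONTRIBUTOR", 490476815),
    (603879881, 6200806, "2020-03-25T14:46:14Z", "CONTRIBUTOR", 490476815),
    # Hypothetical row, not from the page: it should be filtered out.
    (1, 1, "2020-03-27T00:00:00Z", "MEMBER", 490476815),
]
conn.executemany("INSERT INTO [issue_comments] VALUES (?, ?, ?, ?, ?)", rows)

query = """
    SELECT [id], [updated_at]
    FROM [issue_comments]
    WHERE [author_association] = ? AND [issue] = ?
    ORDER BY [updated_at] DESC
"""
result = conn.execute(query, ("CONTRIBUTOR", 490476815)).fetchall()
for comment_id, updated_at in result:
    print(comment_id, updated_at)
```

Because the timestamps are ISO 8601 strings in a single time zone, sorting the `TEXT` column lexicographically gives chronological order.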