issue_comments


9 rows where issue = 1318992926 sorted by updated_at descending


user 5

  • benbovy 4
  • dcherian 2
  • mschrimpf 1
  • emmaai 1
  • FabianHofmann 1

author_association 3

  • MEMBER 6
  • NONE 2
  • CONTRIBUTOR 1

issue 1

  • groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet · 9 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1506444868 https://github.com/pydata/xarray/issues/6836#issuecomment-1506444868 https://api.github.com/repos/pydata/xarray/issues/6836 IC_kwDOAMm_X85ZyoZE emmaai 5643062 2023-04-13T06:53:56Z 2023-04-13T06:53:56Z NONE

I worked around it for now by calling reset_index before the groupby and set_xindex afterwards, in case anyone is looking for a workaround.
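For readers who want to see the shape of that workaround, here is a minimal sketch. It assumes the `mda` array from the reproducer posted earlier in this thread and an xarray release that provides `set_xindex`; the per-group demeaning function is only an illustrative placeholder, not something from the original report.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Multi-indexed array, built as in the reproducer posted in this thread.
midx = pd.MultiIndex.from_product([list("abc"), [0, 1]], names=("one", "two"))
mda = xr.DataArray(np.random.rand(6, 3), [("x", midx), ("y", range(3))])

# 1. Drop the multi-index; "one" and "two" remain as plain coordinates along "x".
flat = mda.reset_index("x")

# 2. Group by the former level. A reduction removes "x", so there is nothing to restore.
summed = flat.groupby("one").sum()

# 3. If the result still has "x" (here: illustrative per-group demeaning),
#    rebuild the multi-index from the two level coordinates.
demeaned = flat.groupby("one").map(lambda g: g - g.mean("x"))
restored = demeaned.set_xindex(["one", "two"])
```

Whether step 3 is needed depends on the operation: only results that keep the `x` dimension have level coordinates left to re-index.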

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1504975778 https://github.com/pydata/xarray/issues/6836#issuecomment-1504975778 https://api.github.com/repos/pydata/xarray/issues/6836 IC_kwDOAMm_X85ZtBui benbovy 4160723 2023-04-12T09:42:39Z 2023-04-12T09:42:39Z MEMBER

A special case sounds reasonable to me as well, as a temporary fix before we look into whether and how we can refactor groupby so that it works with multiple kinds of built-in and/or custom indexes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1498119367 https://github.com/pydata/xarray/issues/6836#issuecomment-1498119367 https://api.github.com/repos/pydata/xarray/issues/6836 IC_kwDOAMm_X85ZS3zH dcherian 2448579 2023-04-05T20:35:20Z 2023-04-05T20:35:20Z MEMBER

I think we could special-case extracting a multiindex level here: https://github.com/pydata/xarray/blob/d4db16699f30ad1dc3e6861601247abf4ac96567/xarray/core/groupby.py#L469

group at that stage should have values ['a', 'a', 'b', 'b', 'c', 'c']

@mschrimpf Can you try that and send in a PR?
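To make the target concrete, here is a small pandas-only sketch of what extracting a single level yields. It reuses the `midx` from the reproducer in this thread and is not the actual change to `groupby.py`.

```python
import pandas as pd

midx = pd.MultiIndex.from_product([list("abc"), [0, 1]], names=("one", "two"))

# Values of a single level, one entry per row of the multi-index.
level_values = midx.get_level_values("one")
print(list(level_values))  # ['a', 'a', 'b', 'b', 'c', 'c']

# Integer group codes and unique labels, the two things a groupby needs.
codes, uniques = level_values.factorize()
print(codes.tolist(), list(uniques))  # [0, 0, 1, 1, 2, 2] ['a', 'b', 'c']
```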

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1497928848 https://github.com/pydata/xarray/issues/6836#issuecomment-1497928848 https://api.github.com/repos/pydata/xarray/issues/6836 IC_kwDOAMm_X85ZSJSQ mschrimpf 5308236 2023-04-05T18:23:51Z 2023-04-05T18:23:51Z NONE

Is there hope for groupby working on multi-indexed DataArrays again in the future? We -- and from the issue history it looks like others too -- are currently pinning xarray<2022.6 even though we would love to use newer versions.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1313748084 https://github.com/pydata/xarray/issues/6836#issuecomment-1313748084 https://api.github.com/repos/pydata/xarray/issues/6836 IC_kwDOAMm_X85OTjR0 benbovy 4160723 2022-11-14T13:55:02Z 2022-11-14T13:55:02Z MEMBER

> we can fix that in safe_cast_to_index()

...we cannot fix that in safe_cast_to_index() (or we can add a parameter to specify the desired result).
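As a rough illustration of the "add a parameter" option, the sketch below shows a casting helper with an extra argument. `cast_to_index` and its `level` parameter are hypothetical names made up for this example; this is not xarray's `safe_cast_to_index`, and it only operates on plain pandas objects.

```python
from typing import Optional

import pandas as pd


def cast_to_index(index: pd.Index, level: Optional[str] = None) -> pd.Index:
    """Hypothetical helper: return the full index, or one level on request."""
    if level is not None and isinstance(index, pd.MultiIndex):
        # The caller (for example a groupby) asked for one level as a flat index.
        return index.get_level_values(level)
    # Default: keep the full (multi-)index, as other callers may expect.
    return index


midx = pd.MultiIndex.from_product([list("abc"), [0, 1]], names=("one", "two"))
full = cast_to_index(midx)              # unchanged MultiIndex
one = cast_to_index(midx, level="one")  # flat Index of the "one" level
```

The default keeps today's behaviour for callers that expect the full multi-index, while groupby could opt in to the single level.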

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1313739883 https://github.com/pydata/xarray/issues/6836#issuecomment-1313739883 https://api.github.com/repos/pydata/xarray/issues/6836 IC_kwDOAMm_X85OThRr benbovy 4160723 2022-11-14T13:49:47Z 2022-11-14T13:49:47Z MEMBER

From #7282 it looks like we need to convert the multi-index level to a single index when casting the group to an index. And from #7105 we can fix that in safe_cast_to_index() (sometimes the full multi-index is expected) so we probably need a special case in groupby.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1255125171 https://github.com/pydata/xarray/issues/6836#issuecomment-1255125171 https://api.github.com/repos/pydata/xarray/issues/6836 IC_kwDOAMm_X85Kz7Cz benbovy 4160723 2022-09-22T14:39:24Z 2022-09-22T14:39:24Z MEMBER

Thanks @emmaai for the issue report and thanks @dcherian and @FabianHofmann for tracking it down.

There is a lot of complexity related to pandas.MultiIndex special cases and it's been difficult to avoid new issues arising during the index refactor.

create_default_index_implicit has some hacks to create xarray objects directly from pandas.MultiIndex instances (e.g., xr.Dataset(coords={"x": pd_midx})) or even from xarray objects wrapping multi-indexes. The error raised here suggests that the issue should be fixed before this call... Probably in safe_cast_to_index indeed.

We should probably avoid using .to_index() internally, or should we even deprecate it? The fact that mda.one.to_index() (in v2022.3.0) doesn't return the same result as mda.indexes["one"] adds more confusion than value. Actually, in the long term I'd be for deprecating all pandas.MultiIndex special cases in Xarray.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1254926860 https://github.com/pydata/xarray/issues/6836#issuecomment-1254926860 https://api.github.com/repos/pydata/xarray/issues/6836 IC_kwDOAMm_X85KzKoM FabianHofmann 19226431 2022-09-22T12:03:52Z 2022-09-22T12:03:52Z CONTRIBUTOR

After trying to dig down further into the code, I saw that grouping over levels seems to be broken generally (up-to-date main branch at time of writing), i.e.

```python
import pandas as pd
import numpy as np
import xarray as xr

midx = pd.MultiIndex.from_product([list("abc"), [0, 1]], names=("one", "two"))
mda = xr.DataArray(np.random.rand(6, 3), [("x", midx), ("y", range(3))])
mda.groupby("one").sum()
```

raises:

```python
  File ".../xarray/xarray/core/_reductions.py", line 5055, in sum
    return self.reduce(

  File ".../xarray/xarray/core/groupby.py", line 1191, in reduce
    return self.map(reduce_array, shortcut=shortcut)

  File ".../xarray/xarray/core/groupby.py", line 1095, in map
    return self._combine(applied, shortcut=shortcut)

  File ".../xarray/xarray/core/groupby.py", line 1127, in _combine
    index, index_vars = create_default_index_implicit(coord)

  File ".../xarray/xarray/core/indexes.py", line 974, in create_default_index_implicit
    index = PandasMultiIndex(array, name)

  File ".../xarray/xarray/core/indexes.py", line 552, in __init__
    raise ValueError(

ValueError: conflicting multi-index level name 'one' with dimension 'one'
```

in the function `create_default_index_implicit`. I am still a bit puzzled how to approach this. Any idea @benbovy?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
1212214557 https://github.com/pydata/xarray/issues/6836#issuecomment-1212214557 https://api.github.com/repos/pydata/xarray/issues/6836 IC_kwDOAMm_X85IQO0d dcherian 2448579 2022-08-11T16:28:10Z 2022-08-11T16:36:35Z MEMBER

@benbovy I tracked this down to

```python
mda.one.to_index()

# v2022.06.0
# MultiIndex([('a', 0), ('a', 1), ('b', 0), ('b', 1), ('c', 0), ('c', 1)],
#            names=['one', 'two'])

# v2022.03.0
# Index(['a', 'a', 'b', 'b', 'c', 'c'], dtype='object', name='x')
```

We call to_index here in safe_cast_to_index: https://github.com/pydata/xarray/blob/f8fee902360f2330ab8c002d54480d357365c172/xarray/core/utils.py#L115-L140

I'm not sure whether the fix should go only in GroupBy specifically or more generally in safe_cast_to_index.

The GroupBy context is https://github.com/pydata/xarray/blob/f8fee902360f2330ab8c002d54480d357365c172/xarray/core/groupby.py#L434

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);