issue_comments


11 rows where user = 5308236 sorted by updated_at descending


issue 6

  • single coordinate is overwritten with dimension by set_index 4
  • Efficient workaround to group by multiple dimensions 2
  • DataArray.sel extremely slow 2
  • TypeError on DataArray.stack() if any of the dimensions to be stacked has a MultiIndex 1
  • groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1
  • groupby and mean on a MultiIndex level raises ValueError 1

user 1

  • mschrimpf · 11

author_association 1

  • NONE 11
id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
1497928848 https://github.com/pydata/xarray/issues/6836#issuecomment-1497928848 https://api.github.com/repos/pydata/xarray/issues/6836 IC_kwDOAMm_X85ZSJSQ mschrimpf 5308236 2023-04-05T18:23:51Z 2023-04-05T18:23:51Z NONE

Is there hope for groupby working on multi-indexed DataArrays again in the future? We are currently pinning xarray<2022.6 (and, judging by the issue history, others are too), even though we would love to use newer versions.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby(multi-index level) not working correctly on a multi-indexed DataArray or DataSet 1318992926
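The failing pattern described in this thread can be sketched as follows. This is a minimal reproduction with made-up data: on the affected versions (2022.6 through 2022.11) the groupby call raised a ValueError, while recent releases handle it again.

```python
import numpy as np
import xarray as xr

# Build a DataArray whose "points" dimension carries a MultiIndex (a, b).
da = xr.DataArray(
    np.arange(4.0),
    dims="points",
    coords={"a": ("points", [0, 0, 1, 1]), "b": ("points", [0, 1, 0, 1])},
).set_index(points=["a", "b"])

# Group by one level of the MultiIndex and reduce; this is the call that
# regressed in xarray 2022.6 through 2022.11.
result = da.groupby("a").mean()
```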
1312207609 https://github.com/pydata/xarray/issues/7282#issuecomment-1312207609 https://api.github.com/repos/pydata/xarray/issues/7282 IC_kwDOAMm_X85ONrL5 mschrimpf 5308236 2022-11-11T21:32:42Z 2022-11-11T21:32:42Z NONE

This error occurs in versions 2022.6 through 2022.11. The code worked fine in previous versions, e.g. in 2022.3.0 it executes as expected. Related: https://github.com/pydata/xarray/issues/6836.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  groupby and mean on a MultiIndex level raises ValueError 1445905299
720612094 https://github.com/pydata/xarray/issues/2537#issuecomment-720612094 https://api.github.com/repos/pydata/xarray/issues/2537 MDEyOklzc3VlQ29tbWVudDcyMDYxMjA5NA== mschrimpf 5308236 2020-11-02T17:24:32Z 2020-11-02T17:24:32Z NONE

still relevant

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  single coordinate is overwritten with dimension by set_index 376953925
480379773 https://github.com/pydata/xarray/issues/1554#issuecomment-480379773 https://api.github.com/repos/pydata/xarray/issues/1554 MDEyOklzc3VlQ29tbWVudDQ4MDM3OTc3Mw== mschrimpf 5308236 2019-04-05T18:35:12Z 2019-04-05T18:35:12Z NONE

Any updates on this by any chance? (now that there seems to be progress on #1603)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  TypeError on DataArray.stack() if any of the dimensions to be stacked has a MultiIndex 255597950
435676969 https://github.com/pydata/xarray/issues/2537#issuecomment-435676969 https://api.github.com/repos/pydata/xarray/issues/2537 MDEyOklzc3VlQ29tbWVudDQzNTY3Njk2OQ== mschrimpf 5308236 2018-11-04T15:04:03Z 2018-12-01T20:06:11Z NONE

I think the issue stems from this line of code. Removing it leads to the desired MultiIndex, but I'm not sure whether it causes downstream issues; e.g., selecting over coord1 then seems to also squeeze that dimension. Perhaps @benbovy and @jhamman can chime in as the owners of that line.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  single coordinate is overwritten with dimension by set_index 376953925
435560123 https://github.com/pydata/xarray/issues/2537#issuecomment-435560123 https://api.github.com/repos/pydata/xarray/issues/2537 MDEyOklzc3VlQ29tbWVudDQzNTU2MDEyMw== mschrimpf 5308236 2018-11-03T04:37:52Z 2018-11-03T04:37:52Z NONE

I would expect the dimension to become a MultiIndex with a single coordinate:

<xarray.DataArray (dim: 1)>
array([0])
Coordinates:
  * dim      (dim) MultiIndex
  - coord    (dim) int64 0

When there is more than one coordinate that is what happens, but not when there is only a single coordinate.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  single coordinate is overwritten with dimension by set_index 376953925
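The behavior under discussion can be reproduced with a minimal sketch (names taken from the issue). Whether coord survives as a MultiIndex level or is collapsed into dim is exactly the version-dependent bug being reported, so this only shows the setup, not an expected result for that part.

```python
import xarray as xr

# One dimension with a single coordinate attached to it.
da = xr.DataArray([0], dims="dim", coords={"coord": ("dim", [0])})

# Promote the single coordinate to the index of "dim". The issue reports
# that "coord" gets overwritten by "dim" instead of becoming a MultiIndex level.
indexed = da.set_index(dim=["coord"])
```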
435548784 https://github.com/pydata/xarray/issues/2537#issuecomment-435548784 https://api.github.com/repos/pydata/xarray/issues/2537 MDEyOklzc3VlQ29tbWVudDQzNTU0ODc4NA== mschrimpf 5308236 2018-11-03T01:04:50Z 2018-11-03T01:12:53Z NONE

Sorry, I missed a line in the issue; I've added it now. You're right about the .sel: since it does not operate on dimension levels by default, I manually set the index with d.set_index(append=True, inplace=True, dim=['coord']). However, this now gets rid of the coordinate:

<xarray.DataArray (dim: 1)>
array([0])
Coordinates:
  * dim      (dim) int64 0

and thus d.sel(coord=0) does not work, because the coord has been discarded.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  single coordinate is overwritten with dimension by set_index 376953925
426329601 https://github.com/pydata/xarray/issues/2452#issuecomment-426329601 https://api.github.com/repos/pydata/xarray/issues/2452 MDEyOklzc3VlQ29tbWVudDQyNjMyOTYwMQ== mschrimpf 5308236 2018-10-02T15:58:21Z 2018-10-02T15:58:21Z NONE

I posted a manual solution to the multi-dimensional grouping in the stackoverflow thread. Hopefully .sel can be made more efficient, though; it's such an everyday method.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.sel extremely slow 365678022
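As a sketch of a vectorized alternative to many scalar lookups (hypothetical data; this uses xarray's standard pointwise indexing, not anything from the linked stackoverflow answer):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(1000), dims="x", coords={"x": np.arange(1000)})

# One vectorized .sel call replaces a python loop of single-value lookups,
# amortizing the per-call overhead discussed in this thread.
wanted = xr.DataArray([3, 7, 42], dims="points")
subset = da.sel(x=wanted)
```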
426329058 https://github.com/pydata/xarray/issues/2438#issuecomment-426329058 https://api.github.com/repos/pydata/xarray/issues/2438 MDEyOklzc3VlQ29tbWVudDQyNjMyOTA1OA== mschrimpf 5308236 2018-10-02T15:56:53Z 2018-10-02T15:56:53Z NONE

I built a manual solution in the stackoverflow thread. Maybe this helps someone.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Efficient workaround to group by multiple dimensions 363629186
426100334 https://github.com/pydata/xarray/issues/2452#issuecomment-426100334 https://api.github.com/repos/pydata/xarray/issues/2452 MDEyOklzc3VlQ29tbWVudDQyNjEwMDMzNA== mschrimpf 5308236 2018-10-01T23:47:43Z 2018-10-02T14:29:18Z NONE

Thanks @max-sixty, the checks per call make sense, although I still find 0.5 ms insane for a single-value lookup (indexing every single item in the array with a python dict takes about a fiftieth of that).

The reason I'm looking into this is actually multi-dimensional grouping (#2438) which is unfortunately not implemented (the above code is essentially a step towards trying to implement that). Is there a way of vectorizing these calls with that in mind? I.e. apply a method for each group.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.sel extremely slow 365678022
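"Apply a method for each group" can be expressed without per-element .sel calls. A minimal sketch with made-up data (groupby(...).map is the current spelling of the older .apply):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(6.0), dims="x", coords={"g": ("x", [0, 0, 1, 1, 2, 2])}
)

# Apply a function once per group instead of looping with scalar .sel:
# here, center each group on its own mean.
result = da.groupby("g").map(lambda grp: grp - grp.mean())
```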
424444423 https://github.com/pydata/xarray/issues/2438#issuecomment-424444423 https://api.github.com/repos/pydata/xarray/issues/2438 MDEyOklzc3VlQ29tbWVudDQyNDQ0NDQyMw== mschrimpf 5308236 2018-09-25T18:06:11Z 2018-09-26T13:58:02Z NONE

Thanks @shoyer, your comment helped me realize that at least part of the "horribly slow" probably stems from using a DataArray with a MultiIndex. The above code sample takes 5-6 seconds for 1000 b values. When stacking the DataArray beforehand with d = d.stack(adim=['a'], bdim=['b']), it takes around 14 seconds. Both of these are unfortunately very slow compared to indexing in e.g. numpy or pandas.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Efficient workaround to group by multiple dimensions 363629186
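For reference, stacking two dimensions into one MultiIndexed dimension looks like this (hypothetical data; the comment above stacked each dimension separately, which is a different, unusual usage):

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(6).reshape(2, 3),
    dims=("a", "b"),
    coords={"a": [0, 1], "b": [0, 1, 2]},
)

# Collapse (a, b) into a single MultiIndexed dimension "ab".
stacked = da.stack(ab=("a", "b"))

# MultiIndex levels remain selectable by name.
value = stacked.sel(a=1, b=2).item()
```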

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 961.11ms · About: xarray-datasette