
issue_comments


8 rows where issue = 1293460108 sorted by updated_at descending


user 3

  • benbovy 4
  • lukasbindreiter 3
  • dcherian 1

author_association 2

  • MEMBER 5
  • CONTRIBUTOR 3

issue 1

  • MultiIndex listed multiple times in Dataset.indexes property · 8

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1236935368 https://github.com/pydata/xarray/issues/6752#issuecomment-1236935368 https://api.github.com/repos/pydata/xarray/issues/6752 IC_kwDOAMm_X85JuiLI benbovy 4160723 2022-09-05T12:21:54Z 2022-09-05T12:21:54Z MEMBER

> That can probably be closed then, since it was an intentional change.

Yes I think we can close it. Thanks for your feedback and for the issue report!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex listed multiple times in Dataset.indexes property 1293460108
1236931851 https://github.com/pydata/xarray/issues/6752#issuecomment-1236931851 https://api.github.com/repos/pydata/xarray/issues/6752 IC_kwDOAMm_X85JuhUL benbovy 4160723 2022-09-05T12:19:08Z 2022-09-05T12:19:08Z MEMBER

> But finding information about those changes right now was not so easy, is there some resource available where I can read up about the changes to indexes and functions related to them.

Not yet, this still has to be detailed in the documentation (tracked in #6293 along with other todo items related to indexes). The Indexes API already has some basic docstrings, though: https://github.com/pydata/xarray/blob/main/xarray/core/indexes.py#L1008-L1225

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex listed multiple times in Dataset.indexes property 1293460108
1236917784 https://github.com/pydata/xarray/issues/6752#issuecomment-1236917784 https://api.github.com/repos/pydata/xarray/issues/6752 IC_kwDOAMm_X85Jud4Y lukasbindreiter 21131639 2022-09-05T12:07:13Z 2022-09-05T12:07:13Z CONTRIBUTOR

Thanks for the suggestions, I'll look into Indexes.group_by_index and see whether it resolves our issue.

With regard to the (de)serialization: I haven't yet investigated how the index changes in 2022.6 affect our initial use case. The suggested approach with a custom backend may even be a better solution for us. Ideally, though, we would need a way to override NetCDF4BackendEntrypoint with another entrypoint as the default for .nc files.

As for the original issue discussed here: that can probably be closed then, since it was an intentional change. Finding information about those changes was not so easy, though. Is there some resource available where I can read up about the changes to indexes and the functions related to them? (For example, I was unaware of the existence of xindexes or Indexes.group_by_index.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex listed multiple times in Dataset.indexes property 1293460108
1236756285 https://github.com/pydata/xarray/issues/6752#issuecomment-1236756285 https://api.github.com/repos/pydata/xarray/issues/6752 IC_kwDOAMm_X85Jt2c9 benbovy 4160723 2022-09-05T09:26:36Z 2022-09-05T09:26:36Z MEMBER

Thanks for the issue report @lukasbindreiter, I opened #6987. As a workaround, you could use Indexes.group_by_index(), which shouldn't have any hash issue and which might be better fitted for your use case.

Regarding (de)serialization from/to netCDF or other formats, I wonder whether building multi-indexes or other custom indexes when opening the dataset could be done via a custom Xarray IO backend (https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html). I'm not sure how easy or hard it is to implement a custom backend on top of an existing one, though. For the serialization, Xarray doesn't support custom writable backends (yet), but since multi-index levels are now real coordinates, a custom backend may not really be needed. Right now Xarray raises NotImplementedError when trying to save a variable wrapping a multi-index, but we could probably just drop the multi-index "dimension" coordinate (tuple elements) and save the level coordinates like any other variable.
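
To illustrate the Indexes.group_by_index() workaround mentioned above, here is a minimal sketch assuming xarray 2022.6.0 or later. The toy dataset and the names `ds`, `groups` and `summary` are made up for this example. Each group pairs one index object with all the coordinates that share it, so a multi-index shows up once rather than once per level.

```python
import xarray as xr

# Hypothetical toy dataset: stacking "a" and "b" creates a multi-index on "z".
ds = xr.Dataset(
    {"v": (("a", "b"), [[1, 2], [3, 4]])},
    coords={"a": [0, 1], "b": ["u", "v"]},
).stack(z=("a", "b"))

# group_by_index() returns a list of (index, {coordinate name: variable}) pairs,
# with one entry per distinct index object.
groups = ds.xindexes.group_by_index()

# Summarise each group as (index class name, sorted coordinate names).
summary = [(type(index).__name__, sorted(coords)) for index, coords in groups]
print(summary)
```

Because the grouping yields the coordinate names together with the index, it may be a better fit than get_unique() when the code also needs to know which level coordinates belong to each multi-index.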

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex listed multiple times in Dataset.indexes property 1293460108
1236717180 https://github.com/pydata/xarray/issues/6752#issuecomment-1236717180 https://api.github.com/repos/pydata/xarray/issues/6752 IC_kwDOAMm_X85Jts58 lukasbindreiter 21131639 2022-09-05T08:51:24Z 2022-09-05T08:51:24Z CONTRIBUTOR

@benbovy I also just tested the get_unique() method that you mentioned and maybe noticed a related issue here, which I'm not sure is wanted / expected.

Taking the above dataset ds, accessing this function results in an error:

```python
ds.indexes.get_unique()
# TypeError: unhashable type: 'MultiIndex'
```

However, for xindexes it works:

```python
ds.xindexes.get_unique()
# [<xarray.core.indexes.PandasMultiIndex at 0x7f105bf1df20>]
```
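
The error message points at hashing: pandas index objects raise TypeError when passed to hash(), so any de-duplication that relies on a set or dict of the indexes themselves will fail. The sketch below only demonstrates that behaviour with plain pandas. The helper `unique_by_identity` is hypothetical and shows one hash-free way to de-duplicate; it is not a description of how xarray implements get_unique().

```python
import pandas as pd

midx = pd.MultiIndex.from_product([[0, 1], ["u", "v"]], names=["a", "b"])

# pandas Index objects (including MultiIndex) raise TypeError on hash().
try:
    hash(midx)
    hashable = True
    message = ""
except TypeError as err:
    hashable = False
    message = str(err)


def unique_by_identity(objs):
    """Return the distinct objects in `objs`, compared by identity instead of hash."""
    seen_ids = set()
    unique = []
    for obj in objs:
        if id(obj) not in seen_ids:
            seen_ids.add(id(obj))
            unique.append(obj)
    return unique


# The same MultiIndex listed several times collapses to a single entry.
deduplicated = unique_by_identity([midx, midx, midx])
print(hashable, message, len(deduplicated))
```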

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex listed multiple times in Dataset.indexes property 1293460108
1236714136 https://github.com/pydata/xarray/issues/6752#issuecomment-1236714136 https://api.github.com/repos/pydata/xarray/issues/6752 IC_kwDOAMm_X85JtsKY lukasbindreiter 21131639 2022-09-05T08:48:23Z 2022-09-05T08:48:23Z CONTRIBUTOR

We used the .indexes property to implement a workaround for serializing Datasets containing multiindices to netCDF. For this the implementation basically looked like this: (Inspired by and related to this issue: https://github.com/pydata/xarray/issues/1077)

Saving dataset as NetCDF:

1. Loop over dataset.indexes
2. Check if index is a multi index
3. If so, encode it somehow and save it as attribute in the dataset
4. Reset (remove) the index
5. Now save this "patched" dataset as NetCDF

And then loading it again:

1. Load the dataset
2. Check if this special attribute exists
3. If so decode the multiindex and set it as index in the dataset
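
The steps above can be sketched roughly as follows. This is not our actual code: the attribute name `multiindex_levels`, the JSON encoding and both helper functions are assumptions made for illustration, and the netCDF write/read itself is left out so the round trip happens in memory.

```python
import json

import pandas as pd
import xarray as xr

ATTR = "multiindex_levels"  # hypothetical attribute name


def encode_multiindexes(ds):
    """Record the level names of each multi-index in an attribute, then reset it."""
    levels = {}
    for name, idx in ds.indexes.items():
        # Since 2022.6.0 level coordinates are listed too; keep dimension entries only.
        if isinstance(idx, pd.MultiIndex) and name in ds.sizes:
            levels[name] = list(idx.names)
    out = ds.reset_index(list(levels))
    return out.assign_attrs({ATTR: json.dumps(levels)})


def decode_multiindexes(ds):
    """Rebuild the multi-indexes described by the attribute, if it is present."""
    levels = json.loads(ds.attrs.get(ATTR, "{}"))
    out = ds.set_index(levels) if levels else ds.copy()
    out.attrs = {k: v for k, v in out.attrs.items() if k != ATTR}
    return out


# Toy dataset with a multi-index on "z" built from "a" and "b".
ds = xr.Dataset(
    {"v": (("a", "b"), [[1, 2], [3, 4]])},
    coords={"a": [0, 1], "b": ["u", "v"]},
).stack(z=("a", "b"))

encoded = encode_multiindexes(ds)   # this is what would be written with to_netcdf()
decoded = decode_multiindexes(encoded)
```

The `name in ds.sizes` check is what adapts the loop to the new behaviour, where the same multi-index is listed under the dimension name and under each level name.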

When testing the pre-release version I noticed some of our tests failing, which is why I raised this issue in the first place - in case those changes were unwanted. I was not aware that you were actively working on multi index changes and therefore expecting API changes here. With that in mind I'll probably be able to adapt our code to this new API of indexes and xindexes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex listed multiple times in Dataset.indexes property 1293460108
1233123855 https://github.com/pydata/xarray/issues/6752#issuecomment-1233123855 https://api.github.com/repos/pydata/xarray/issues/6752 IC_kwDOAMm_X85Jf_oP benbovy 4160723 2022-08-31T15:55:28Z 2022-08-31T15:55:28Z MEMBER

The change is because starting from version 2022.6.0, multi-index level coordinates are no longer "virtual" but now correspond to real coordinates. The .indexes and .xindexes properties are mappings relating coordinates to their index.

There were some discussions prior to the explicit indexes refactor about whether those properties should return a mapping of unique vs. non-unique index objects. We chose the latter as it simplifies a lot of things internally (and perhaps externally too).

@lukasbindreiter although it is unlikely that we'll change this in the future, it would be interesting to get your feedback! How does this choice impact your workflow?

Note that both .indexes and .xindexes return an Indexes object, which has a convenient .get_unique() method that returns a list of unique index objects. It also has other convenient methods, although those are not well documented yet.
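
As a small illustration of the mapping described above, assuming xarray 2022.6.0 or later (the toy dataset and variable names are made up for this example): the multi-index appears under its dimension name and under each level name, while .get_unique() reduces those entries to the distinct index objects.

```python
import xarray as xr

# Hypothetical toy dataset: stacking "a" and "b" creates a multi-index on "z".
ds = xr.Dataset(
    {"v": (("a", "b"), [[1, 2], [3, 4]])},
    coords={"a": [0, 1], "b": ["u", "v"]},
).stack(z=("a", "b"))

# One mapping entry per coordinate, all referring to the same index object.
coord_names = list(ds.xindexes)

# get_unique() returns a list of distinct index objects.
unique_indexes = ds.xindexes.get_unique()

print(coord_names)
print(len(unique_indexes))
```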

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex listed multiple times in Dataset.indexes property 1293460108
1175224966 https://github.com/pydata/xarray/issues/6752#issuecomment-1175224966 https://api.github.com/repos/pydata/xarray/issues/6752 IC_kwDOAMm_X85GDIKG dcherian 2448579 2022-07-05T15:59:00Z 2022-07-05T15:59:00Z MEMBER

Thanks for trying out our pre-release, @lukasbindreiter!

This is an intentional change.

Can you tell us more about why this breaks your code?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex listed multiple times in Dataset.indexes property 1293460108


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 12.689ms · About: xarray-datasette