issue_comments


5 rows where issue = 510844652 sorted by updated_at descending


user 5

  • Hoeze 1
  • shoyer 1
  • benbovy 1
  • max-sixty 1
  • crusaderky 1

author_association 2

  • MEMBER 4
  • NONE 1

issue 1

  • Scalar slice of MultiIndex is turned to tuples · 5
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
919981497 https://github.com/pydata/xarray/issues/3432#issuecomment-919981497 https://api.github.com/repos/pydata/xarray/issues/3432 IC_kwDOAMm_X8421c25 benbovy 4160723 2021-09-15T12:38:29Z 2021-09-15T12:38:29Z MEMBER

@Hoeze this is now implemented in #5692 (stack is not yet refactored so I reproduced your example in a slightly different way):

```python
>>> stacked.isel(observations=1)
<xarray.Dataset>
Dimensions:       (genes: 2)
Coordinates:
  * genes         (genes) <U1 'a' 'b'
    observations  object ('c', 'f')
    individuals   <U1 'c'
    subtissues    <U1 'f'
Data variables:
    test          (genes) int64 2 2
```

{
    "total_count": 2,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 1,
    "eyes": 0
}
  Scalar slice of MultiIndex is turned to tuples 510844652
545197414 https://github.com/pydata/xarray/issues/3432#issuecomment-545197414 https://api.github.com/repos/pydata/xarray/issues/3432 MDEyOklzc3VlQ29tbWVudDU0NTE5NzQxNA== shoyer 1217238 2019-10-22T23:24:45Z 2019-10-22T23:24:45Z MEMBER

I think the right long-term solution for xarray is to always store separate Variable objects for MultiIndex levels, and only use the MultiIndex for proper indexing. When you index out a single value, the MultiIndex will naturally disappear and you'll be left with a bunch of scalar coordinates, without any special case logic to handle the MultiIndex.

This looks like @crusaderky's third option.

We'll need to finish up the big "explicit indexes" refactor first to make this viable.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar slice of MultiIndex is turned to tuples 510844652
545155786 https://github.com/pydata/xarray/issues/3432#issuecomment-545155786 https://api.github.com/repos/pydata/xarray/issues/3432 MDEyOklzc3VlQ29tbWVudDU0NTE1NTc4Ng== crusaderky 6213168 2019-10-22T21:07:19Z 2019-10-22T21:07:19Z MEMBER

Not a regression. I've gone back as far as xarray 0.12 and pandas 0.19 and it's always been like this. I agree it's bad and needs to be fixed though.

The issue is inherited straight from pandas:

```python
>>> df = stacked.test.to_pandas()
>>> df
individuals  c     d
subtissues   e  f  e  f
genes
a            1  2  3  4
b            1  2  3  4
>>> df.iloc[:, 1]
genes
a    2
b    2
Name: (c, f), dtype: int64
```

I'm not sure if we should write an ad-hoc object in xarray for scalar multiindices.

The alternative is to think of a more systematic solution in pandas, which likely implies creating an ad-hoc subclass of tuple which is basically a pickle-able namedtuple. It must be a subclass of tuple otherwise it will break things for a lot of people around the world (the userbase of pandas is MUCH larger than xarray's). And it must be serializable for obvious reasons.
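As a rough illustration of what a "pickle-able namedtuple" key could look like, here is a minimal sketch using only the standard library. The class name `ObservationKey` and the helper `roundtrip` are hypothetical; the level names come from the example in this thread. This is not an existing pandas or xarray API.

```python
import pickle
from collections import namedtuple

# Hypothetical key type: one field per MultiIndex level.
ObservationKey = namedtuple("ObservationKey", ["individuals", "subtissues"])

key = ObservationKey(individuals="c", subtissues="f")

# It is still a real tuple, so code that expects tuples keeps working.
is_tuple = isinstance(key, tuple)
equals_plain_tuple = key == ("c", "f")

# The level names are available in addition to the positions.
as_mapping = key._asdict()


def roundtrip(obj):
    """Serialize obj with pickle and load it back."""
    return pickle.loads(pickle.dumps(obj))
```

Calling `roundtrip(key)` from a script or module where `ObservationKey` is defined at top level returns an equal key. Pickle locates the class by module and name, so a class generated on the fly for each MultiIndex would probably need extra care, which may be part of why this option is described as a large change.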

In both cases, the size of this change is very large.

The third and significantly easier option is that, on sel/isel, xarray should automatically unstack any scalar slices of a multiindex. This means that the 'observations' coord would simply disappear, leaving only 'individuals' and 'subtissues'. However, it would carry the problem that, if one cuts a scalar slice and a vector slice from the dimension, one won't be able to concatenate them back together.
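The idea behind this third option can be sketched at the pandas level. The snippet below rebuilds the table from earlier in this comment, takes the same scalar slice, and then splits the resulting tuple label into one scalar value per level. The name `scalar_coords` is hypothetical, and this only mimics the proposed behaviour; it is not how xarray implements it.

```python
import pandas as pd

# Rebuild the 2-D table shown earlier: genes as rows and a two-level
# (individuals, subtissues) MultiIndex as columns.
columns = pd.MultiIndex.from_product(
    [["c", "d"], ["e", "f"]], names=["individuals", "subtissues"]
)
df = pd.DataFrame(
    [[1, 2, 3, 4], [1, 2, 3, 4]],
    index=pd.Index(["a", "b"], name="genes"),
    columns=columns,
)

# A scalar slice keeps the MultiIndex entry only as a plain tuple label.
column = df.iloc[:, 1]
label = column.name

# Hypothetical "unstack on scalar selection": pair each level name with
# its value, so that each level becomes its own scalar coordinate.
scalar_coords = dict(zip(df.columns.names, label))
```

Once the label is split this way, the combined 'observations' entry no longer exists, which is the source of the concatenation problem mentioned above.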

@shoyer what's your opinion?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar slice of MultiIndex is turned to tuples 510844652
545134181 https://github.com/pydata/xarray/issues/3432#issuecomment-545134181 https://api.github.com/repos/pydata/xarray/issues/3432 MDEyOklzc3VlQ29tbWVudDU0NTEzNDE4MQ== Hoeze 1200058 2019-10-22T20:14:38Z 2019-10-22T20:17:26Z NONE

@max-sixty here you go:

```python3
import xarray as xr

print(xr.__version__)

ds = xr.Dataset({
    "test": xr.DataArray(
        [[[1, 2], [3, 4]], [[1, 2], [3, 4]]],
        dims=("genes", "individuals", "subtissues"),
        coords={
            "genes": ["a", "b"],
            "individuals": ["c", "d"],
            "subtissues": ["e", "f"],
        },
    )
})
print(ds)

stacked = ds.stack(observations=["individuals", "subtissues"])
print(stacked)

print(stacked.isel(observations=1))
```

result:

    <xarray.Dataset>
    Dimensions:       (genes: 2)
    Coordinates:
      * genes         (genes) <U1 'a' 'b'
        observations  object ('c', 'f')
    Data variables:
        test          (genes) int64 2 2

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar slice of MultiIndex is turned to tuples 510844652
545119739 https://github.com/pydata/xarray/issues/3432#issuecomment-545119739 https://api.github.com/repos/pydata/xarray/issues/3432 MDEyOklzc3VlQ29tbWVudDU0NTExOTczOQ== max-sixty 5635139 2019-10-22T19:36:51Z 2019-10-22T19:36:51Z MEMBER

Do you have a reproducible example, as per the issue instructions?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Scalar slice of MultiIndex is turned to tuples 510844652


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
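For readers who want to reproduce the view at the top of this page ("5 rows where issue = 510844652 sorted by updated_at descending"), the following is a minimal sketch using Python's built-in `sqlite3` module. It assumes a trimmed-down copy of the schema above (only four columns, no foreign-key references) in an in-memory database, and loads only the ids, user ids and timestamps listed on this page. The exact query Datasette runs is not shown here, so the SQL below is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified version of the issue_comments schema (assumption: only the
# columns needed for this query, without REFERENCES clauses).
conn.execute(
    """
    CREATE TABLE issue_comments (
        id INTEGER PRIMARY KEY,
        user INTEGER,
        updated_at TEXT,
        issue INTEGER
    )
    """
)
conn.execute(
    "CREATE INDEX idx_issue_comments_issue ON issue_comments (issue)"
)

# Rows taken from this page, inserted in arbitrary order.
rows = [
    (545134181, 1200058, "2019-10-22T20:17:26Z", 510844652),
    (919981497, 4160723, "2021-09-15T12:38:29Z", 510844652),
    (545119739, 5635139, "2019-10-22T19:36:51Z", 510844652),
    (545197414, 1217238, "2019-10-22T23:24:45Z", 510844652),
    (545155786, 6213168, "2019-10-22T21:07:19Z", 510844652),
]
conn.executemany(
    "INSERT INTO issue_comments (id, user, updated_at, issue) VALUES (?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps in the same format sort correctly as plain text.
result = conn.execute(
    "SELECT id, updated_at FROM issue_comments "
    "WHERE issue = ? ORDER BY updated_at DESC",
    (510844652,),
).fetchall()
ids = [row[0] for row in result]
```

The `idx_issue_comments_issue` index is what lets the `WHERE issue = ?` filter avoid a full table scan on the real, much larger table.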