
issue_comments


9 rows where author_association = "MEMBER" and issue = 229370997 sorted by updated_at descending


user 3

  • fujiisoup 4
  • shoyer 3
  • benbovy 2

issue 1

  • Multiindex scalar coords, fixes #1408 · 9 ✖

author_association 1

  • MEMBER · 9 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
303984274 https://github.com/pydata/xarray/pull/1412#issuecomment-303984274 https://api.github.com/repos/pydata/xarray/issues/1412 MDEyOklzc3VlQ29tbWVudDMwMzk4NDI3NA== fujiisoup 6815844 2017-05-25T11:04:55Z 2017-05-25T11:04:55Z MEMBER

Replaced by a new PR, #1426.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multiindex scalar coords, fixes #1408 229370997
302919742 https://github.com/pydata/xarray/pull/1412#issuecomment-302919742 https://api.github.com/repos/pydata/xarray/issues/1412 MDEyOklzc3VlQ29tbWVudDMwMjkxOTc0Mg== fujiisoup 6815844 2017-05-21T07:13:37Z 2017-05-21T07:13:37Z MEMBER

@benbovy Thanks for the valuable comments. I cannot yet fully picture what the actual implementation would look like, but I also think the virtual variable access needs some tricks. This is an essential feature of the MultiIndex coordinate, so I will try to investigate it.

Thanks.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multiindex scalar coords, fixes #1408 229370997
302883780 https://github.com/pydata/xarray/pull/1412#issuecomment-302883780 https://api.github.com/repos/pydata/xarray/issues/1412 MDEyOklzc3VlQ29tbWVudDMwMjg4Mzc4MA== benbovy 4160723 2017-05-20T16:31:43Z 2017-05-20T16:31:43Z MEMBER

I also agree that a MultiIndex wrapper would be the way to go. I think I understand the diagrams, but what remains a bit unclear to me is how this could be implemented.

In particular, how would this wrapper work with IndexVariable?

Currently, IndexVariable wraps either a pandas.Index or a pandas.MultiIndex, and in the latter case IndexVariable.get_level_variable can generate new IndexVariable objects so that MultiIndex levels are accessible as "virtual coordinates".

Would IndexVariable wrap a MultiIndex wrapper instead (levels + scalars), and also be able to generate new scalar Variable objects that would be accessible as virtual coordinates?
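As a rough illustration of the "virtual coordinates" idea above, the following pandas-free sketch derives one list of values per level name from a list of index tuples. The function name `level_values` and the tuple-based representation are assumptions made for illustration; this is not how `IndexVariable.get_level_variable` is implemented in xarray.

```python
from __future__ import annotations

from typing import Any, Sequence


def level_values(
    tuples: Sequence[tuple], names: Sequence[str], level: str
) -> list[Any]:
    """Return the values of one named level for every entry of the index."""
    if level not in names:
        raise KeyError(f"unknown level {level!r}; available: {list(names)}")
    position = list(names).index(level)
    return [entry[position] for entry in tuples]


# Hypothetical index mirroring the 'yx' example from the linked issue.
names = ("y", "x")
yx = [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2), ("b", 3)]

# Each level is exposed under its own name, like a virtual coordinate.
virtual_coords = {name: level_values(yx, names, name) for name in names}
print(virtual_coords["y"])
print(virtual_coords["x"])
```

A wrapper that also stores scalar levels would need a second lookup path for names that are no longer part of the tuples, which is the open question raised above.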

This is maybe slightly off topic, but more generally I'm also wondering how this would co-exist with potentially other kinds of multi-level indexes (see this comment).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multiindex scalar coords, fixes #1408 229370997
302878939 https://github.com/pydata/xarray/pull/1412#issuecomment-302878939 https://api.github.com/repos/pydata/xarray/issues/1412 MDEyOklzc3VlQ29tbWVudDMwMjg3ODkzOQ== shoyer 1217238 2017-05-20T15:11:24Z 2017-05-20T15:11:24Z MEMBER

@fujiisoup Yes, the solution of writing a MultiIndex wrapper for xarray looks much cleaner to me. I like the look of this proposal! (Those diagrams are also very helpful)

I guess this could be implemented as a pandas.MultiIndex along with a list of scalar coordinates?
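A minimal sketch of that idea, assuming a plain-Python stand-in for the index: the class `MultiIndexWithScalars` and its methods are hypothetical names, and a list of tuples takes the place of a real pandas.MultiIndex. It only shows how levels and scalar coordinates could live side by side and how a scalar selection could be reversed.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class MultiIndexWithScalars:
    """Hypothetical wrapper: index tuples for varying levels plus scalar levels."""

    names: tuple[str, ...]
    tuples: list[tuple]
    scalars: dict[str, Any] = field(default_factory=dict)

    def isel(self, position: int) -> MultiIndexWithScalars:
        """Select one entry; every remaining level becomes a scalar coordinate."""
        entry = self.tuples[position]
        scalars = dict(self.scalars)
        scalars.update(zip(self.names, entry))
        return MultiIndexWithScalars(names=(), tuples=[], scalars=scalars)

    def expand_dims(self) -> MultiIndexWithScalars:
        """Reverse a scalar selection: rebuild a length-1 index from the scalars."""
        names = tuple(self.scalars)
        entry = tuple(self.scalars.values())
        return MultiIndexWithScalars(names=names, tuples=[entry], scalars={})


index = MultiIndexWithScalars(names=("y", "x"), tuples=[("a", 1), ("b", 2)])
selected = index.isel(1)            # all levels are now scalars
restored = selected.expand_dims()   # back to a one-entry index
print(selected.scalars)
print(restored.names, restored.tuples)
```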

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multiindex scalar coords, fixes #1408 229370997
302693433 https://github.com/pydata/xarray/pull/1412#issuecomment-302693433 https://api.github.com/repos/pydata/xarray/issues/1412 MDEyOklzc3VlQ29tbWVudDMwMjY5MzQzMw== fujiisoup 6815844 2017-05-19T12:47:28Z 2017-05-19T12:47:28Z MEMBER

I also agree. It seems too magical.

But I have slightly changed my mind. I realize that what I really want is not a particular scalar coordinate in MultiIndex, but a 'unified' interface between a normal Variable and MultiIndex.

The current structure is illustrated as follows,

The MultiIndex has different characteristics from a normal Variable. For example, if we do ds.sel(x=2), it produces a scalar coordinate and a normal Variable. The reverse process might be .expand_dims().stack(). This is different from normal Variable behavior, and because of it MultiIndex has to be treated as a special case everywhere. (Deprecating the automatic renaming does not change things much.)

I am wondering whether things would become simpler if we had the following class structure:

In this picture, MultiIndex can have scalar as its level and .isel() produces it. This process can be traced backward by .expand_dims() or .concat() as in normal Variable.

I understand that this is different from the pandas.MultiIndex structure, and that we would need to extend our wrapper extensively if we decide to implement it (as written in red). But I feel this symmetric structure could make it easier to extend MultiIndex functionality in the future.

Any thoughts are welcome. (Should we move this discussion to another issue?)

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multiindex scalar coords, fixes #1408 229370997
302425186 https://github.com/pydata/xarray/pull/1412#issuecomment-302425186 https://api.github.com/repos/pydata/xarray/issues/1412 MDEyOklzc3VlQ29tbWVudDMwMjQyNTE4Ng== shoyer 1217238 2017-05-18T14:42:57Z 2017-05-18T14:42:57Z MEMBER

> To me it looks like it is a bit too magical, but just wondering what you think...

Agreed, this also seems too magical to me.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multiindex scalar coords, fixes #1408 229370997
302358284 https://github.com/pydata/xarray/pull/1412#issuecomment-302358284 https://api.github.com/repos/pydata/xarray/issues/1412 MDEyOklzc3VlQ29tbWVudDMwMjM1ODI4NA== benbovy 4160723 2017-05-18T09:56:28Z 2017-05-18T09:58:06Z MEMBER

A possible direction to reduce the if statements in many different places would be to just return pos_indexers in indexing.remap_level_indexers, as was the case before adding multi-index support, and instead put in Dataset.isel all the logic for checking for a MultiIndex, possibly converting it to an Index and/or scalar coordinates, and possibly renaming the dimension.

This would simplify many things, although I haven't thought about all the other possible issues it would create (performance, etc.). Also, DataArray.loc doesn't seem to use Dataset.isel.

Here is another case related to this PR. From the example in the linked issue, the current behavior is

```python
In [9]: ds.isel(yx=[0, 1])
Out[9]:
<xarray.Dataset>
Dimensions:  (yx: 2)
Coordinates:
  * yx       (yx) MultiIndex
  - y        (yx) object 'a' 'a'
  - x        (yx) int64 1 2
Data variables:
    foo      (yx) int64 1 2
```

Do we want to also change the behavior to this?

```python
In [10]: ds.isel(yx=[0, 1])
Out[10]:
<xarray.Dataset>
Dimensions:  (x: 2)
Coordinates:
  * x        (x) int64 1 2
    y        object 'a'
Data variables:
    foo      (x) int64 1 2
```

To me it looks like it is a bit too magical, but just wondering what you think...
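To make the question concrete, here is a small pandas-free sketch of the rule that second output would imply: after a positional selection, any level left with a single distinct value is pulled out as a scalar, and the remaining levels form the new index. The helper name `split_constant_levels` and the tuple-based index are assumptions for illustration only, not code from this PR.

```python
from __future__ import annotations

from typing import Any, Sequence


def split_constant_levels(
    tuples: Sequence[tuple], names: Sequence[str]
) -> tuple[list[str], list[tuple], dict[str, Any]]:
    """Separate levels holding a single value from levels that still vary."""
    if not tuples:
        # Nothing selected: there is no value to promote to a scalar.
        return list(names), [], {}
    columns = list(zip(*tuples))  # one column of values per level
    scalars: dict[str, Any] = {}
    kept: list[int] = []
    for position, name in enumerate(names):
        if len(set(columns[position])) == 1:
            scalars[name] = columns[position][0]
        else:
            kept.append(position)
    kept_names = [names[p] for p in kept]
    kept_tuples = [tuple(entry[p] for p in kept) for entry in tuples]
    return kept_names, kept_tuples, scalars


# Entries 0 and 1 of the 'yx' index from the example above.
selected = [("a", 1), ("a", 2)]
kept_names, kept_tuples, scalars = split_constant_levels(selected, ("y", "x"))
print(kept_names, kept_tuples, scalars)
```

Note that the result depends on the selected values rather than on the shape of the indexer: selecting entries 0 and 3 instead would keep `y` and turn `x` into the scalar, which is part of why this feels magical.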

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multiindex scalar coords, fixes #1408 229370997
302299557 https://github.com/pydata/xarray/pull/1412#issuecomment-302299557 https://api.github.com/repos/pydata/xarray/issues/1412 MDEyOklzc3VlQ29tbWVudDMwMjI5OTU1Nw== fujiisoup 6815844 2017-05-18T04:48:29Z 2017-05-18T04:48:29Z MEMBER

@shoyer Thanks for the comment.

> It breaks an important invariant, which is that indexing a Variable returns another Variable.

I totally agree with you.

In the last commit, I moved the unpacking functionality into Dataset and reverted the modification I had made to the Variable class. I think the current version is cleaner than my previous one, but I'm not yet comfortable with it. There are a lot of functions and if-statements related to MultiIndex in different places. I guess they should be bundled in one place.

Adding functions is easy, but simplifying them is difficult...

If anyone can suggest a direction, I will try to improve it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multiindex scalar coords, fixes #1408 229370997
302170558 https://github.com/pydata/xarray/pull/1412#issuecomment-302170558 https://api.github.com/repos/pydata/xarray/issues/1412 MDEyOklzc3VlQ29tbWVudDMwMjE3MDU1OA== shoyer 1217238 2017-05-17T17:42:58Z 2017-05-17T17:42:58Z MEMBER

> variable.__getitem__ now returns an OrderedDict if a single element is selected from MultiIndex.

I don't like this change. It breaks an important invariant, which is that indexing a Variable returns another Variable.

I do agree that indexing along a MultiIndex dimension should unpack the tuple for coordinates, but only for coordinates. So this needs to be somewhere in the Dataset.isel logic, not Variable.isel.

Consider indexing ds['yx'] from your example in the linked issue. With the current version of xarray:

```
In [7]: ds['yx']
Out[7]:
<xarray.DataArray 'yx' (yx: 6)>
array([('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3)], dtype=object)
Coordinates:
  * yx       (yx) MultiIndex
  - y        (yx) object 'a' 'a' 'a' 'b' 'b' 'b'
  - x        (yx) int64 1 2 3 1 2 3

In [8]: ds['yx'][0]
Out[8]:
<xarray.DataArray 'yx' ()>
array(('a', 1), dtype=object)
Coordinates:
    yx       object ('a', 1)
```

We want to change the indexing behavior to this:

```
In [8]: ds['yx'][0]
Out[8]:
<xarray.DataArray 'yx' ()>
array(('a', 1), dtype=object)
Coordinates:
    y        object 'a'
    x        int64 1
```

But we don't want to change what happens to the DataArray itself -- it should still be a scalar object array.

I tested this example on your PR branch, and it actually crashes with KeyError.
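The distinction above can be sketched without xarray, under the assumption that the index is a plain list of tuples: the selected data value stays a tuple, and only the coordinates are unpacked into one scalar per level. The function name `index_scalar` is hypothetical and is not part of xarray or of this PR.

```python
from typing import Any, Dict, Sequence, Tuple


def index_scalar(
    tuples: Sequence[tuple], names: Sequence[str], position: int
) -> Tuple[tuple, Dict[str, Any]]:
    """Return (data value, coordinates) for one entry of a tuple-based index.

    The data value is left as the original tuple; only the coordinates
    are unpacked into one scalar per level name.
    """
    value = tuples[position]
    coords = dict(zip(names, value))
    return value, coords


yx = [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2), ("b", 3)]
value, coords = index_scalar(yx, ("y", "x"), 0)
print(value)   # still a single tuple, like the scalar object array
print(coords)  # one scalar coordinate per level
```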

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Multiindex scalar coords, fixes #1408 229370997


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);