
issue_comments


6 rows where issue = 262642978 and user = 6815844 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
442809859 https://github.com/pydata/xarray/issues/1603#issuecomment-442809859 https://api.github.com/repos/pydata/xarray/issues/1603 MDEyOklzc3VlQ29tbWVudDQ0MjgwOTg1OQ== fujiisoup 6815844 2018-11-29T12:05:03Z 2018-11-29T12:05:03Z MEMBER

I am late to the party (and still only have time to write a short comment). I am a big fan of MultiIndex and like @shoyer's idea.

ds.sel(multi=list_of_pairs) can probably be replaced by ds.sel(x=..., y=...), but what about reindexing along a MultiIndex? I have run into use cases for it several times. I also think it would be nice to have MultiIndex as a variable.
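
To make the reindexing use case concrete, here is a minimal sketch in plain pandas (not xarray) of reindexing along a MultiIndex with a list of pairs. The level names and values are made up for illustration and do not come from the comment.

```python
import pandas as pd

# Hypothetical data: three observations labelled by (experiment, time).
index = pd.MultiIndex.from_tuples(
    [(0, 0.0), (0, 0.1), (1, 0.0)], names=["experiment", "time"]
)
series = pd.Series([10, 11, 12], index=index)

# Reindex to a new list of pairs; pairs absent from the original become NaN.
wanted = pd.MultiIndex.from_tuples(
    [(0, 0.1), (1, 0.0), (1, 0.1)], names=["experiment", "time"]
)
reindexed = series.reindex(wanted)
```

The last pair, (1, 0.1), is not present in the original index, so its value is filled with NaN. This is the behavior that selecting level by level would not reproduce directly.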

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Explicit indexes in xarray's data-model (Future of MultiIndex) 262642978
379920389 https://github.com/pydata/xarray/issues/1603#issuecomment-379920389 https://api.github.com/repos/pydata/xarray/issues/1603 MDEyOklzc3VlQ29tbWVudDM3OTkyMDM4OQ== fujiisoup 6815844 2018-04-09T23:03:03Z 2018-04-09T23:04:01Z MEMBER

@shoyer, thank you for detailing.

I am wondering how we can establish the following select-then-concatenate behavior for a MultiIndex(-like) coordinate with our new Indexes machinery: xr.concat([da.isel(x=i) for i in range(len(da['x']))], dim='x'). Personally, I think it would be nice if we could recover the original Index structure. We may need to keep tracking the Indexes object even when the corresponding coordinate is reduced to a scalar? But a scalar index sounds strange...

Or we may give up on restoring the original coordinate structure during the above operation, but still keep them as ordinary coordinates.
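
As a point of reference, the round trip can be sketched for a plain (non-MultiIndex) dimension coordinate, where it is expected to recover the original labels. The array and its labels are invented for illustration. What happens with a MultiIndex coordinate is the open question here and may differ between xarray versions, so it is deliberately not shown.

```python
import numpy as np
import xarray as xr

# Hypothetical array with an ordinary index along x.
da = xr.DataArray(np.arange(3), dims=["x"], coords={"x": [10, 20, 30]})

# Select each position (x becomes a scalar coordinate), then concatenate back.
pieces = [da.isel(x=i) for i in range(da.sizes["x"])]
roundtrip = xr.concat(pieces, dim="x")
```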

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Explicit indexes in xarray's data-model (Future of MultiIndex) 262642978
336381864 https://github.com/pydata/xarray/issues/1603#issuecomment-336381864 https://api.github.com/repos/pydata/xarray/issues/1603 MDEyOklzc3VlQ29tbWVudDMzNjM4MTg2NA== fujiisoup 6815844 2017-10-13T08:09:25Z 2017-10-13T08:09:25Z MEMBER

Thanks for the details. (Sorry for my late response. It took me a long time to understand what this would look like.)

I am wondering which use cases this Index concept would make possible. As far as I understand,

  1. It will enable more flexible indexing, e.g. more than one Index can be associated with one dimension, and we can select from these coordinate values very flexibly.
  2. It will naturally integrate more advanced indexes such as KDTree.

Is that correct?

> Probably the most elegant rule would again be to check all indexed variables for exact matches.

That sounds reasonable.

> In principle, this data model would allow for two mostly equivalent indexing schemes: MultiIndex[time, space] vs two indexes Index[time] and Index[space].

I like the latter one, as it is easier to understand even for non-pandas users.

What will the actual implementation look like? Will xr.Dataset.indexes be an OrderedDict that maps each variable's name to its associated dimension? Will the actual Index instance be one of xr.Dataset.variables?
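
To make the questions above concrete, here is a toy sketch in pure Python of a data model in which two separate indexes live on one dimension and selection checks each indexed variable for an exact match. ToyDataset, set_index and sel are hypothetical names invented for this illustration. They are not xarray's actual or proposed implementation.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a dataset; not xarray's design."""

    # variable name -> (dimension name, values along that dimension)
    variables: dict = field(default_factory=dict)
    # indexed variable name -> lookup table {label: [positions]}
    indexes: dict = field(default_factory=dict)

    def set_index(self, name):
        """Build an exact-match lookup table for one variable."""
        _, values = self.variables[name]
        lookup = {}
        for pos, label in enumerate(values):
            lookup.setdefault(label, []).append(pos)
        self.indexes[name] = lookup

    def sel(self, **labels):
        """Return {dim: positions} that match every requested label exactly."""
        result = {}
        for name, label in labels.items():
            if name not in self.indexes:
                raise KeyError(f"{name!r} is not an indexed variable")
            dim, _ = self.variables[name]
            positions = set(self.indexes[name].get(label, []))
            # Several indexes on the same dimension are combined by intersection.
            result[dim] = positions if dim not in result else result[dim] & positions
        return {dim: sorted(pos) for dim, pos in result.items()}


ds = ToyDataset(
    variables={
        "experiment": ("x", [0, 0, 0, 1, 1]),
        "time": ("x", [0.0, 0.1, 0.2, 0.0, 0.15]),
    }
)
ds.set_index("experiment")
ds.set_index("time")
matches = ds.sel(experiment=0, time=0.1)
```

In this sketch the indexes mapping is keyed by variable name rather than by dimension, which is one possible answer to the question of what indexes should map from.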

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Explicit indexes in xarray's data-model (Future of MultiIndex) 262642978
334125888 https://github.com/pydata/xarray/issues/1603#issuecomment-334125888 https://api.github.com/repos/pydata/xarray/issues/1603 MDEyOklzc3VlQ29tbWVudDMzNDEyNTg4OA== fujiisoup 6815844 2017-10-04T11:25:14Z 2017-10-04T12:43:59Z MEMBER

@shoyer, could you add more details about this idea? I think I do not yet fully understand the practical difference between dim and index.

  1. Use cases of independent Index and dims: Would cases where dimension and index are independent be common? (Or is that the case only for MultiIndex and KDTree?)

  2. MultiIndex implementation: In the MultiIndex case, will an xarray object store a MultiIndex object and also the level variables as Variable objects (there will be some duplication)? If indexes[dim] returns multiple Variables, which would realize a MultiIndex-like structure without pd.MultiIndex, indexes would be very different from dims, because a single dimension can have multiple indexes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Explicit indexes in xarray's data-model (Future of MultiIndex) 262642978
334043044 https://github.com/pydata/xarray/issues/1603#issuecomment-334043044 https://api.github.com/repos/pydata/xarray/issues/1603 MDEyOklzc3VlQ29tbWVudDMzNDA0MzA0NA== fujiisoup 6815844 2017-10-04T03:51:57Z 2017-10-04T03:51:57Z MEMBER

I think we currently assume variables[dim] is an Index. Does your proposal mean that Dataset will keep an additional attribute indexes, and indexes[dim] gives a pd.Index (or pd.MultiIndex, KDTree)?

That sounds like a much cleaner data model.
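
A rough sketch of that reading, using plain dictionaries and pandas objects: the data lives in variables, while a separate indexes mapping holds either a pd.Index or a pd.MultiIndex used only for label lookup. The dictionary layout, key names and values are assumptions made for illustration, not the proposal's actual structure.

```python
import pandas as pd

# Hypothetical raw variables, all defined along one dimension.
variables = {
    "x": [10, 20, 30],
    "experiment": [0, 0, 1],
    "time": [0.0, 0.1, 0.0],
}

# Separate lookup structures built from the variables.
indexes = {
    "x": pd.Index(variables["x"], name="x"),
    "exp_time": pd.MultiIndex.from_arrays(
        [variables["experiment"], variables["time"]],
        names=["experiment", "time"],
    ),
}

# Both kinds of index answer the same question: label -> integer position.
pos_plain = indexes["x"].get_loc(20)
pos_multi = indexes["exp_time"].get_loc((1, 0.0))
```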

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Explicit indexes in xarray's data-model (Future of MultiIndex) 262642978
334029215 https://github.com/pydata/xarray/issues/1603#issuecomment-334029215 https://api.github.com/repos/pydata/xarray/issues/1603 MDEyOklzc3VlQ29tbWVudDMzNDAyOTIxNQ== fujiisoup 6815844 2017-10-04T01:55:02Z 2017-10-04T01:55:02Z MEMBER

I'm using MultiIndex a lot, but I have noticed that it is just a workaround for indexing along multiple kinds of coordinates.

Consider the following example,

```python
In [1]: import numpy as np
   ...: import xarray as xr
   ...: da = xr.DataArray(np.arange(5), dims=['x'],
   ...:                   coords={'experiment': ('x', [0, 0, 0, 1, 1]),
   ...:                           'time': ('x', [0.0, 0.1, 0.2, 0.0, 0.15])})
   ...:

In [2]: da
Out[2]:
<xarray.DataArray (x: 5)>
array([0, 1, 2, 3, 4])
Coordinates:
    experiment  (x) int64 0 0 0 1 1
    time        (x) float64 0.0 0.1 0.2 0.0 0.15
Dimensions without coordinates: x
```

I want to do something like this

```python
da.sel(experiment=0).sel(time=0.1)
```

but this is not possible. MultiIndexing enables this,

```python
In [2]: da = da.set_index(exp_time=['experiment', 'time'])
   ...: da
   ...:
Out[2]:
<xarray.DataArray (x: 5)>
array([0, 1, 2, 3, 4])
Coordinates:
  * exp_time    (exp_time) MultiIndex
  - experiment  (exp_time) int64 0 0 0 1 1
  - time        (exp_time) float64 0.0 0.1 0.2 0.0 0.15
Dimensions without coordinates: x
```

If we could make a selection from a non-index coordinate, MultiIndex is not necessary for this case.
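
For comparison, one way to select on non-index coordinates without a MultiIndex is a boolean mask converted to integer positions. This is only a sketch of a workaround under the example data above, not the label-based selection being asked for.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(
    np.arange(5),
    dims=["x"],
    coords={
        "experiment": ("x", [0, 0, 0, 1, 1]),
        "time": ("x", [0.0, 0.1, 0.2, 0.0, 0.15]),
    },
)

# Exact-match mask over the two non-index coordinates.
mask = (da["experiment"] == 0) & (da["time"] == 0.1)

# Convert the mask to integer positions and select positionally along x.
selected = da.isel(x=np.flatnonzero(mask.values))
```

The mask relies on exact floating-point equality, which works here because the same literal 0.1 is used on both sides. Real data would usually need a tolerance.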

I think there should be other important use cases for MultiIndex. I would be happy if anyone could list them in this issue.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Explicit indexes in xarray's data-model (Future of MultiIndex) 262642978


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);