issue_comments


5 rows where issue = 1358841264 sorted by updated_at descending


user 3

  • benbovy 2
  • TomNicholas 2
  • dcherian 1

issue 1

  • Add documentation on custom indexes · 5 ✖

author_association 1

  • MEMBER 5
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1247157234 https://github.com/pydata/xarray/pull/6975#issuecomment-1247157234 https://api.github.com/repos/pydata/xarray/issues/6975 IC_kwDOAMm_X85KVhvy TomNicholas 35968931 2022-09-14T18:35:34Z 2022-09-14T18:37:48Z MEMBER

We should clarify that the aim of Index objects is to make all operations performed in the (discrete or continuous) space defined by the coordinate labels more efficient. That space is distinct from the discrete space defined by array element locations.

I think this should be one of the first things said. It defines what all the following discussion of Indexes does and does not affect.

I've tried to explain it in the "Index base class" section and the sections below, but maybe it should be emphasized more?

Yeah I think you do actually have that one covered, I just included it as another example of a naive question that everyone will have that is worth heading off very explicitly.

When I load the "air" tutorial data and it shows a Float64Index and DateTime64Index, where did they come from?

I guess you mean it is shown through ds.indexes?

ds.xindexes (vs. ds.indexes) still needs to be added in the docs (in a later PR?), which hopefully will address your concern here.

I meant like when did these indexes get automatically built? (Presumably on coordinate assignment)

Maybe Index also deserves its own entry there, where we could explain what indexes are, how they are different from variables (coordinates), how they are used or accessed in Xarray, etc.

1000% yes, we need a page that explains what Index objects are, what they do, how they work, and how they are handled automatically by default. This is prerequisite knowledge (which apparently I don't have :sweat_smile: ) before trying to build your own custom index.

Overall, I think that the whole "Xarray Internals" section could be streamlined beyond a bunch of loosely-coupled document pages.

Probably, but having a loosely coupled page for each aspect of the internals would be a good initial aim.

I agree that we need more examples, but I also think that too many examples may make things more confusing.

That's why I like the "Explanation" vs "How-to" vs "Tutorials" distinction: use minimal code in the "Explanation" section (this PR) but put multiple more complex examples under "How to create a functionally-derived index", "How to create a lazy index", etc.

Is it possible to do that with Sphinx / RST?

No idea, but that does look cool!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add documentation on custom indexes 1358841264
1246487152 https://github.com/pydata/xarray/pull/6975#issuecomment-1246487152 https://api.github.com/repos/pydata/xarray/issues/6975 IC_kwDOAMm_X85KS-Jw benbovy 4160723 2022-09-14T09:25:24Z 2022-09-14T09:25:24Z MEMBER

Thanks @dcherian and @TomNicholas for your feedback!

@dcherian I will reply to your inline comments when I integrate your suggestions into this PR.

@TomNicholas I respond to your comments below.

Bear in mind I don't think I've ever contributed a PR to xarray that touched indexes.py or indexing.py

That's exactly why your feedback is valuable!

When is my CustomIndex object consulted? Is it potentially for all basic operations (concat, join, align, indexing, etc?)

I agree this could be detailed more in the Index API docstrings in a consistent way. For some methods like equals, join and reindex_like it could be called in a lot of places, basically everything that relies on object alignment.

Why does there not need to be an index (in .indexes) if I do indexing with e.g. .isel but have no coordinates?

We should clarify that the aim of Index objects is to make all operations performed in the (discrete or continuous) space defined by the coordinate labels more efficient. That space is distinct from the discrete space defined by array element locations. Operations performed in the latter space don't require any index.

Some Index API methods like Index.isel suggest otherwise, but those methods are there for convenience, i.e., they save users from having to rebuild an index from scratch when it could easily be built from the existing one.
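To make the distinction between the two spaces concrete, here is a minimal sketch (the array values and the labels 10 and 20 are made up for illustration). It assumes a recent xarray in which assigning a dimension coordinate builds a default index:

```python
import numpy as np
import xarray as xr

# A 2x3 array with named dimensions but no coordinates at all.
da = xr.DataArray(np.arange(6).reshape(2, 3), dims=("x", "y"))

# Positional indexing works in the space of array element locations,
# so no index is involved and none exists on the object.
assert len(da.indexes) == 0
value_by_position = da.isel(x=1, y=2).item()  # row 1, column 2

# Assigning a dimension coordinate makes xarray build a default index for it.
labelled = da.assign_coords(x=[10, 20])
assert "x" in labelled.indexes

# Label-based selection goes through that index to find the position.
value_by_label = labelled.sel(x=20).isel(y=2).item()
```

Both lookups reach the same element; only the second one consults an index.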

How is an xarray.PandasIndex different from a pd.Index?

I've tried to explain it in the "Index base class" section and the sections below, but maybe it should be emphasized more?

When I load the "air" tutorial data and it shows a Float64Index and DateTime64Index, where did they come from?

I guess you mean it is shown through ds.indexes?

ds.xindexes (vs. ds.indexes) still needs to be added in the docs (in a later PR?), which hopefully will address your concern here.

Finally it's not great that to explain Index objects we have to assume the user knows what xarray.Variable is, but Variable is still not really public API, and certainly isn't documented as comprehensively as DataArray and Dataset are.

I agree, although Variable is already documented in the "Xarray internals" section. Maybe Index also deserves its own entry there, where we could explain what indexes are, how they are different from variables (coordinates), how they are used or accessed in Xarray, etc.

Overall, I think that the whole "Xarray Internals" section could be streamlined beyond a bunch of loosely-coupled document pages.

I also think we need multiple simple examples.

I agree that we need more examples, but I also think that too many examples may make things more confusing.

One thing that I like very much in https://fastapi.tiangolo.com/ is how a small example is picked for each tutorial and then is shown by highlighting the relevant code for every subsection. Is it possible to do that with Sphinx / RST?

It's hard to show all features through one succinct example, though. Like @dcherian says in https://github.com/pydata/xarray/pull/6975#discussion_r967495773, we could invite people to look into the PandasIndex and PandasMultiIndex code for more details. My hope is that there will be more real examples (multi-coordinate, multi-dimensions) available in the future.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add documentation on custom indexes 1358841264
1246192701 https://github.com/pydata/xarray/pull/6975#issuecomment-1246192701 https://api.github.com/repos/pydata/xarray/issues/6975 IC_kwDOAMm_X85KR2Q9 TomNicholas 35968931 2022-09-14T03:41:06Z 2022-09-14T03:43:31Z MEMBER

pydata/xarray your feedback would be very much appreciated! I've been into this for quite some time, so there may be things that seem obvious to me but that you can still find very confusing or non-intuitive. It would then deserve some extra or better explanation.

I find the way in which the Index objects are involved in a method call (e.g. sel) pretty opaque still. (Bear in mind I don't think I've ever contributed a PR to xarray that touched indexes.py or indexing.py :sweat_smile:)

  • When is my CustomIndex object consulted? Is it potentially for all basic operations (concat, join, align, indexing, etc?) If I passed some kind of NotImplementedIndex which only defined .from_variables, what functionality would be left in xarray?
  • Why does there not need to be an index (in .indexes) if I do indexing with e.g. .isel but have no coordinates?
  • If I index along two dimensions simultaneously (.isel(x=1, y=2)) does that correspond to two separate Index consultations?
  • How is an xarray.PandasIndex different from a pd.Index?
  • When I load the "air" tutorial data and it shows a Float64Index and DateTime64Index, where did they come from?
  • Finally it's not great that to explain Index objects we have to assume the user knows what xarray.Variable is, but Variable is still not really public API, and certainly isn't documented as comprehensively as DataArray and Dataset are.

Perhaps these questions are too specific to xarray's internals but I do think there should be some kind of mental model given as to the role the Index objects play. (This could be a white lie, similar to how our page on data structures says that Dataset objects contain DataArray objects when actually they technically don't.)

I also think we need multiple simple examples. How about:

  • HelloWorldIndex, that just prints "I'm calling HelloWorldIndex.isel !" etc.
  • PeriodicBoundaryIndex (a partial/simpler implementation of #7031).
  • A simple functionally-derived index, that consults a dynamically-called exponential function or something.
  • Some example of a custom multi-index, perhaps some kind of 2D lat-lon thing? Or how about something that represents 2D image distortion, like using it creates a fisheye effect?
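As a rough illustration of the functionally-derived idea, the sketch below maps labels to positions by inverting an exponential function instead of storing the labels. It is a plain-Python toy, not xarray's Index API: the class name `ExponentialLookup`, its methods, and its parameters are all hypothetical.

```python
import math


class ExponentialLookup:
    """Toy label-to-position lookup for labels x_i = start * exp(rate * i).

    Hypothetical helper for illustration only; it does not subclass
    or mimic xarray.Index.
    """

    def __init__(self, start, rate, size):
        self.start = start
        self.rate = rate
        self.size = size

    def label_at(self, position):
        # Labels are computed on demand rather than stored in an array.
        return self.start * math.exp(self.rate * position)

    def nearest_position(self, label):
        # Invert the function analytically; "nearest" is measured in log space.
        if label <= 0:
            raise KeyError(label)
        position = round(math.log(label / self.start) / self.rate)
        if not 0 <= position < self.size:
            raise KeyError(label)
        return position


lookup = ExponentialLookup(start=1.0, rate=0.5, size=10)
position = lookup.nearest_position(lookup.label_at(4))
```

A real custom index would wrap this kind of logic in the methods that xarray calls during label-based selection.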

I find it helps to think of documentation using this 4-part system. This PR should cover "Explanation" pretty well, but we should still aim for other content to better cover "Tutorial", "How-to Guides", and "Reference". "Tutorial" could be like a notebook walking through creating a simple index (e.g. PeriodicIndex) from scratch, explaining and fixing errors as they arise (like @dcherian did for apply_ufunc). "How-to Guides" might be specific to other more advanced custom index examples. I guess "Reference" here is just having really clear docstrings on the possible methods of Index somewhere (we can't really do that with an ABC though can we?).

That all said, this is already a great start!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add documentation on custom indexes 1358841264
1242492777 https://github.com/pydata/xarray/pull/6975#issuecomment-1242492777 https://api.github.com/repos/pydata/xarray/issues/6975 IC_kwDOAMm_X85KDu9p dcherian 2448579 2022-09-09T21:24:42Z 2022-09-09T21:24:42Z MEMBER

an inefficient "numpy" index with basic lookup

yes! I used this recently to describe what an index does. I think most people are familiar with the argmin way
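For reference, the argmin approach might look like the following minimal sketch. It uses only NumPy; the helper name `nearest_position` and the sample values are made up for illustration, and this is not code taken from xarray:

```python
import numpy as np


def nearest_position(labels, value):
    """Return the integer position of the label closest to ``value``.

    This scans every label on each call, which is what makes it an
    inefficient (but easy to follow) stand-in for an index.
    """
    labels = np.asarray(labels)
    return int(np.abs(labels - value).argmin())


lat = np.array([10.0, 20.0, 30.0, 40.0])   # coordinate labels
data = np.array([1.5, 2.5, 3.5, 4.5])      # values at those labels

pos = nearest_position(lat, 27.0)   # distances are [17, 7, 3, 13]
selected = data[pos]
```

An index replaces this linear scan with a precomputed structure, so repeated label lookups do not have to touch every coordinate value.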

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add documentation on custom indexes 1358841264
1239526376 https://github.com/pydata/xarray/pull/6975#issuecomment-1239526376 https://api.github.com/repos/pydata/xarray/issues/6975 IC_kwDOAMm_X85J4avo benbovy 4160723 2022-09-07T15:15:43Z 2022-09-07T15:15:43Z MEMBER

I'm open to any suggestion on how to better illustrate this with clear and succinct examples.

Maybe an inefficient "numpy" index with basic lookup (like in https://github.com/pydata/xarray/pull/3925#issuecomment-609471635) would be a good example?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add documentation on custom indexes 1358841264


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);