issue_comments
7 rows where issue = 931591247 and user = 28786187 sorted by updated_at descending

874209908 · st-bender (CONTRIBUTOR) · created 2021-07-05T15:58:00Z · updated 2021-07-05T17:43:09Z
https://github.com/pydata/xarray/issues/5545#issuecomment-874209908

Hi, @max-sixty I could give it a try, but my time is quite limited. Would you be fine with a diff? That would save me a bit from setting up a fork and new repo.

Anyway, here is a quick diff. I tried to keep it small: basically I moved the `max_rows` setting to `dataset_repr`, and only `coords_repr` takes a new keyword argument, so that should be backwards compatible. The tests would need to be updated. Maybe it is a good idea not to test `_mapping_repr`, but instead to test `coords_repr`, `data_vars_repr`, `attrs_repr`, and `dataset_repr` separately, to check that they do what they are supposed to do regardless of their implementation?

Edit: Never mind, I am preparing a PR with updated tests.

```diff
diff --git a/xarray/core/formatting.py b/xarray/core/formatting.py
index 07864e81..ab30facf 100644
--- a/xarray/core/formatting.py
+++ b/xarray/core/formatting.py
@@ -377,14 +377,12 @@ def _mapping_repr(
 ):
     if col_width is None:
         col_width = _calculate_col_width(mapping)
-    if max_rows is None:
-        max_rows = OPTIONS["display_max_rows"]
     summary = [f"{title}:"]
     if mapping:
         len_mapping = len(mapping)
         if not _get_boolean_with_default(expand_option_name, default=True):
             summary = [f"{summary[0]} ({len_mapping})"]
-        elif len_mapping > max_rows:
+        elif max_rows is not None and len_mapping > max_rows:
             summary = [f"{summary[0]} ({max_rows}/{len_mapping})"]
             first_rows = max_rows // 2 + max_rows % 2
             items = list(mapping.items())
@@ -416,7 +414,7 @@ attrs_repr = functools.partial(
 )
 
 
-def coords_repr(coords, col_width=None):
+def coords_repr(coords, col_width=None, max_rows=None):
     if col_width is None:
         col_width = _calculate_col_width(_get_col_items(coords))
     return _mapping_repr(
@@ -425,6 +423,7 @@ def coords_repr(coords, col_width=None):
         summarizer=summarize_coord,
         expand_option_name="display_expand_coords",
         col_width=col_width,
+        max_rows=max_rows,
     )
 
 
@@ -542,21 +541,22 @@ def dataset_repr(ds):
     summary = ["<xarray.{}>".format(type(ds).__name__)]
 
     col_width = _calculate_col_width(_get_col_items(ds.variables))
+    max_rows = OPTIONS["display_max_rows"]
 
     dims_start = pretty_print("Dimensions:", col_width)
     summary.append("{}({})".format(dims_start, dim_summary(ds)))
 
     if ds.coords:
-        summary.append(coords_repr(ds.coords, col_width=col_width))
+        summary.append(coords_repr(ds.coords, col_width=col_width, max_rows=max_rows))
 
     unindexed_dims_str = unindexed_dims_repr(ds.dims, ds.coords)
     if unindexed_dims_str:
         summary.append(unindexed_dims_str)
 
-    summary.append(data_vars_repr(ds.data_vars, col_width=col_width))
+    summary.append(data_vars_repr(ds.data_vars, col_width=col_width, max_rows=max_rows))
 
     if ds.attrs:
-        summary.append(attrs_repr(ds.attrs))
+        summary.append(attrs_repr(ds.attrs, max_rows=max_rows))
 
     return "\n".join(summary)
```
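For reference, the row-truncation arithmetic used in `_mapping_repr` (keep the first `max_rows // 2 + max_rows % 2` entries, elide the middle, keep `max_rows // 2` entries from the end) can be sketched in isolation. The helper below is hypothetical, not xarray's actual code; it only mirrors the splitting logic, not the formatting:

```python
def truncate_rows(items, max_rows):
    """Mimic the _mapping_repr truncation: keep roughly the first and
    last half of the entries, eliding the middle with '...'."""
    if max_rows is None or len(items) <= max_rows:
        return list(items)
    first_rows = max_rows // 2 + max_rows % 2  # ceil(max_rows / 2)
    last_rows = max_rows // 2                  # floor(max_rows / 2)
    return list(items[:first_rows]) + ["..."] + list(items[-last_rows:])

names = [f"var{i}" for i in range(10)]
print(truncate_rows(names, 4))
# -> ['var0', 'var1', '...', 'var8', 'var9']
print(truncate_rows(names, None))   # max_rows=None: show everything
```

With `max_rows=None` (the behaviour the diff restores for direct calls), nothing is elided, which is what keeps the change backwards compatible.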

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
Issue: Increase default `display_max_rows` (931591247)
873193513 · st-bender (CONTRIBUTOR) · created 2021-07-02T18:46:43Z
https://github.com/pydata/xarray/issues/5545#issuecomment-873193513

@benbovy That sounds good to me. If I may add, I would leave `__repr__` and `__str__` returning the same thing, since people seem to use them interchangeably, e.g. in tutorials, and probably in their own code and notebooks.
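As a side note, Python itself already ties the two together: when a class defines only `__repr__`, `str()` and `print()` fall back to it. A minimal illustration with a toy class (not xarray's `Dataset`):

```python
class ToyDataset:
    """Toy stand-in for a dataset-like object; defines only __repr__."""

    def __init__(self, n_vars):
        self.n_vars = n_vars

    def __repr__(self):
        return f"<ToyDataset with {self.n_vars} variables>"


ds = ToyDataset(42)
# No __str__ is defined, so str() falls back to __repr__:
print(repr(ds))  # <ToyDataset with 42 variables>
print(str(ds))   # <ToyDataset with 42 variables>
```

So unless a class goes out of its way to define both, users get identical output from the two, which is one reason they end up used interchangeably.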

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
872424026 · st-bender (CONTRIBUTOR) · created 2021-07-01T17:26:23Z
https://github.com/pydata/xarray/issues/5545#issuecomment-872424026

@max-sixty I apologize if I hurt someone, but it is hard to find a solution if we can't agree on the problem. Try the same examples with 50 or 100 instead of 2000 variables to understand what I mean. And to be honest, I found your comments a bit dismissive and not exactly welcoming too, which is probably also not your intention.

From what I see in the examples by @Illviljan, setting `display_max_rows` affects everything equally: coords, data_vars, and attrs. So there would be no need to treat them separately. Or I misunderstood your comment.

Anyway, I think I made my point, I leave it up to you to decide what you are comfortable with.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
871674435 · st-bender (CONTRIBUTOR) · created 2021-06-30T19:36:26Z
https://github.com/pydata/xarray/issues/5545#issuecomment-871674435

Hi @Illviljan, as I mentioned earlier, your "solution" is not backwards compatible, and it would be counterproductive to update the doctest, which is not relevant here anyway and a different issue.

I am not sure what you are trying to show; your datasets look very different from what I am working with, and they miss the point. Then again, they also prove my point: pandas and numpy shorten their output in a canonical way (except for the finite number of columns, which may make sense, but I don't like that either and would rather have it wrap and show all columns). xarray doesn't, because usually the variables are not simply numbered as in your example.

I am talking about medium-sized datasets of a few tens to maybe a few hundred non-canonical data variables. Have a look at http://cfconventions.org/ to get an impression of real-world variable names, or at the example linked above in comment https://github.com/pydata/xarray/issues/5545#issuecomment-870109486. There it would be nice to have an overview of all of them.

If too many variables are a problem, imo it would have been better to say: "We keep it as it is; however, if it is a problem for your large dataset, here is an option to reduce the amount of output: ..." and put that into the docs or the wiki or an FAQ or something similar. Note that the initial point in the linked issue is about the time it takes to print all variables, not the amount that gets shown. And usually the number of coordinates and attributes is smaller than the number of data variables.

It also depends on what you call a "screen": my terminal currently has 48 lines (about 56 in fullscreen, depending on font size) and a scrollback buffer of 5000 lines, and I am used to scrolling through long jupyter notebooks. Scrolling through your examples might be tedious (not for me actually), but I will never be able to find typos hidden in the three dots.

@max-sixty No worries, I understand that this is a minor cosmetic issue, actually I intended it as a feature request, not a bug. But that must have gone missing along the way. I guess I could live with 50, any other opinions? I am sure someone else will complain about that too. ;)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
870396123 · st-bender (CONTRIBUTOR) · created 2021-06-29T08:36:04Z
https://github.com/pydata/xarray/issues/5545#issuecomment-870396123

Hi @max-sixty

> We need to cut some of the output, given a dataset has arbitrary size — same as numpy arrays / pandas dataframes.

I thought about that too, but I believe these cases are slightly different. With numpy arrays you can almost guess what the full array looks like: you know the shape and get an impression of the magnitude of the entries (of course there can be exceptions which are not shown in the output). Similarly for pandas series or dataframes, the skipped index values are quite easy to guess. The names of the data variables in a dataset are almost impossible to guess, as are their dimensions and data types. The ellipsis is usually used to indicate some kind of continuation, which is not really the case with data variables.

> If people feel strongly about a default > 12, that seems reasonable. Do people?

I can't speak for other people, but I do, sorry about that. @shoyer's suggestion sounds good to me; off the top of my head, 30-100 variables in a dataset seems to be around what I have come across as a typical case. Which does not mean that it is the typical case.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
869950924 · st-bender (CONTRIBUTOR) · created 2021-06-28T19:12:43Z
https://github.com/pydata/xarray/issues/5545#issuecomment-869950924

I switched off html rendering altogether because that really slows down the browser, haven't had any problems with the text output. The text output is (was) also much more concise and does not require additional clicks to open the dataset and see which variables are in there.

The problem with your suggestion is that this approach is not backwards compatible, which is not nice towards long-term users. A larger default would be a bit like meeting half-way. I also respectfully disagree about the purpose of `__repr__()`; see for example https://docs.python.org/3/reference/datamodel.html#object.__repr__ . Cutting the output arbitrarily does not allow one to "recreate the object".
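The data-model convention referenced there is that, where possible, `repr()` should return a string from which the object can be recreated, i.e. `eval(repr(x)) == x`. A minimal illustration of that contract (the summary string below is a made-up example of an elided display):

```python
# Built-in types follow the round-trip convention:
nums = [1, 2, 3]
assert eval(repr(nums)) == nums  # repr(nums) is "[1, 2, 3]"

# A summarized display string (like a truncated Dataset repr) cannot
# round-trip through eval:
summary = "<xarray.Dataset (50 variables)>"  # hypothetical display string
try:
    eval(summary)
    recreated = True
except SyntaxError:
    recreated = False
print(recreated)  # False: the summary does not recreate the object
```

Of course, xarray's repr was never literally evaluable, but the more output is elided, the further it drifts from the "informative and unambiguous" spirit of that convention.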

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
869726359 · st-bender (CONTRIBUTOR) · created 2021-06-28T14:19:01Z
https://github.com/pydata/xarray/issues/5545#issuecomment-869726359

Why not increase that number to a more sensible value (as I suggested), or make the limit optional for people who have problems? If people are concerned and run into problems, that would be an option to fix it, not the other way around. As it is, this enforces such a low limit on everyone else.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
