issue_comments

7 rows where author_association = "MEMBER" and issue = 182638499 sorted by updated_at descending

user 5

  • benbovy 2
  • fmaussion 2
  • shoyer 1
  • chris-b1 1
  • max-sixty 1

issue 1

  • Labeled repr · 7 ✖

author_association 1

  • MEMBER · 7 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
253477840 https://github.com/pydata/xarray/issues/1044#issuecomment-253477840 https://api.github.com/repos/pydata/xarray/issues/1044 MDEyOklzc3VlQ29tbWVudDI1MzQ3Nzg0MA== benbovy 4160723 2016-10-13T10:37:24Z 2016-10-14T13:07:41Z MEMBER

After seeing the discussion in #680, I'm wondering whether showing the first values of the flattened array wouldn't be enough here, e.g., something like this:

```
>>> d
<xarray.DataArray (a: 2, b: 5, c: 2, d: 10)>
array int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 ...
Coordinates:
  * a        (a) <U1 'A' 'B'
  * b        (b) <U5 'Cat 1' 'Cat 2' 'Cat 3' 'Cat 4' 'Cat 5'
  * c        (c) <U1 'J' 'K'
  * d        (d) int64 0 1 2 3 4 5 6 7 8 9
```

This example is more consistent with the repr of Dataset data variables, and similarly we could customize the repr of dask arrays and lazy arrays (loaded from netcdf files) like this:

```
>>> d.chunk((10, 5, 5, 10))
<xarray.DataArray (a: 2, b: 5, c: 2, d: 10)>
dask.array int64 chunksize=(10, 5, 5, 10)
Coordinates:
  * a        (a) <U1 'A' 'B'
  * b        (b) <U5 'Cat 1' 'Cat 2' 'Cat 3' 'Cat 4' 'Cat 5'
  * c        (c) <U1 'J' 'K'
  * d        (d) int64 0 1 2 3 4 5 6 7 8 9
```

```
>>> d.name = 'myvar'
>>> d.to_netcdf('data.nc')
>>> xr.open_dataset('data.nc').myvar
<xarray.DataArray 'myvar' (a: 2, b: 5, c: 2, d: 10)>
lazy-array int64
Coordinates:
  * a        (a) <U1 'A' 'B'
  * b        (b) <U5 'Cat 1' 'Cat 2' 'Cat 3' 'Cat 4' 'Cat 5'
  * c        (c) <U1 'J' 'K'
  * d        (d) int64 0 1 2 3 4 5 6 7 8 9
```
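To make the flat-array idea concrete, here is a minimal sketch of how such a one-line summary could be built. It works on a plain NumPy array rather than a DataArray, and the name flat_preview and its max_width parameter are hypothetical, not part of xarray. With a width of 71 characters it should give the same data line as the first example above.

```python
import numpy as np


def flat_preview(values, max_width=60):
    """Return 'array <dtype> v0 v1 ...' with as many leading values as fit.

    Hypothetical helper for illustration only; not an xarray function.
    """
    arr = np.asarray(values)
    line = f"array {arr.dtype}"
    flat = arr.ravel()  # row-major order, like the flattened array in the proposal
    for i, item in enumerate(flat):
        candidate = f"{line} {item}"
        # Keep room for the trailing " ..." unless this is the last element.
        reserve = 0 if i == flat.size - 1 else 4
        if len(candidate) + reserve > max_width:
            return line + " ..."
        line = candidate
    return line


# Same shape as the 4-D example: (a: 2, b: 5, c: 2, d: 10).
big = np.arange(200, dtype=np.int64).reshape(2, 5, 2, 10)
print(flat_preview(big, max_width=71))
```

The loop stops as soon as the width budget is exhausted, so only the first few elements are ever formatted, which matters for large arrays.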

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Labeled repr 182638499
253649762 https://github.com/pydata/xarray/issues/1044#issuecomment-253649762 https://api.github.com/repos/pydata/xarray/issues/1044 MDEyOklzc3VlQ29tbWVudDI1MzY0OTc2Mg== benbovy 4160723 2016-10-13T21:52:46Z 2016-10-13T21:57:24Z MEMBER

In most cases I found the DataArray repr useful for quickly checking the dimensions (both names and sizes), the attributes and the types/values of both data and labels (I mean just checking here if the values are consistent regarding their units, acceptable ranges, etc.), but rarely for in-depth checking of the data values along each dimension, hence my suggestion of a flat (subset) array.

To inspect the data of high-dimensional DataArrays, I've mainly used the indexing logic of xarray to extract slices of <3 dimensions. However, I admit that for quick inspection purposes I actually like your suggestion of having a specific repr method that would allow showing small data slices as labeled tables, especially if we choose to always use a flat array for the repr of DataArray (i.e., even when the number of dimensions is <3). Why not something like:

```python
>>> d.slice_repr(a=0, b=0)
d   0   1   2   3   4   5   6   7   8   9
c
J   0   1   2   3   4   5   6   7   8   9
K  10  11  12  13  14  15  16  17  18  19
```

This is equivalent to

```python
import pandas as pd

# Select the 2-D slice, then label its rows with c and its columns with d.
dslice = d.isel(a=0, b=0)
pd.DataFrame(data=dslice.data, index=dslice.c, columns=dslice.d)
```

The difference is that slice_repr() would return a string rather than a data object (a DataArray, an array or a DataFrame). I'm not sure about the name and/or signature of slice_repr(), though.
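As a rough illustration of the string such a method might return, the sketch below formats a 2-D slice as a labeled text table. It is a standalone stand-in that takes plain NumPy inputs instead of a DataArray, so the signature (values, dims, coords, plus integer indexers) is an assumption made for this example and not a proposal for the actual API.

```python
import numpy as np


def slice_repr(values, dims, coords, **indexers):
    """Format a 2-D slice of an N-D array as a labeled text table.

    Hypothetical stand-in: `dims` names the axes of `values`, `coords` maps
    each dimension name to its labels, and `indexers` fixes all but two
    dimensions by integer position.
    """
    arr = np.asarray(values)
    # Fix the requested dimensions by position and keep the other axes whole.
    key = tuple(indexers.get(name, slice(None)) for name in dims)
    table = arr[key]
    free = [name for name in dims if name not in indexers]
    if table.ndim != 2 or len(free) != 2:
        raise ValueError("exactly two dimensions must be left unindexed")
    row_dim, col_dim = free
    # Grid of strings: column header, row-dimension name, then the data rows.
    rows = [[col_dim] + [str(c) for c in coords[col_dim]], [row_dim]]
    for label, line in zip(coords[row_dim], table):
        rows.append([str(label)] + [str(v) for v in line])
    ncols = len(rows[0])
    widths = [max(len(r[i]) for r in rows if i < len(r)) for i in range(ncols)]
    out = []
    for r in rows:
        # Left-align the label column, right-align the value columns.
        cells = [r[0].ljust(widths[0])]
        cells += [c.rjust(w) for c, w in zip(r[1:], widths[1:])]
        out.append("  ".join(cells).rstrip())
    return "\n".join(out)


values = np.arange(200).reshape(2, 5, 2, 10)
dims = ("a", "b", "c", "d")
coords = {
    "a": ["A", "B"],
    "b": [f"Cat {i}" for i in range(1, 6)],
    "c": ["J", "K"],
    "d": list(range(10)),
}
text = slice_repr(values, dims, coords, a=0, b=0)
print(text)
```

Returning a string keeps the helper purely a display tool; anyone who needs the data itself would still go through isel() as in the snippet above.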

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Labeled repr 182638499
253624962 https://github.com/pydata/xarray/issues/1044#issuecomment-253624962 https://api.github.com/repos/pydata/xarray/issues/1044 MDEyOklzc3VlQ29tbWVudDI1MzYyNDk2Mg== chris-b1 1924092 2016-10-13T20:11:38Z 2016-10-13T20:11:51Z MEMBER

There could be some display options exposed to manage this. For instance, I personally would not like a flat array, but I can see how it could make sense.

Additionally or alternatively, the repr I'm talking about (a small slice of values laid out with coordinate labels) could be called something other than __repr__, something like pandas' .head(), although there may be a better name to use here.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Labeled repr 182638499
253566536 https://github.com/pydata/xarray/issues/1044#issuecomment-253566536 https://api.github.com/repos/pydata/xarray/issues/1044 MDEyOklzc3VlQ29tbWVudDI1MzU2NjUzNg== fmaussion 10050469 2016-10-13T16:33:08Z 2016-10-13T16:33:08Z MEMBER

I agree, but I see one or two cases where it could be useful to have the first few values for each dim. For example with geopotential data on pressure levels, it could be interesting to see how the data varies with height on the third dim. But this is a detail, not very important.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Labeled repr 182638499
253358859 https://github.com/pydata/xarray/issues/1044#issuecomment-253358859 https://api.github.com/repos/pydata/xarray/issues/1044 MDEyOklzc3VlQ29tbWVudDI1MzM1ODg1OQ== max-sixty 5635139 2016-10-12T22:31:40Z 2016-10-12T22:31:40Z MEMBER

I think dupe of https://github.com/pydata/xarray/issues/680

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Labeled repr 182638499
253347007 https://github.com/pydata/xarray/issues/1044#issuecomment-253347007 https://api.github.com/repos/pydata/xarray/issues/1044 MDEyOklzc3VlQ29tbWVudDI1MzM0NzAwNw== fmaussion 10050469 2016-10-12T21:36:37Z 2016-10-12T21:36:47Z MEMBER

Good idea! I am in favor of as few repr as possible, i.e. maybe the first few values in each dimension.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Labeled repr 182638499
253345903 https://github.com/pydata/xarray/issues/1044#issuecomment-253345903 https://api.github.com/repos/pydata/xarray/issues/1044 MDEyOklzc3VlQ29tbWVudDI1MzM0NTkwMw== shoyer 1217238 2016-10-12T21:31:58Z 2016-10-12T21:31:58Z MEMBER

Agreed, I've never been really happy with our use of the NumPy repr for >2 dimensions. It's quite hard to match up the labels.

Something like this would be a meaningful improvement! I would encourage experimentation on this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Labeled repr 182638499

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 478.923ms · About: xarray-datasette