issue_comments

4 rows where author_association = "MEMBER", issue = 187859705 and user = 4160723 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
290256766 https://github.com/pydata/xarray/issues/1092#issuecomment-290256766 https://api.github.com/repos/pydata/xarray/issues/1092 MDEyOklzc3VlQ29tbWVudDI5MDI1Njc2Ng== benbovy 4160723 2017-03-29T23:26:50Z 2017-03-29T23:26:50Z MEMBER

How to handle dimensions and coordinate names when assigning groups is clearly one of the important design decisions here. It's obvious that data variables should be grouped but less clear how to handle dimensions/coordinates.

I would be +1 for allowing tuples as data variable names, but not as dimension/coordinate names. Using tuples for the latter looks like a greater source of confusion and would add too much complexity for little (or no real?) benefit.

I'd be fine with raising an error when loading a netCDF4 file that has groups with conflicting dimensions, or when assigning an incompatible Dataset as a new group (e.g., ds['flux'] = incompatible_ds).
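To make the two rules above concrete, here is a minimal sketch in plain Python rather than xarray. It is only an illustration under stated assumptions: `FlatStore`, `add_group` and `group` are hypothetical names, not xarray API, and each variable is reduced to a mapping from dimension name to size. Data variables get tuple names such as `("flux", "u")`, dimension names stay plain strings, and assigning a group whose dimension sizes conflict with the shared ones raises an error.

```python
from typing import Dict, Tuple

DimSizes = Dict[str, int]   # dimension name -> length
VarName = Tuple[str, ...]   # tuple name of a data variable, e.g. ("flux", "u")


class FlatStore:
    """Toy stand-in (hypothetical) for a flat Dataset with tuple-named variables."""

    def __init__(self) -> None:
        self.dims: DimSizes = {}                     # one shared set of dimensions
        self.variables: Dict[VarName, DimSizes] = {}

    def add_group(self, group: str, new_vars: Dict[str, DimSizes]) -> None:
        # Validate against a copy first, so a failed assignment
        # leaves the store unchanged.
        merged = dict(self.dims)
        for name, var_dims in new_vars.items():
            for dim, size in var_dims.items():
                if merged.setdefault(dim, size) != size:
                    raise ValueError(
                        f"conflicting size for dimension {dim!r} in "
                        f"{(group, name)!r}: {size} != {merged[dim]}"
                    )
        self.dims = merged
        for name, var_dims in new_vars.items():
            self.variables[(group, name)] = dict(var_dims)

    def group(self, group: str) -> Dict[VarName, DimSizes]:
        # Select every variable whose tuple name starts with the group name,
        # and strip that leading element from the returned names.
        return {k[1:]: v for k, v in self.variables.items() if k[0] == group}
```

With this sketch, `store.add_group("flux", {"u": {"x": 3}})` followed by `store.add_group("diag", {"w": {"x": 4}})` raises a `ValueError` on the second call, because `x` would need two different sizes in a single coordinate space.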

For groups that share common dimensions/coordinates but with some differences, I think a data structure built on top of Dataset (like DatasetGroup or DatasetNode) would be more appropriate.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset groups 187859705
290130241 https://github.com/pydata/xarray/issues/1092#issuecomment-290130241 https://api.github.com/repos/pydata/xarray/issues/1092 MDEyOklzc3VlQ29tbWVudDI5MDEzMDI0MQ== benbovy 4160723 2017-03-29T15:38:46Z 2017-03-29T15:38:46Z MEMBER

@darothen you might be interested in the discussion we had here, although it doesn't solve anything related to selection across similar Dataset objects.

I think that the collection of Dataset objects with like dimensions that you suggest is indeed different from the tree-like structure within a dataset that is proposed here (the latter still uses a unique set of dimensions and coordinates).

Both approaches may co-exist, though. I can imagine the case where we have (1) a set of, e.g., grid-search or Monte Carlo model runs and (2) for each model run, diagnostic variables defined in different places on the grid (e.g., nodes, edges...). The tuple-defined groups within a Dataset are useful for (2) and the collection of Dataset objects is useful for (1).

As pointed out by @shoyer, such a collection of Dataset objects might be (preferably) implemented outside of xarray.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset groups 187859705
290065632 https://github.com/pydata/xarray/issues/1092#issuecomment-290065632 https://api.github.com/repos/pydata/xarray/issues/1092 MDEyOklzc3VlQ29tbWVudDI5MDA2NTYzMg== benbovy 4160723 2017-03-29T11:48:12Z 2017-03-29T11:48:12Z MEMBER

Just want to say that I'm very enthusiastic about this!

Like @lamorton, I also find myself having a lot of variables with names containing the name(s) of their "group(s)".

My initial idea was also to keep flat datasets and add some logic to get/set groups, but it wasn't very clear or well explained.

> One important reason to keep the tree-like structure within a dataset is that it provides some assurance to the recipient of the dataset that all the variables 'belong' in the same coordinate space.

Makes perfect sense!

I also find the idea of using tuples very clever! @shoyer do you have an idea on how it would work with serialization to netCDF? We would also have to decide how to display groups in the repr of the flat dataset...

@lamorton @shoyer unless you want to open a PR, I'd be willing to start working on this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset groups 187859705
259390660 https://github.com/pydata/xarray/issues/1092#issuecomment-259390660 https://api.github.com/repos/pydata/xarray/issues/1092 MDEyOklzc3VlQ29tbWVudDI1OTM5MDY2MA== benbovy 4160723 2016-11-09T11:15:01Z 2016-11-09T11:24:51Z MEMBER

> For example, how do groups get updated when you slice, aggregate or concatenate datasets?

Yep, once again I haven't thought through all the implications this would have! It would indeed add a lot of complexity in the end.

I'll try to follow your suggestion of building another data structure, for example (correct me if this is a wrong approach too) a DatasetGroup class which would be very similar to netCDF4.Group or h5py.Group, but where each group would contain a single Dataset.
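A rough sketch of what such a class could look like, assuming a simple tree of named nodes that each hold at most one dataset. This is not an existing xarray class: the `DatasetGroup` name comes from the comment above, while `create_group`, `path` and `walk` are hypothetical and only loosely modelled on how h5py groups are created and addressed by path. The `dataset` attribute is typed as `Any` so the example runs without xarray installed.

```python
from typing import Any, Dict, Iterator, Optional


class DatasetGroup:
    """Hypothetical tree node: one optional dataset plus named child groups."""

    def __init__(self, name: str = "/", dataset: Optional[Any] = None) -> None:
        self.name = name
        self.dataset = dataset   # would be an xarray.Dataset in practice
        self.parent: Optional["DatasetGroup"] = None
        self.children: Dict[str, "DatasetGroup"] = {}

    def create_group(self, name: str, dataset: Optional[Any] = None) -> "DatasetGroup":
        # Refuse to silently overwrite an existing child group.
        if name in self.children:
            raise ValueError(f"group {name!r} already exists")
        child = DatasetGroup(name, dataset)
        child.parent = self
        self.children[name] = child
        return child

    def __getitem__(self, path: str) -> "DatasetGroup":
        # Resolve a slash-separated path such as "model/nodes" from this node.
        node = self
        for part in path.strip("/").split("/"):
            if part:
                node = node.children[part]
        return node

    @property
    def path(self) -> str:
        # Absolute path of this node, built by walking up to the root.
        if self.parent is None:
            return "/"
        return f"{self.parent.path.rstrip('/')}/{self.name}"

    def walk(self) -> Iterator["DatasetGroup"]:
        # Depth-first traversal: this node first, then its descendants.
        yield self
        for child in self.children.values():
            yield from child.walk()
```

Because each node owns its own dataset, slicing or aggregating one group leaves the others untouched, which sidesteps the question of how groups inside a single flat Dataset should be updated.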

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Dataset groups 187859705


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);