issue_comments

4 rows where issue = 187069161 and user = 4160723 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
260645000 https://github.com/pydata/xarray/issues/1077#issuecomment-260645000 https://api.github.com/repos/pydata/xarray/issues/1077 MDEyOklzc3VlQ29tbWVudDI2MDY0NTAwMA== benbovy 4160723 2016-11-15T13:46:38Z 2016-11-15T14:33:08Z MEMBER

Yes, I'm actually not very happy with the .dataset attribute for accessing the underlying dataset object. On the other hand, as with h5py and netCDF4, I find it nice to have dict-like access to other nodes of the tree, e.g., dsnode['../othernode/childnode']. I guess this could coexist with dict-like access to dataset variables if we ensure that the names of the child nodes never conflict with the names of the dataset variables. Or maybe we could still access a child node that has the same name as a variable by writing dsnode['./name'] instead of dsnode['name']. Conflicts would remain for attribute-style access anyway...
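
A minimal sketch of the path-style lookup described above, assuming a hypothetical DatasetNode whose payload is a plain dict standing in for a dataset. This is not the gist's code and not an xarray API; the rule that "a key containing a slash means a node, a bare name means a variable" is only one possible way to resolve the naming conflict.

```python
import posixpath


class DatasetNode:
    """Hypothetical tree node; `dataset` is any mapping of variable names."""

    def __init__(self, name, dataset=None, parent=None):
        self.name = name
        self.dataset = dict(dataset or {})
        self.parent = parent
        self.children = {}
        if parent is not None:
            parent.children[name] = self

    @property
    def path(self):
        # Absolute path of this node, e.g. "/group/child"; the root is "/".
        if self.parent is None:
            return "/"
        return posixpath.join(self.parent.path, self.name)

    @property
    def root(self):
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def _get_node(self, path):
        # Resolve "." and ".." relative to this node, then walk down from the root.
        absolute = posixpath.normpath(posixpath.join(self.path, path))
        node = self.root
        for part in absolute.strip("/").split("/"):
            if part:
                node = node.children[part]
        return node

    def __getitem__(self, key):
        # Assumed rule: a bare name is a variable, a key with "/" is a node path.
        if "/" not in key:
            return self.dataset[key]
        return self._get_node(key)


root = DatasetNode("root")
dsnode = DatasetNode("dsnode", {"name": "a variable"}, parent=root)
clash = DatasetNode("name", parent=dsnode)  # child with the same name as a variable
othernode = DatasetNode("othernode", parent=root)
childnode = DatasetNode("childnode", parent=othernode)
```

With this rule, dsnode['name'] returns the variable while dsnode['./name'] returns the child node, so both stay reachable through item access. Attribute-style access has no equivalent of the './' prefix, which is why the conflict would remain there.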

@shoyer do you think that a PR for such a DatasetNode class has any chance of being merged at some point here?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex serialization to NetCDF 187069161
260162320 https://github.com/pydata/xarray/issues/1077#issuecomment-260162320 https://api.github.com/repos/pydata/xarray/issues/1077 MDEyOklzc3VlQ29tbWVudDI2MDE2MjMyMA== benbovy 4160723 2016-11-13T02:24:56Z 2016-11-13T02:24:56Z MEMBER

I've started writing a DatasetNode class (WIP): https://gist.github.com/benbovy/92e7c76220af1aaa4b3a0b65374e233a

Currently, this is a minimal class that just implements an "immutable" tree of datasets (it only allows adding child nodes so that we can build a tree).
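
For illustration only, an add-only tree of this kind could look roughly like the sketch below. It is not the code from the gist; the names (add_child, children, parent) are assumptions, and the dataset payload can be any object.

```python
from types import MappingProxyType


class DatasetNode:
    """Hypothetical add-only tree node wrapping an arbitrary dataset object."""

    def __init__(self, dataset=None):
        self._dataset = dataset
        self._parent = None
        self._children = {}

    @property
    def dataset(self):
        return self._dataset

    @property
    def parent(self):
        return self._parent

    @property
    def children(self):
        # Read-only view: callers can look up children but not add or delete them.
        return MappingProxyType(self._children)

    def add_child(self, name, dataset=None):
        # The only way to grow the tree; existing names are never overwritten.
        if name in self._children:
            raise KeyError(f"child node {name!r} already exists")
        child = DatasetNode(dataset)
        child._parent = self
        self._children[name] = child
        return child


root = DatasetNode({"a": 1})
child = root.add_child("child", {"b": 2})
```

Exposing the children through a read-only mapping view keeps the tree "immutable" in the sense used above: nodes can be attached through add_child, but never replaced or removed through the public interface.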

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex serialization to NetCDF 187069161
259412395 https://github.com/pydata/xarray/issues/1077#issuecomment-259412395 https://api.github.com/repos/pydata/xarray/issues/1077 MDEyOklzc3VlQ29tbWVudDI1OTQxMjM5NQ== benbovy 4160723 2016-11-09T13:18:09Z 2016-11-09T14:31:50Z MEMBER

> unless we want options for controlling how the MultiIndex is stored.

Yes, that's what I mean: something like categories_codes, raw_values and/or hybrid options, though I don't know if using encoding is appropriate here.

Trying to summarize the potential use cases mentioned above:

1. If we're sure that we'll only use xarray (current or newer version) to load back the files, then the categories_codes option is the way to go.
2. If we want to write files that are portable across tools other than xarray, then we could use reset_index to manually switch the multi-index back into separate coordinates before writing the file.
3. If we want both 1 and 2, then it would be convenient to have something in xarray that automatically resets / refactorizes the multi-index at writing / loading (this would be the hybrid option).

Note that point 3 is just for convenience; I wouldn't mind too much having to manually reset / refactorize the multi-index in that case. Indeed, we don't need options if point 3 is not important.
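
To make the difference between the options concrete, here is a rough sketch in plain Python of what a categories/codes layout amounts to: each level is stored once as a sorted list of categories, plus one integer code per element. The function names are hypothetical and this is not how xarray stores anything; the raw-values layout would simply keep the per-level value lists as they are.

```python
def factorize(index_tuples):
    """Split a list of tuples into per-level categories and integer codes."""
    n_levels = len(index_tuples[0])
    categories, codes = [], []
    for level in range(n_levels):
        values = [tup[level] for tup in index_tuples]
        cats = sorted(set(values))
        lookup = {value: code for code, value in enumerate(cats)}
        categories.append(cats)
        codes.append([lookup[value] for value in values])
    return categories, codes


def rebuild(categories, codes):
    """Inverse of factorize: map codes back to values and zip the levels together."""
    levels = [
        [cats[code] for code in level_codes]
        for cats, level_codes in zip(categories, codes)
    ]
    return list(zip(*levels))


index = [("a", 1), ("a", 2), ("b", 1)]
cats, codes = factorize(index)
```

A tool that does not know about this layout only sees integer code arrays, which is the portability concern behind use case 2.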

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex serialization to NetCDF 187069161
258476068 https://github.com/pydata/xarray/issues/1077#issuecomment-258476068 https://api.github.com/repos/pydata/xarray/issues/1077 MDEyOklzc3VlQ29tbWVudDI1ODQ3NjA2OA== benbovy 4160723 2016-11-04T16:14:59Z 2016-11-04T16:14:59Z MEMBER

I have the exact same applications as yours @tippetts, but I would also like to write netCDF files that are compatible with tools other than xarray. With the category-encoded values as the default behavior, my concern is that xarray users may be unaware that they generate netCDF files which have limited compatibility with 3rd-party tools, unless a clear warning is given in the documentation.

> One consideration in favor of this is that it will soon be very easy to switch a MultiIndex back into separate coordinate variables, which could be our recommendation for how to save netCDF files for maximum portability.

This should be fine, but maybe it would be nice to allow handling this automatically (at read and write) by using a specific encoding attribute? I haven't looked much into xarray's IO and serialization logic, so I don't know if it is the right approach. This would be convenient for loading back the generated netCDF files with both xarray and 3rd-party tools, though.
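
A sketch of what such automatic handling could look like, using plain dicts as a stand-in for a netCDF file. The attribute name multiindex_levels and the write / read helpers are purely hypothetical, not an existing xarray or CF convention: the writer flattens the multi-index into one plain variable per level and records the level names in an attribute, and the reader rebuilds the index only when that attribute is present.

```python
MARKER = "multiindex_levels"  # hypothetical attribute name, not an existing convention


def write(index_name, index_tuples, level_names):
    """Flatten a multi-index into one plain variable per level, plus a marker attribute."""
    variables = {
        name: [tup[i] for tup in index_tuples]
        for i, name in enumerate(level_names)
    }
    attrs = {MARKER: {index_name: list(level_names)}}
    return {"variables": variables, "attrs": attrs}


def read(stored):
    """Rebuild the multi-indexes named by the marker attribute, if there is one."""
    indexes = {}
    for index_name, level_names in stored["attrs"].get(MARKER, {}).items():
        columns = [stored["variables"][name] for name in level_names]
        indexes[index_name] = list(zip(*columns))
    return indexes


stored = write("x", [("a", 1), ("b", 2)], ["letter", "number"])
```

A 3rd-party tool would ignore the attribute and read letter and number as ordinary variables, while a reader aware of the marker gets the multi-index back without a manual step.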

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  MultiIndex serialization to NetCDF 187069161

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);