issue_comments


2 rows where author_association = "MEMBER" and issue = 1077079208 sorted by updated_at descending


id: 1059403646 · node_id: IC_kwDOAMm_X84_JTd-
user: dcherian (2448579) · author_association: MEMBER
created_at: 2022-03-04T18:14:18Z · updated_at: 2022-03-04T18:14:18Z
issue: to_zarr: region not recognised as dataset dimensions (1077079208)
html_url: https://github.com/pydata/xarray/issues/6069#issuecomment-1059403646
issue_url: https://api.github.com/repos/pydata/xarray/issues/6069

:+1: to creating a new issue with your minimal example (I think we're just missing a check of whether the Dataset and on-disk fill values are equal). It did seem like there were two issues mixed up here. Thanks for confirming that.

reactions: none
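
dcherian's parenthetical above suggests comparing in-memory and on-disk fill values. A minimal sketch of what such a check could look like follows; the paths are placeholders, and reading _FillValue from the store's attributes is an assumption for illustration, not xarray's actual internal logic:

import numpy as np
import xarray as xr
import zarr

# Hypothetical sketch of the missing check dcherian mentions: compare each
# variable's in-memory fill value with the one already stored on disk.
# "input.nc" and "output.zarr" are placeholder paths.
ds = xr.open_dataset("input.nc")
store = zarr.open_group("output.zarr", mode="r")

for name, var in ds.variables.items():
    if name not in store:
        continue
    disk_fill = store[name].attrs.get("_FillValue")
    mem_fill = var.encoding.get("_FillValue", var.attrs.get("_FillValue"))
    if disk_fill is not None and mem_fill is not None:
        if not np.array_equal(disk_fill, mem_fill):
            raise ValueError(f"fill value mismatch for variable {name!r}")
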
id: 1034196986 · node_id: IC_kwDOAMm_X849pJf6
user: shoyer (1217238) · author_association: MEMBER
created_at: 2022-02-09T21:12:31Z · updated_at: 2022-02-09T21:12:31Z
issue: to_zarr: region not recognised as dataset dimensions (1077079208)
html_url: https://github.com/pydata/xarray/issues/6069#issuecomment-1034196986
issue_url: https://api.github.com/repos/pydata/xarray/issues/6069

This isn't allowed because it's ambiguous what to do with the other variables that are not restricted to the region (['cell', 'face', 'layer', 'max_cell_node', 'max_face_nodes', 'node', 'siglay'] in this case).

I can imagine quite a few different ways this behavior could be implemented:

  1. Ignore these variables entirely.
  2. Ignore variables if they also already exist, but write new ones.
  3. Write or overwrite both new and existing variables.
  4. Write new variables; ignore existing variables only if they already exist with the same values, and raise an error otherwise.

I believe your proposal here (removing these checks from _validate_region) would achieve (3), but I'm not sure that's the best option.

(4) seems like the most user-friendly option, but checking existing variables can add significant overhead. When experimenting with adding region support to Xarray-Beam, I found many cases where it was easy to inadvertently make large parallel pipelines much slower by downloading existing variables.

The current solution is not to do any of these, but to force the user to make an explicit choice by dropping new variables or writing them in a separate call to to_zarr. I think it would also be OK to let a user explicitly opt in to one of these behaviors, but I don't think guessing what the user wants would be ideal.

reactions: none
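
The explicit choice shoyer describes, dropping the variables that are not restricted to the region before writing, might look like the following sketch. The paths and the time-slice region are placeholders; drop_vars, to_zarr, and its region argument are real xarray APIs, and a region write assumes the target store already contains the full-size arrays:

import xarray as xr

# Placeholder dataset and region; in the issue's example the region covers
# the 'time' dimension while variables like 'cell' or 'face' do not use it.
ds = xr.open_dataset("input.nc")
region = {"time": slice(0, 100)}

# Explicitly drop every variable that has no dimension in the region, so
# the region write only touches data that actually falls inside it.
to_drop = [
    name
    for name, var in ds.variables.items()
    if not set(region).intersection(var.dims)
]
ds.drop_vars(to_drop).to_zarr("output.zarr", region=region)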

Table schema:
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
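
For reference, the filtered view at the top of this page can be reproduced against a local copy of the database. This sketch assumes a hypothetical file named github.db and uses only columns defined in the schema above:

import sqlite3

# Placeholder filename for a local copy of this Datasette database.
conn = sqlite3.connect("github.db")

# Same filter as the page header: MEMBER comments on issue 1077079208,
# sorted by updated_at descending.
rows = conn.execute(
    """
    SELECT id, user, created_at, updated_at, body
    FROM issue_comments
    WHERE author_association = ?
      AND issue = ?
    ORDER BY updated_at DESC
    """,
    ("MEMBER", 1077079208),
).fetchall()

for comment_id, user_id, created, updated, body in rows:
    print(comment_id, updated)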