issue_comments

4 rows where issue = 1200716594 and user = 62192187 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1124068133 https://github.com/pydata/xarray/pull/6476#issuecomment-1124068133 https://api.github.com/repos/pydata/xarray/issues/6476 IC_kwDOAMm_X85C_-sl cisaacstern 62192187 2022-05-11T17:39:42Z 2022-05-11T17:39:42Z CONTRIBUTOR

Thanks all for your mentorship on this! Excited to continue contributing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fix zarr append dtype checks 1200716594
1122850296 https://github.com/pydata/xarray/pull/6476#issuecomment-1122850296 https://api.github.com/repos/pydata/xarray/issues/6476 IC_kwDOAMm_X85C7VX4 cisaacstern 62192187 2022-05-10T20:54:05Z 2022-05-10T20:55:46Z CONTRIBUTOR

@shoyer just wanted to chime in with a bump to say that I'll greatly appreciate your review of this PR when you get a moment.

This fix is currently the only blocker for https://github.com/pangeo-forge/staged-recipes/issues/120. (I've just confirmed today that installing xarray from this PR branch resolves the error there, as detailed in bullets 2-3 of https://github.com/pangeo-forge/staged-recipes/issues/120#issuecomment-1122846802.)

I'm sure you're quite busy, so just want to emphasize how much I appreciate your attention to this, whenever you get a chance.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fix zarr append dtype checks 1200716594
1097408858 https://github.com/pydata/xarray/pull/6476#issuecomment-1097408858 https://api.github.com/repos/pydata/xarray/issues/6476 IC_kwDOAMm_X85BaSFa cisaacstern 62192187 2022-04-13T00:08:33Z 2022-04-13T19:59:32Z CONTRIBUTOR

Here's a first pass at a solution for #6345. This is a new area for me, so I certainly look forward to feedback from those with more experience.

The issue identified in #6345 was lack of support for append-mode Zarr writes for variables with dtype "|S*" where * is a positive integer. IIUC (but I may not), this datatype represents fixed-length byte strings, where * is the length in bytes.
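As a small illustration of the two fixed-width string dtypes discussed in this thread, the sketch below inspects them with plain NumPy. It assumes only that NumPy is installed; it does not touch xarray or Zarr. In NumPy, the width of an `S` dtype is counted in bytes and the width of a `U` dtype in code points (stored as four bytes each).

```python
import numpy as np

# Fixed-width byte strings: kind "S", one byte per unit of width.
bytes_dtype = np.dtype("S3")
# Fixed-width unicode strings: kind "U", four bytes per code point.
unicode_dtype = np.dtype("U5")

print(bytes_dtype.kind, bytes_dtype.itemsize)      # S 3
print(unicode_dtype.kind, unicode_dtype.itemsize)  # U 20

# NumPy infers the width from the longest element in the array.
inferred = np.array([b"ab", b"abc"]).dtype
print(inferred == bytes_dtype)  # True
```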

The MRE at the top of the linked issue demonstrated this problem for the case in which a new variable is being added, to which @shoyer replied in https://github.com/pydata/xarray/issues/6345#issuecomment-1065381346:

> I think the original issue was that appending a fixed-width string could be a problem if the fixed-width does not match the width of the existing string dtype stored in Zarr. ... This obviously doesn't apply in this case, because you are adding an entirely new variable. So I guess the check could be removed in that case.

In this comment, Stephan also links to the motivating case for the datatype check, which was to prevent truncation of strings of a greater maximum length (e.g. <U5) when appended to an existing array of strings of a smaller maximum length (e.g. <U2).
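The truncation risk can be reproduced with NumPy alone. The sketch below is only an analogy for what the dtype check guards against: it shows NumPy's casting behavior, not the Zarr write path itself, and the variable names are made up for illustration.

```python
import numpy as np

existing = np.array(["ab", "cd"], dtype="<U2")  # stands in for the stored array
incoming = np.array(["hello"], dtype="<U5")     # stands in for the data to append

# Casting the wider strings to the narrower dtype silently drops characters.
truncated = incoming.astype(existing.dtype)
print(truncated)  # ['he']

# Concatenating in memory instead promotes to the wider dtype, so nothing is lost.
combined = np.concatenate([existing, incoming])
print(combined.dtype.itemsize // 4)  # 5 (width in code points)
```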

In sum, it seems to me that the requirements for a fix to these datatype checks are as follows:

  1. Support append of new variables of dtype |S* and <U* regardless of their length value (represented by * here)
  2. Support append to existing variables of dtype |S* and <U*, if dtype of the variable to append matches existing dtype exactly (i.e. allow appending |S2 to |S2, <U3 to <U3, etc.)
  3. Raise an exception only when the user attempts to append length-specified string data (of type |S* or <U*) to an existing array of data with a different datatype (i.e. appending |S3 to |S2, <U5 to <U2, etc. is not allowed)

This PR accomplishes this by leaving the initial checks largely intact, with the following adjustments:

| Existing checks | This PR |
| --------------- | -------- |
| Raise an error in every case if the datatype of the variable to append is not known to easily be appended (e.g. floats, etc.) | If the datatype of the variable to append is not known to be easy to append, only raise an error if its datatype does not exactly match the datatype of the corresponding variable in the existing store |
| Opinionated about datatype regardless of whether the variable is present or not in the existing store | Permissive of all datatypes if the user is adding an entirely new variable (because no potential incompatibility in this case) |
| Assumes variable will be easy to append if coding.strings.is_unicode_dtype evaluates to True | Removes this assumption, because it turns out that xr.coding.strings.is_unicode_dtype(np.dtype("<U5")) evaluates to True, and this is one of the cases where we want to ensure exact length equality. |

I've incorporated coverage for the above-listed requirements into the test suite. The way the datatype validation is now written, any datatype (not just |S* and <U*) that is not known to be easy-to-append will pass the check only if its type matches the datatype of the corresponding variable in the existing store. I'm not aware of what specific other datatypes may fall into this category, but assume that if they are problematic, an error will be raised eventually at the Zarr level.
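The validation rule described above can be sketched in a few lines of NumPy-only Python. This is a simplified illustration, not the code in the PR: `is_easy_to_append` and `validate_append` are hypothetical names, plain dicts of dtypes stand in for the dataset and the existing store, and the set of "easy to append" dtypes (numbers, datetimes, booleans) is an assumption made for the example.

```python
import numpy as np


def is_easy_to_append(dtype):
    """Hypothetical helper: dtypes assumed here to be safe to append as-is."""
    return bool(
        np.issubdtype(dtype, np.number)
        or np.issubdtype(dtype, np.datetime64)
        or np.issubdtype(dtype, np.bool_)
    )


def validate_append(new_dtypes, existing_dtypes):
    """Raise ValueError if an appended variable's dtype could clash with the store.

    Both arguments map variable names to NumPy dtypes (a stand-in for the
    dataset being appended and the variables already in the store).
    """
    for name, dtype in new_dtypes.items():
        if name not in existing_dtypes:
            continue  # entirely new variable: nothing to be incompatible with
        if is_easy_to_append(dtype):
            continue  # assumed safe without an exact dtype match
        if dtype != existing_dtypes[name]:
            raise ValueError(
                f"dtype mismatch for {name!r}: "
                f"appending {dtype} to existing {existing_dtypes[name]}"
            )


store = {"name": np.dtype("U2"), "count": np.dtype("int64")}

# Allowed: exact string dtype match, an easy numeric dtype, and a new variable.
validate_append(
    {"name": np.dtype("U2"), "count": np.dtype("int32"), "tag": np.dtype("S7")},
    store,
)

# Rejected: a wider fixed-width string than the one already stored.
try:
    validate_append({"name": np.dtype("U5")}, store)
except ValueError as err:
    print(err)
```

Under these assumptions, a mismatch is reported only for variables that already exist in the store and whose dtype is not in the "easy" set, which mirrors the three requirements listed earlier in this comment.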

Thank you in advance to reviewers of my first xarray PR. (Noting also that it looks like the docs build failure is common to other current PRs, and not specific to this PR.)

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fix zarr append dtype checks 1200716594
1095736626 https://github.com/pydata/xarray/pull/6476#issuecomment-1095736626 https://api.github.com/repos/pydata/xarray/issues/6476 IC_kwDOAMm_X85BT50y cisaacstern 62192187 2022-04-12T00:35:20Z 2022-04-12T00:35:20Z CONTRIBUTOR

This WIP should close https://github.com/pydata/xarray/issues/6345 when complete. I'll mark as Ready for review along with an explanatory comment when it's ready.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Fix zarr append dtype checks 1200716594

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
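
For readers who want to reproduce the filter shown at the top of this page, here is a minimal sketch using Python's built-in `sqlite3` module. It loads the schema above into an in-memory database and inserts only the ids and timestamps of the four comments listed on this page; the other columns are left NULL for brevity. The `[users]` and `[issues]` tables are not created, which SQLite tolerates because foreign-key enforcement is off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [html_url] TEXT,
       [issue_url] TEXT,
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [user] INTEGER REFERENCES [users]([id]),
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [issue] INTEGER REFERENCES [issues]([id])
    );
    CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
    CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
    """
)

# (id, user, created_at, updated_at, issue), copied from the rows on this page.
rows = [
    (1095736626, 62192187, "2022-04-12T00:35:20Z", "2022-04-12T00:35:20Z", 1200716594),
    (1097408858, 62192187, "2022-04-13T00:08:33Z", "2022-04-13T19:59:32Z", 1200716594),
    (1122850296, 62192187, "2022-05-10T20:54:05Z", "2022-05-10T20:55:46Z", 1200716594),
    (1124068133, 62192187, "2022-05-11T17:39:42Z", "2022-05-11T17:39:42Z", 1200716594),
]
conn.executemany(
    "INSERT INTO issue_comments (id, [user], created_at, updated_at, issue) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps stored as TEXT sort correctly as plain strings.
result = conn.execute(
    "SELECT id, updated_at FROM issue_comments "
    "WHERE issue = ? AND [user] = ? ORDER BY updated_at DESC",
    (1200716594, 62192187),
).fetchall()

for comment_id, updated_at in result:
    print(comment_id, updated_at)
```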
Powered by Datasette · Queries took 79.897ms · About: xarray-datasette