
issue_comments

5 rows where issue = 1432388736 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
1383192335 https://github.com/pydata/xarray/issues/7245#issuecomment-1383192335 https://api.github.com/repos/pydata/xarray/issues/7245 IC_kwDOAMm_X85ScdcP hmaarrfk 90008 2023-01-15T16:23:15Z 2023-01-15T16:23:15Z CONTRIBUTOR

Thank you for your explanation.

Do you think it is safe to "strip" encoding after "loading" the data, or is it still used after the initial call to open_dataset?
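
For illustration, a minimal sketch of stripping encoding right after loading; the file name and the choice to clear every variable's encoding are assumptions, not something xarray prescribes:

```python
import xarray as xr

# Hypothetical input file.
ds = xr.open_dataset("example.nc")

# Clear whatever the backend recorded in .encoding (dtype, compression,
# "coordinates", ...) so later writes depend only on encoding that is set
# explicitly.
for var in ds.variables.values():
    var.encoding = {}
```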

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  coordinates not removed for variable encoding during reset_coords 1432388736
1383183939 https://github.com/pydata/xarray/issues/7245#issuecomment-1383183939 https://api.github.com/repos/pydata/xarray/issues/7245 IC_kwDOAMm_X85ScbZD dcherian 2448579 2023-01-15T15:45:19Z 2023-01-15T15:45:19Z MEMBER

This is another motivating reason for #5082. It's too hard to keep attrs or encoding in sync given Xarray's data model.

Since encoding is frequently out of date, it just causes a lot of problems. In general, the advice is to set encoding manually if you care about how your dataset is written to disk.
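
For illustration, a hedged sketch of that advice; the variable name and the compression settings are made-up assumptions:

```python
import xarray as xr

ds = xr.open_dataset("example.nc")  # hypothetical input

# Spell out the on-disk encoding explicitly instead of relying on whatever
# open_dataset left in ds["temp"].encoding ("temp" is an assumed name).
ds.to_netcdf(
    "out.nc",
    encoding={"temp": {"dtype": "float32", "zlib": True, "complevel": 4}},
)
```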

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  coordinates not removed for variable encoding during reset_coords 1432388736
1369001951 https://github.com/pydata/xarray/issues/7245#issuecomment-1369001951 https://api.github.com/repos/pydata/xarray/issues/7245 IC_kwDOAMm_X85RmU_f hmaarrfk 90008 2023-01-02T14:41:45Z 2023-01-02T14:41:45Z CONTRIBUTOR

Kind bump

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  coordinates not removed for variable encoding during reset_coords 1432388736
1300527716 https://github.com/pydata/xarray/issues/7245#issuecomment-1300527716 https://api.github.com/repos/pydata/xarray/issues/7245 IC_kwDOAMm_X85NhHpk hmaarrfk 90008 2022-11-02T14:27:04Z 2022-11-02T14:27:04Z CONTRIBUTOR

While the above "fix" addresses the issues with renaming coordinates, I think there are plenty of use cases where we would still end up with strange or unexpected results. For example (sketched in code after the list):

  1. Load a dataset with many non-indexing coordinates.
  2. Drop variables (that happen to be coordinates).
  3. Then add back a variable with the same name.
  4. Upon save, encoding would dictate that it is a coordinate of a particular variable and promote it to a coordinate instead of data.
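
For illustration, a hypothetical version of that sequence; the dataset, the names temp and lat, and the manually planted encoding entry (standing in for what a real open_dataset round-trip leaves behind) are all assumptions:

```python
import numpy as np
import xarray as xr

# Step 1: a dataset with a non-index coordinate "lat".
ds = xr.Dataset(
    {"temp": ("x", np.arange(3.0))},
    coords={"lat": ("x", np.linspace(0.0, 1.0, 3))},
)
# Stand-in for what open_dataset records after a round-trip to disk.
ds["temp"].encoding["coordinates"] = "lat"

ds = ds.drop_vars("lat")        # step 2: drop the coordinate variable
ds["lat"] = ("x", np.zeros(3))  # step 3: re-add the same name as plain data

# Step 4: on a later ds.to_netcdf(...), the stale encoding["coordinates"]
# entry can cause "lat" to be written (and re-read) as a coordinate of
# "temp" again, per the behaviour described in this issue.
```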

We could apply the "fix" to the drop_vars method as well, but I think it may be hard (though not impossible) to hit all the cases.

I think a more "generic", albeit "breaking", fix would be to remove the "coordinates" entry entirely from encoding after the dataset has been loaded. That said, this only "works" if dataset['variable_name'].encoding['coordinates'] is considered private, i.e. users are not supposed to be adding to it at will.
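
For illustration, a sketch of that breaking alternative, assuming it were applied to every variable right after load; the file name is hypothetical:

```python
import xarray as xr

ds = xr.open_dataset("example.nc")  # hypothetical input

# Drop only the "coordinates" entry that decoding stored in encoding,
# leaving the rest of each variable's encoding untouched.
for var in ds.variables.values():
    var.encoding.pop("coordinates", None)
```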

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  coordinates not removed for variable encoding during reset_coords 1432388736
1299492524 https://github.com/pydata/xarray/issues/7245#issuecomment-1299492524 https://api.github.com/repos/pydata/xarray/issues/7245 IC_kwDOAMm_X85NdK6s hmaarrfk 90008 2022-11-02T02:49:58Z 2022-11-02T02:57:37Z CONTRIBUTOR

And if you want to have a clean encoding dictionary, you may want to do the following:

```python
names = set(names)
for _, variable in obj._variables.items():
    if 'coordinates' in variable.encoding:
        coords_in_encoding = set(variable.encoding.get('coordinates').split(' '))
        remaining_coords = coords_in_encoding - names
        if len(remaining_coords) == 0:
            del variable.encoding['coordinates']
        else:
            variable.encoding['coordinates'] = ' '.join(remaining_coords)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  coordinates not removed for variable encoding during reset_coords 1432388736

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);