issue_comments

4 rows where author_association = "MEMBER" and issue = 1373352524 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1460185069 https://github.com/pydata/xarray/issues/7039#issuecomment-1460185069 https://api.github.com/repos/pydata/xarray/issues/7039 IC_kwDOAMm_X85XCKft rabernat 1197350 2023-03-08T13:51:06Z 2023-03-08T13:51:06Z MEMBER

Rather than using the scale_factor and add_offset approach, I would look into xbitinfo if you want to optimize your compression.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Encoding error when saving netcdf 1373352524
1248302788 https://github.com/pydata/xarray/issues/7039#issuecomment-1248302788 https://api.github.com/repos/pydata/xarray/issues/7039 IC_kwDOAMm_X85KZ5bE rabernat 1197350 2022-09-15T16:02:17Z 2022-09-15T16:02:17Z MEMBER

> I am curious as to what exactly from the encoding introduces the noise (I still need to read through the documentation more thoroughly)?

The encoding says that your data should be encoded according to the following pseudocode formula:

    encoded = int((original - offset) / scale_factor)
    decoded = (scale_factor * float(encoded)) + offset

So the floating-point data are converted back and forth to a less precise type (integer) in order to save space. These numerical operations cannot preserve exact floating-point accuracy. That's just how numerical floating-point operations work. If you skip the encoding, then you just write the floating-point bytes directly to disk, with no loss of precision.
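As a rough illustration of the round trip described by the pseudocode above, here is a minimal sketch in plain Python. The `scale_factor` and `add_offset` values and the sample temperatures are hypothetical, chosen only for the demonstration, and the sketch rounds to the nearest integer before casting, which is an assumption on top of the bare `int(...)` in the pseudocode.

```python
def encode(value, scale_factor, add_offset):
    """Pack a float into an integer (rounding to nearest is an assumption)."""
    return int(round((value - add_offset) / scale_factor))


def decode(packed, scale_factor, add_offset):
    """Unpack an integer back into a float."""
    return scale_factor * float(packed) + add_offset


# Hypothetical packing parameters, not taken from the issue
scale_factor = 0.01
add_offset = 273.15

# Hypothetical sample values (temperatures in kelvin)
original = [250.0, 273.15, 287.123456, 301.987654]

packed = [encode(v, scale_factor, add_offset) for v in original]
restored = [decode(p, scale_factor, add_offset) for p in packed]

# Absolute round-trip error for each value
errors = [abs(a - b) for a, b in zip(original, restored)]
max_error = max(errors)

for value, p, r, e in zip(original, packed, restored, errors):
    print(f"{value!r:>12} -> {p:>6} -> {r!r:<20} error={e:.3e}")
```

With rounding to nearest, the error per value stays within about half of `scale_factor`. Values that already sit on the quantization grid (such as `250.0` here) come back with only floating-point rounding noise, which is the kind of tiny difference discussed in this thread.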

This sort of encoding is a crude form of lossy compression that is still unfortunately in use, even though there are much better algorithms available (and built into netcdf and zarr). Differences on the order of 10^-14 should not affect any real-world calculations.

However, this seems like a much, much smaller difference than the problem you originally reported. This suggests that the MRE does not actually reproduce the bug after all. How was the plot above (https://github.com/pydata/xarray/issues/7039#issue-1373352524) generated? From your actual MRE code? Or from your earlier example with real data?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Encoding error when saving netcdf 1373352524
1248241823 https://github.com/pydata/xarray/issues/7039#issuecomment-1248241823 https://api.github.com/repos/pydata/xarray/issues/7039 IC_kwDOAMm_X85KZqif rabernat 1197350 2022-09-15T15:12:34Z 2022-09-15T15:12:34Z MEMBER

I'm puzzled that I was not able to reproduce this error. I modified the end slightly as follows

```python
# save dataset as netcdf
ds.to_netcdf("test.nc")

# load saved dataset
ds_test = xr.open_dataset('test.nc')

# verify that the two are equal within numerical precision
xr.testing.assert_allclose(ds, ds_test)

# plot
plt.plot(ds.t2m - ds_test.t2m)
```

In my case, the differences were just numerical noise (order 10^-14)

I used the binder environment for this.

I'm pretty stumped.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Encoding error when saving netcdf 1373352524
1248098918 https://github.com/pydata/xarray/issues/7039#issuecomment-1248098918 https://api.github.com/repos/pydata/xarray/issues/7039 IC_kwDOAMm_X85KZHpm rabernat 1197350 2022-09-15T13:25:11Z 2022-09-15T13:25:11Z MEMBER

Thanks so much for taking the time to write up this detailed bug report! 🙏

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Encoding error when saving netcdf 1373352524


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
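
The filter shown at the top of this page can be expressed as a plain SQL query against this schema. The following is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. It creates only a subset of the columns and loads the four comment rows listed above; the exact SQL that Datasette generates is not shown on this page, so the query text is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the issue_comments table (subset of columns only)
conn.execute(
    """
    CREATE TABLE [issue_comments] (
        [id] INTEGER PRIMARY KEY,
        [user] INTEGER,
        [updated_at] TEXT,
        [author_association] TEXT,
        [issue] INTEGER
    )
    """
)

# The four rows shown on this page, inserted in arbitrary order
rows = [
    (1248241823, 1197350, "2022-09-15T15:12:34Z", "MEMBER", 1373352524),
    (1460185069, 1197350, "2023-03-08T13:51:06Z", "MEMBER", 1373352524),
    (1248098918, 1197350, "2022-09-15T13:25:11Z", "MEMBER", 1373352524),
    (1248302788, 1197350, "2022-09-15T16:02:17Z", "MEMBER", 1373352524),
]
conn.executemany("INSERT INTO [issue_comments] VALUES (?, ?, ?, ?, ?)", rows)

# Assumed equivalent of the page filter, with bound parameters
query = """
    SELECT [id], [updated_at]
    FROM [issue_comments]
    WHERE [author_association] = ? AND [issue] = ?
    ORDER BY [updated_at] DESC
"""
result = conn.execute(query, ("MEMBER", 1373352524)).fetchall()

for comment_id, updated_at in result:
    print(comment_id, updated_at)
```

Because the timestamps are stored as ISO 8601 text in UTC, sorting them as strings gives the same order as sorting them chronologically.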
Powered by Datasette · Queries took 13.487ms · About: xarray-datasette