issue_comments

5 rows where author_association = "MEMBER", issue = 1643408278 and user = 5821660 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1492078304 https://github.com/pydata/xarray/issues/7691#issuecomment-1492078304 https://api.github.com/repos/pydata/xarray/issues/7691 IC_kwDOAMm_X85Y707g kmuehlbauer 5821660 2023-03-31T15:05:17Z 2023-03-31T15:05:17Z MEMBER

, the PR seems to solve my specific issue without changing the encoding

Great, thanks for testing.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `nan` values appearing when saving and loading from `netCDF` due to encoding 1643408278
1491915288 https://github.com/pydata/xarray/issues/7691#issuecomment-1491915288 https://api.github.com/repos/pydata/xarray/issues/7691 IC_kwDOAMm_X85Y7NIY kmuehlbauer 5821660 2023-03-31T13:19:01Z 2023-03-31T13:19:01Z MEMBER

@euronion There is a potential fix for your issue in #7654. It would be great if you could take a closer look and test against that PR.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `nan` values appearing when saving and loading from `netCDF` due to encoding 1643408278
1486870845 https://github.com/pydata/xarray/issues/7691#issuecomment-1486870845 https://api.github.com/repos/pydata/xarray/issues/7691 IC_kwDOAMm_X85Yn9k9 kmuehlbauer 5821660 2023-03-28T13:16:31Z 2023-03-28T13:31:46Z MEMBER

MCVE:

    import netCDF4 as nc
    import numpy as np
    import xarray as xr

    fname = "test-7691.nc"

    # Write packed int16 data with float64 scale_factor/add_offset.
    with nc.Dataset(fname, "w") as ds0:
        ds0.createDimension("t", 5)
        ds0.createVariable("x", "int16", ("t",), fill_value=-32767)
        v = ds0.variables["x"]
        v.set_auto_maskandscale(False)
        v.add_offset = 278.297319296597
        v.scale_factor = 1.16753614203674e-05
        v[:] = np.array([-32768, -32767, -32766, 32767, 0])

    # Read back with netCDF4-python.
    with nc.Dataset(fname) as ds1:
        x1 = ds1["x"][:]
        print("netCDF4-python:", x1.dtype, x1)

    # Read with xarray and write a copy.
    with xr.open_dataset(fname) as ds2:
        x2 = ds2["x"].values
        ds2.to_netcdf("test-7691-01.nc")
        print("xarray first read:", x2.dtype, x2)

    # Read the copy written by xarray.
    with xr.open_dataset("test-7691-01.nc") as ds3:
        x3 = ds3["x"].values
        print("xarray roundtrip:", x3.dtype, x3)

    netCDF4-python: float64 [277.9147410535744 -- 277.9147644042972 278.67988586425815 278.297319296597]
    xarray first read: float32 [277.91476 nan 277.91476 278.6799 278.29733]
    xarray roundtrip: float32 [ nan nan nan 278.6799 278.29733]

I've confirmed that correctly promoting to float64 in CFMaskCoder solves this issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `nan` values appearing when saving and loading from `netCDF` due to encoding 1643408278
1486817329 https://github.com/pydata/xarray/issues/7691#issuecomment-1486817329 https://api.github.com/repos/pydata/xarray/issues/7691 IC_kwDOAMm_X85Ynwgx kmuehlbauer 5821660 2023-03-28T12:41:43Z 2023-03-28T12:41:43Z MEMBER

As this doesn't surface that often, it might just happen here by accident. If the _FillValue/missing_value were -32768, the issue would not manifest.

So for NetCDF the default fill value for NC_SHORT (int16) is -32767. That means the promotion to float32 instead of the needed float64 is the problem here (floating point precision).
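
The precision argument can be checked with NumPy alone. The sketch below is an illustration only, not xarray's actual code path: it reuses the scale_factor, add_offset and packed values from the MCVE in this thread, and the `roundtrip` helper is a hypothetical name introduced here. It compares the gap between adjacent float32 values near the offset with scale_factor, then unpacks and repacks the data in float32 and in float64.

```python
import numpy as np

# Values taken from the MCVE in this thread.
scale_factor = 1.16753614203674e-05
add_offset = 278.297319296597
fill_value = -32767
packed = np.array([-32768, -32767, -32766, 32767, 0], dtype=np.int16)

# Gap between adjacent float32 values near the offset.
spacing32 = np.spacing(np.float32(add_offset))
print("float32 spacing near offset:", spacing32)
print("scale_factor:               ", scale_factor)


def roundtrip(packed, dtype):
    """Unpack to `dtype`, then pack again (hypothetical helper, illustration only)."""
    scale = dtype(scale_factor)
    offset = dtype(add_offset)
    unpacked = packed.astype(dtype) * scale + offset
    repacked = np.rint((unpacked - offset) / scale)
    repacked = np.clip(repacked, -32768, 32767).astype(np.int16)
    return unpacked, repacked


unpacked32, repacked32 = roundtrip(packed, np.float32)
unpacked64, repacked64 = roundtrip(packed, np.float64)

print("float32 repacked:", repacked32)
print("float64 repacked:", repacked64)
print("float32 values equal to fill value after repacking:",
      int((repacked32 == fill_value).sum()))
```

Because scale_factor is smaller than the float32 spacing in this value range, neighbouring packed integers cannot all map to distinct float32 values, so repacking cannot recover them. Any of them that lands on the fill value is read back as `nan`.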

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `nan` values appearing when saving and loading from `netCDF` due to encoding 1643408278
1486532302 https://github.com/pydata/xarray/issues/7691#issuecomment-1486532302 https://api.github.com/repos/pydata/xarray/issues/7691 IC_kwDOAMm_X85Ymq7O kmuehlbauer 5821660 2023-03-28T09:37:58Z 2023-03-28T09:46:57Z MEMBER

Thanks for all the details, @euronion.

From what I can tell, everything is OK with the original file. It's using packed data: https://docs.unidata.ucar.edu/nug/current/best_practices.html#bp_Packed-Data-Values. The only thing that might be a bit off is that they didn't choose -32768 as _FillValue.

As both scale_factor and add_offset are of dtype float64 in the original file, the data should be unpacked to float64 according to the NetCDF specs.

The reason this isn't done is that the

https://github.com/pydata/xarray/blob/020b4c07047189c5c788eca9e6e77d64b8989d58/xarray/conventions.py#L379-L384

CFMaskCoder will promote int16 to float32 unconditionally. This happens in dtypes.maybe_promote():

https://github.com/pydata/xarray/blob/020b4c07047189c5c788eca9e6e77d64b8989d58/xarray/core/dtypes.py#L67-L70

The CFScaleOffsetCoder itself is able to correctly convert this to the wanted dtype float64:

https://github.com/pydata/xarray/blob/e79eaf5acdcda62f27ce81f08e7e71839887d3d1/xarray/coding/variables.py#L235-L251

As this doesn't surface that often, it might just happen here by accident. If the _FillValue/missing_value were -32768, the issue would not manifest.

Update: corrected to maybe_promote()
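
The rule described above, unpacking to the dtype of scale_factor and add_offset, could look roughly like the following. This is a minimal sketch under stated assumptions, not the xarray implementation: `unpacked_dtype` and `unpack` are hypothetical names, and falling back to at least float32 is an assumption made here for illustration.

```python
import numpy as np


def unpacked_dtype(scale_factor, add_offset):
    """Hypothetical helper: float dtype wide enough for both packing attributes.

    Assumption: never go below float32.
    """
    return np.result_type(
        np.asarray(scale_factor).dtype,
        np.asarray(add_offset).dtype,
        np.float32,
    )


def unpack(packed, scale_factor, add_offset, fill_value):
    """Unpack integers using the attribute dtype, masking the fill value as nan."""
    dtype = unpacked_dtype(scale_factor, add_offset)
    data = packed.astype(dtype) * dtype.type(scale_factor) + dtype.type(add_offset)
    # Mask on the packed integers, where the comparison is exact.
    data[packed == fill_value] = np.nan
    return data


# Attribute values from the MCVE; float64 as in the original file.
scale_factor = np.float64(1.16753614203674e-05)
add_offset = np.float64(278.297319296597)
packed = np.array([-32768, -32767, -32766, 32767, 0], dtype=np.int16)

out = unpack(packed, scale_factor, add_offset, fill_value=-32767)
print(out.dtype, out)
```

With float64 attributes the result stays float64, so the values on either side of the fill value remain distinct and only the fill value itself becomes `nan`.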

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  `nan` values appearing when saving and loading from `netCDF` due to encoding 1643408278

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);