issue_comments


5 rows where author_association = "NONE", issue = 942738904 and user = 1373406 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
879554360 https://github.com/pydata/xarray/issues/5597#issuecomment-879554360 https://api.github.com/repos/pydata/xarray/issues/5597 MDEyOklzc3VlQ29tbWVudDg3OTU1NDM2MA== ohsqueezy 1373406 2021-07-14T03:19:53Z 2021-07-14T03:19:53Z NONE

That explains it to me! Not sure if it's still useful but I exported the subset as a netCDF file.

```python
In [59]: packed_vals = xarray.open_dataset("packed_solar_data_subset.nc", mask_and_scale=False).ssrd.values

In [60]: packed_vals[0] * numpy.float32(e["scale_factor"]) + numpy.float32(e["add_offset"])
Out[60]: 2.0

In [61]: packed_vals[0] * numpy.float64(e["scale_factor"]) + numpy.float64(e["add_offset"])
Out[61]: 0.0
```

Hm actually I think converting the packed vals to 64 bit and then decoding does what I'm looking for

```python
In [62]: xarray.decode_cf(xarray.open_dataset("packed_solar_data_subset.nc", mask_and_scale=False).astype(numpy.float64)).ssrd.values
Out[62]:
array([      0.        ,       0.        ,       0.        ,
             0.        ,       0.        ,       0.        ,
             0.        ,       0.        ,       0.        ,
             0.        ,       0.        ,       0.        ,
         25651.61906215,  354743.1221522 , 1091757.933255  ,
       2170377.23235622, 3482363.69999847, 4704882.32554591,
       5689654.23783437, 6297785.304381  , 6534906.36839455,
       6543665.4578304 , 6543665.4578304 ])
```
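The difference between `Out[60]` and `Out[61]` can be reproduced without the file. The sketch below uses the `scale_factor` and `add_offset` reported for `ssrd` in this thread; the packed value for zero is an assumption, derived from those two attributes rather than read from the data.

```python
import numpy as np

# Encoding attributes as reported for the ssrd variable in this thread.
scale_factor = 625.6492454183389
add_offset = 20500023.17537729

# Assumption: the int16 that stands for 0.0 is round((0 - add_offset) / scale_factor).
packed_zero = np.int16(round((0.0 - add_offset) / scale_factor))

# Gap between neighbouring float32 values at the magnitude of add_offset.
gap32 = np.spacing(np.float32(add_offset))

# Decode in float32: both the product and the offset are rounded to that
# coarse grid before they cancel, so the rounding error survives.
decoded32 = packed_zero * np.float32(scale_factor) + np.float32(add_offset)

# Decode in float64: the grid near 2e7 is fine enough for the terms to cancel.
decoded64 = packed_zero * np.float64(scale_factor) + np.float64(add_offset)

print("packed value for zero:", packed_zero)
print("float32 spacing near add_offset:", gap32)
print("float32 decode:", decoded32, decoded32.dtype)
print("float64 decode:", decoded64, decoded64.dtype)
```

Under these assumptions the float32 decode comes out as 2.0, which is exactly one float32 step at this magnitude, while the float64 decode is zero to within rounding noise. That is consistent with casting the packed values to `float64` before calling `decode_cf`.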

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Decoding netCDF is giving incorrect values for a large file 942738904
879282134 https://github.com/pydata/xarray/issues/5597#issuecomment-879282134 https://api.github.com/repos/pydata/xarray/issues/5597 MDEyOklzc3VlQ29tbWVudDg3OTI4MjEzNA== ohsqueezy 1373406 2021-07-13T17:49:09Z 2021-07-13T17:49:09Z NONE

sure, no prob

```python
$ xarray.open_dataset("BIG_FILE_packed.nc").ssrd.encoding
{'source': 'BIG_FILE_packed.nc',
 'original_shape': (743, 1801, 3600),
 'dtype': dtype('int16'),
 'missing_value': -32767,
 '_FillValue': -32767,
 'scale_factor': 625.6492454183389,
 'add_offset': 20500023.17537729}
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Decoding netCDF is giving incorrect values for a large file 942738904
878920991 https://github.com/pydata/xarray/issues/5597#issuecomment-878920991 https://api.github.com/repos/pydata/xarray/issues/5597 MDEyOklzc3VlQ29tbWVudDg3ODkyMDk5MQ== ohsqueezy 1373406 2021-07-13T09:16:03Z 2021-07-13T09:16:03Z NONE

h5netcdf seems to be a separate issue for me, as it gives me the error `OSError: Unable to open file (file signature not found)`. I looked into it once though, and I think I might be able to fix that. I'll also see if I can build a small netCDF that has reproducible behavior!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Decoding netCDF is giving incorrect values for a large file 942738904
878910801 https://github.com/pydata/xarray/issues/5597#issuecomment-878910801 https://api.github.com/repos/pydata/xarray/issues/5597 MDEyOklzc3VlQ29tbWVudDg3ODkxMDgwMQ== ohsqueezy 1373406 2021-07-13T09:03:04Z 2021-07-13T09:03:04Z NONE

Thanks for your help!

I checked using the netCDF4 module, and the data is returned correctly

```python
$ d = netCDF4.Dataset("BIG_FILE_packed.nc")
$ d["ssrd"][d["time"][:] < d["time"][24], d["latitude"][:] == 44.8, d["longitude"][:] == 287.1]

masked_array(
  data=[[[      0.        ]], [[      0.        ]], [[      0.        ]],
        [[      0.        ]], [[      0.        ]], [[      0.        ]],
        [[      0.        ]], [[      0.        ]], [[      0.        ]],
        [[      0.        ]], [[      0.        ]], [[      0.        ]],
        [[  25651.61906215]], [[ 354743.1221522 ]], [[1091757.933255  ]],
        [[2170377.23235622]], [[3482363.69999847]], [[4704882.32554591]],
        [[5689654.23783437]], [[6297785.304381  ]], [[6534906.36839455]],
        [[6543665.4578304 ]], [[6543665.4578304 ]], [[6543665.4578304 ]]],
  mask=False,
  fill_value=1e+20)
```

I tried with `scipy` as the engine, and it still returns the 2 values

```python
$ xarray.open_dataset("BIG_FILE_packed.nc", engine="scipy").ssrd.sel(latitude=44.8, longitude=287.1, method="nearest").values[:23]
array([2.000000e+00, 2.000000e+00, 2.000000e+00, 2.000000e+00,
       2.000000e+00, 2.000000e+00, 2.000000e+00, 2.000000e+00,
       2.000000e+00, 2.000000e+00, 2.000000e+00, 2.000000e+00,
       2.565200e+04, 3.547440e+05, 1.091760e+06, 2.170378e+06,
       3.482364e+06, 4.704884e+06, 5.689655e+06, 6.297786e+06,
       6.534908e+06, 6.543667e+06, 6.543667e+06], dtype=float32)
```

I should mention that in another large packed dataset from this API, I have gotten the same error but with a very small decimal value in place of the zero instead of 2.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Decoding netCDF is giving incorrect values for a large file 942738904
878824626 https://github.com/pydata/xarray/issues/5597#issuecomment-878824626 https://api.github.com/repos/pydata/xarray/issues/5597 MDEyOklzc3VlQ29tbWVudDg3ODgyNDYyNg== ohsqueezy 1373406 2021-07-13T06:46:55Z 2021-07-13T06:46:55Z NONE

That example is actually a different file than the original. I unpacked the original file externally using `ncpdq -U BIG_FILE_packed.nc BIG_FILE_unpacked.nc` before opening it with xarray, so the decoding step is skipped and there aren't any 2 values generated. The data is correct using that method, so it's a possible workaround, but unpacking externally makes each file 4x larger.
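The 4x figure is what you would expect if the unpacked variable is stored as 8-byte doubles instead of 2-byte `int16`. A rough sketch of the arithmetic, using the `original_shape` reported elsewhere in this thread; the double-precision output type is an assumption, and on-disk compression and other variables are ignored:

```python
import math

# Shape of ssrd as reported by .encoding elsewhere in this thread.
original_shape = (743, 1801, 3600)
n_values = math.prod(original_shape)

packed_bytes = n_values * 2    # int16: 2 bytes per value
unpacked_bytes = n_values * 8  # assumption: unpacked to 8-byte doubles
ratio = unpacked_bytes / packed_bytes

print(f"values:   {n_values:,}")
print(f"packed:   {packed_bytes / 1e9:.1f} GB")
print(f"unpacked: {unpacked_bytes / 1e9:.1f} GB")
print(f"ratio:    {ratio:.0f}x")
```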

In all the examples, the data is from the same time and location, so the values should be the same apart from whatever is lost from compressing to int16 and decompressing. The output arrays are from selecting a single day (24 hours) at a single location from the dataset returned by open_dataset in the ipython interpreter.
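How much is "lost from compressing to int16" can be estimated from the encoding alone: packing rounds each value to the nearest multiple of `scale_factor`, so the round-trip error should stay within half a step. A minimal sketch, assuming the `scale_factor` and `add_offset` reported in this thread; the sample values are made up for illustration and are not taken from the file:

```python
import numpy as np

# Encoding attributes as reported for ssrd in this thread.
scale_factor = 625.6492454183389
add_offset = 20500023.17537729

# Hypothetical sample values (not from the dataset).
values = np.array([0.0, 1000.0, 123456.789, 6543000.0])

# Pack: shift, scale, round to the nearest integer, store as int16.
packed = np.round((values - add_offset) / scale_factor).astype(np.int16)

# Unpack in float64 so the decode itself adds no visible error.
unpacked = packed.astype(np.float64) * scale_factor + add_offset

max_error = np.max(np.abs(unpacked - values))
print("packed:", packed)
print("max round-trip error:", max_error)
print("half a quantization step:", scale_factor / 2)
```

With these attributes the step is about 626 in the variable's units, so differences of a few hundred between a packed and an unpacked copy of the same data are expected, whereas a constant 2 in place of 0 is not explained by packing.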

So actually there are three files I've tested with, all of which should have the same data (assuming the issue isn't with how the files are built, which could be the case): `BIG_FILE_packed.nc`, `BIG_FILE_unpacked.nc`, and `SMALL_FILE_packed.nc`. The only one that displays the issue is the first one.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Decoding netCDF is giving incorrect values for a large file 942738904


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);