
issues: 942738904


id: 942738904
node_id: MDU6SXNzdWU5NDI3Mzg5MDQ=
number: 5597
title: Decoding netCDF is giving incorrect values for a large file
user: 1373406
state: closed
locked: 0
comments: 15
created_at: 2021-07-13T04:55:03Z
updated_at: 2024-03-15T16:31:06Z
closed_at: 2024-03-15T16:31:06Z
author_association: NONE
body:

What happened:

A value of 0 is decoded as 2.

What you expected to happen:

Data packed as -32766 should decode to 0.

Minimal Complete Verifiable Example:

The first example uses the base file I've been working with, which is a 9GB packed netCDF. The first 12 values in this lookup should be 0 but are decoded as 2.

```python
$ xarray.open_dataset("BIG_FILE_packed.nc").ssrd.isel(time=slice(0, 23)).sel(latitude=44.8, longitude=287.1, method="nearest").values

array([2.000000e+00, 2.000000e+00, 2.000000e+00, 2.000000e+00,
       2.000000e+00, 2.000000e+00, 2.000000e+00, 2.000000e+00,
       2.000000e+00, 2.000000e+00, 2.000000e+00, 2.000000e+00,
       2.565200e+04, 3.547440e+05, 1.091760e+06, 2.170378e+06,
       3.482364e+06, 4.704884e+06, 5.689655e+06, 6.297786e+06,
       6.534908e+06, 6.543667e+06, 6.543667e+06], dtype=float32)
```

This second example shows that if the file is opened without automatic `mask_and_scale`, the raw value decodes to 0 when the scale factor and add offset are applied to it by hand in the interpreter.

```python
$ xarray.open_dataset("BIG_FILE_packed.nc", mask_and_scale=False).ssrd.isel(time=slice(0, 23)).sel(\
    latitude=44.8, longitude=287.1, method="nearest").values
array([-32766, -32766, -32766, -32766, -32766, -32766, -32766, -32766,
       -32766, -32766, -32766, -32766, -32725, -32199, -31021, -29297,
       -27200, -25246, -23672, -22700, -22321, -22307, -22307], dtype=int16)

$ xarray.open_dataset("BIG_FILE_packed.nc", mask_and_scale=False).ssrd.isel(time=slice(0, 23)).sel(\
    latitude=44.8, longitude=287.1, method="nearest").values[0] * \
    xarray.open_dataset("BIG_FILE_packed.nc").ssrd.encoding["scale_factor"] + xarray.open_dataset("BIG_FILE_packed.nc").ssrd.encoding["add_offset"]

0.0
```

When the netCDF is unpacked using the `nco` command line tool, the correct values are unpacked.

```python
$ xarray.open_dataset("BIG_FILE_unpacked.nc").ssrd.isel(time=slice(0, 23)).sel(latitude=44.8, longitude=287.1, method="nearest").values
array([      0.        ,       0.        ,       0.        ,
             0.        ,       0.        ,       0.        ,
             0.        ,       0.        ,       0.        ,
             0.        ,       0.        ,       0.        ,
         25651.61906215,  354743.1221522 , 1091757.933255  ,
       2170377.23235622, 3482363.69999847, 4704882.32554591,
       5689654.23783437, 6297785.304381  , 6534906.36839455,
       6543665.4578304 , 6543665.4578304 ])
```

Something else that may be relevant: another file containing the same packed data, but as a much smaller subset (1.7KB) of the big file, is unpacked correctly.

```python
$ xarray.open_dataset("SMALL_FILE_packed.nc").ssrd.isel(time=slice(0, 23)).sel(latitude=44.8, longitude=287.1, method="nearest").values
array([      0.  ,       0.  ,       0.  ,       0.  ,       0.  ,
             0.  ,       0.  ,       0.  ,       0.  ,       0.  ,
             0.  ,       0.  ,   25545.75,  354397.5 , 1091577.  ,
       2170077.  , 3482645.8 , 4704689.  , 5689927.  , 6297856.5 ,
       6535169.  , 6543583.  , 6543583.  ], dtype=float32)
```

For this to be a truly verifiable example, I can transfer the 9GB file to someone or give instructions on how to download it from the climate API I'm getting it from.

I'm not sure whether this is an issue with xarray, with the API, or with something I'm doing wrong. I've mostly been using an older version of xarray, but I also tested on the most recent version available on pip:

Output of `xr.show_versions()`:

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 (default, Jul 8 2021, 20:44:16) [GCC 5.4.0 20160609]
python-bits: 64
OS: Linux
OS-release: 4.4.0-200-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 0.18.2
pandas: 1.3.0
numpy: 1.21.0
scipy: None
netCDF4: 1.5.7
pydap: None
h5netcdf: 0.11.0
h5py: 3.3.0
Nio: None
zarr: None
cftime: 1.5.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: 0.9.9.0
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
pint: None
setuptools: 51.3.3
pip: 20.3.3
conda: None
pytest: None
IPython: 7.25.0
sphinx: None
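One check that does not require the 9GB file is to apply the unpacking formula (`packed * scale_factor + add_offset`) by hand in both single and double precision. The packed-file output above is reported as `float32`, while the `nco`-unpacked output is `float64`, so precision is one place the two paths could differ. The sketch below uses only the standard library and emulates `float32` by rounding after each step. The `scale_factor` and `add_offset` values are hypothetical, chosen only so that -32766 maps to 0; they are not the file's real attributes, and this is an assumption about where a discrepancy could arise, not a confirmed diagnosis.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def decode_float64(packed, scale_factor, add_offset):
    """Unpacking formula evaluated in double precision."""
    return packed * scale_factor + add_offset


def decode_float32(packed, scale_factor, add_offset):
    """Same formula, rounding to single precision after every step."""
    product = to_float32(to_float32(packed) * to_float32(scale_factor))
    return to_float32(product + to_float32(add_offset))


# Hypothetical attributes (NOT read from the real file): picked so that
# the packed value -32766 corresponds to an unpacked value of 0.
scale_factor = 625.6492
add_offset = 32766 * scale_factor

for packed in (-32766, -32725, -22307):
    d64 = decode_float64(packed, scale_factor, add_offset)
    d32 = decode_float32(packed, scale_factor, add_offset)
    print(packed, d64, d32, d32 - d64)
```

With an offset of this magnitude (about 2e7), adjacent `float32` values are 2 apart, so a single-precision result near zero can only land on an even number. Substituting the real `encoding["scale_factor"]` and `encoding["add_offset"]` from the file would show whether this accounts for the 0 versus 2 difference.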
reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5597/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
