
issues: 685613931


id: 685613931
node_id: MDU6SXNzdWU2ODU2MTM5MzE=
number: 4374
title: Xarray improper handling of Fillvalues when converting to Numpy array
user: 1478822
state: closed
locked: 0
comments: 2
created_at: 2020-08-25T16:11:53Z
updated_at: 2020-08-25T16:39:34Z
closed_at: 2020-08-25T16:39:34Z
author_association: NONE
assignee, milestone, active_lock_reason, draft, pull_request, performed_via_github_app: (empty)
body:

I use xarray to access geospatial data from an OPeNDAP server using `xrObject = xr.open_dataset(opendapUrl)`. The problem is that when I pass this data to NumPy (GDAL only accepts NumPy arrays), xarray passes NaNs rather than the proper fill value listed in the attributes. This looks like a bug to me. Apologies for not providing a code snippet, but here is the printout:

I'm running Python 3.8.3 on a CentOS 6 machine with xarray 0.16.0 and netCDF4 1.5.3.

** Results ***
```
Opening file: https://opendapserver/2001/001/3B-HHR-E.HDF5
Access of opendap metadata successful
Max: <xarray.DataArray 'precipitationCal' ()>
array(102.84, dtype=float32)
Min: <xarray.DataArray 'precipitationCal' ()>
array(0., dtype=float32)
<xarray.DataArray 'precipitationCal' (time: 1, lat: 1800, lon: 3600)>
array([[[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]]], dtype=float32)
Coordinates:
  * lat      (lat) float32 -89.95 -89.85 -89.75 -89.65 ... 89.75 89.85 89.95
  * lon      (lon) float32 -179.95 -179.85 -179.75 ... 179.75 179.85 179.95
  * time     (time) object 2001-01-01 00:00:00
Attributes:
    DimensionNames:     time,lat,lon
    Units:              mm/hr
    units:              mm/hr
    CodeMissingValue:   -9999.9
    origname:           precipitationCal
    fullnamepath:       /Grid/precipitationCal
    OldDimensionNames:  Dataset was transposed.

Print Numpy array after command: dataSlice = np.array(xrArray.squeeze()); print(dataSlice)

[[nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 ...
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]]
```

So the problem is that xarray converts the data to a NumPy array without restoring the fill values from the `CodeMissingValue` attribute. It is also not easy to print the raw xarray data the way a NumPy masked array does with its `filled()` method. I think the developers can agree that xarray should pass data to another structure as it received it, i.e., without replacing the fill values.
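
A minimal sketch of one possible workaround for the conversion described above: put the fill value back into the NaN positions before handing the array to GDAL. The helper name `restore_fill_value` and the small sample array are hypothetical and used only for illustration; the value `-9999.9` is taken from the `CodeMissingValue` attribute in the printout. It assumes every NaN in the array stands for a missing value, which may not hold for every dataset.

```python
import numpy as np


def restore_fill_value(values, fill_value):
    """Return a copy of `values` with every NaN replaced by `fill_value`.

    Hypothetical helper: assumes all NaNs mark missing data.
    """
    arr = np.asarray(values)
    # Cast the fill value to the array's dtype so the result keeps that dtype.
    fill = arr.dtype.type(fill_value)
    return np.where(np.isnan(arr), fill, arr)


# Made-up stand-in for np.array(xrArray.squeeze()) from the report above.
masked = np.array([[np.nan, 0.0], [102.84, np.nan]], dtype=np.float32)

# -9999.9 is the CodeMissingValue attribute shown in the printout.
data_slice = restore_fill_value(masked, -9999.9)
print(data_slice)
```

With xarray itself, `xrArray.fillna(-9999.9)` performs the same replacement before the NumPy conversion. `xr.open_dataset` also accepts `mask_and_scale=False`, which skips the NaN masking (and any scale/offset decoding) at load time; whether that is appropriate for this particular server's data is not confirmed here.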

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/4374/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
