issues: 685613931
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
685613931 | MDU6SXNzdWU2ODU2MTM5MzE= | 4374 | Xarray improper handling of Fillvalues when converting to Numpy array | 1478822 | closed | 0 | 2 | 2020-08-25T16:11:53Z | 2020-08-25T16:39:34Z | 2020-08-25T16:39:34Z | NONE | I use xarray to access geospatial data from an OPeNDAP server using: xrObject = xr.open_dataset(opendapUrl) Because GDAL only accepts NumPy arrays, I need to pass this data to NumPy, but xarray passes NaNs instead of the fill value listed in the attributes. This looks like a bug to me. Apologies for not providing a code snippet, but here is the printout. I'm running Python 3.8.3 on a CentOS 6 machine with xarray 0.16.0 and netcdf4 version 1.5.3. ** Results *** ``` Opening file: https://opendapserver/2001/001/3B-HHR-E.HDF5 Access of opendap metadata successful Max: <xarray.DataArray 'precipitationCal' ()> array(102.84, dtype=float32) Min: <xarray.DataArray 'precipitationCal' ()> array(0., dtype=float32) <xarray.DataArray 'precipitationCal' (time: 1, lat: 1800, lon: 3600)> array([[[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]]], dtype=float32) Coordinates: * lat (lat) float32 -89.95 -89.85 -89.75 -89.65 ... 89.75 89.85 89.95 * lon (lon) float32 -179.95 -179.85 -179.75 ... 179.75 179.85 179.95 * time (time) object 2001-01-01 00:00:00 Attributes: DimensionNames: time,lat,lon Units: mm/hr units: mm/hr CodeMissingValue: -9999.9 origname: precipitationCal fullnamepath: /Grid/precipitationCal OldDimensionNames: Dataset was transposed. Print Numpy array after command: dataSlice = np.array(xrArray.squeeze()); print(dataSlice) [[nan nan nan ... nan nan nan] [nan nan nan ... nan nan nan] [nan nan nan ... nan nan nan] ... [nan nan nan ... nan nan nan] [nan nan nan ... nan nan nan] [nan nan nan ... nan nan nan]] ``` So the problem is that xarray converts the data to a NumPy array without filling in the fill value from the CodeMissingValue attribute. It is also not easy to print the raw data in xarray the way a NumPy masked array allows with its filled() method. I think the developers can agree that xarray should pass data to another structure as it received it, i.e., without replacing the fill values. |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/4374/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
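
The conversion described in the issue body can be worked around by putting the fill value back before extracting the NumPy array. The following is a minimal sketch, not the reporter's code: the small in-memory `decoded` array is a hypothetical stand-in for the `precipitationCal` variable, and it assumes the fill value can be read from the `CodeMissingValue` attribute shown in the printout.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for the decoded 'precipitationCal' variable: missing
# cells are already NaN, as in the printout from the issue.
decoded = xr.DataArray(
    np.array([[[np.nan, 0.0], [102.84, np.nan]]], dtype=np.float32),
    dims=("time", "lat", "lon"),
    attrs={"CodeMissingValue": "-9999.9"},  # assumed to hold the fill value
    name="precipitationCal",
)

# Read the fill value from the attribute; float() accepts a string or a number.
fill_value = np.float32(float(decoded.attrs["CodeMissingValue"]))

# Replace NaN with the fill value, then extract a plain NumPy array
# (for example, to hand to GDAL).
data_slice = decoded.squeeze().fillna(fill_value).values

print(data_slice)
```

Another option, if the NaN substitution is not wanted at all, is to open the file with `xr.open_dataset(opendapUrl, mask_and_scale=False)`, which asks xarray to skip fill-value masking and scaling during decoding. Whether that is sufficient for this particular OPeNDAP dataset is not confirmed by the issue.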