issues: 1055867960

id: 1055867960
node_id: I_kwDOAMm_X84-70Q4
number: 5994
title: interpolate_na, x and y arrays must have at least 2 entries, warning instead of raise ?
user: 11155537
state: closed
locked: 0
comments: 2
created_at: 2021-11-17T08:54:43Z
updated_at: 2022-01-18T22:45:39Z
closed_at: 2022-01-18T22:45:39Z
author_association: NONE

Hi,

With interpolate_na, the following behavior looks non-ideal:

```python
import xarray as xr
import numpy as np

# Example of data showing interpolate_na bug
x = np.arange(5)
y = np.arange(5)

data = np.array([[np.nan, np.nan, np.nan, np.nan, np.nan],
                 [10, np.nan, np.nan, np.nan, np.nan],
                 [10, 20, np.nan, np.nan, np.nan],
                 [5, 3, np.nan, np.nan, 5],
                 [5, 5, 5, 5, 5]])
print(data[1][1])

ds = xr.Dataset(
    data_vars=dict(
        data=(["x", "y"], data),
    ),
    coords=dict(
        x=(["x"], x),
        y=(["y"], y),
    ),
)

# interpolate_na will raise an exception
ds = ds.interpolate_na(dim="y", method="nearest")

# If we replace the np.nan in [1][1] with a valid value, the exception is gone
ds["data"][1][1] = 15

# Data now looks like
# data = np.array([[np.nan, np.nan, np.nan, np.nan, np.nan],
#                  [10, 15, np.nan, np.nan, np.nan],
#                  [10, 20, np.nan, np.nan, np.nan],
#                  [5, 3, np.nan, np.nan, 5],
#                  [5, 5, 5, 5, 5]])

# working interpolate_na
ds = ds.interpolate_na(dim="y", method="nearest")
```
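To make it easier to see which rows are affected, here is a small NumPy-only sketch that counts the valid values in each row of the example array. The names `valid_per_row` and `single_value_rows` are only illustrative and are not part of xarray.

```python
import numpy as np

data = np.array([[np.nan, np.nan, np.nan, np.nan, np.nan],
                 [10, np.nan, np.nan, np.nan, np.nan],
                 [10, 20, np.nan, np.nan, np.nan],
                 [5, 3, np.nan, np.nan, 5],
                 [5, 5, 5, 5, 5]])

# Number of non-NaN values in each row, i.e. along the "y" dimension.
valid_per_row = np.count_nonzero(~np.isnan(data), axis=1)

# Rows holding exactly one valid value: not all-NaN, but too short
# for a 1-D interpolator that needs at least 2 points.
single_value_rows = np.flatnonzero(valid_per_row == 1)

print(valid_per_row)
print(single_value_rows)
```

Only row 1 has a single valid value, which is consistent with the exception disappearing once `[1][1]` is filled in.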

What can be expected

In the case where interpolate_na raises an exception, we could instead expect either a warning or that the problematic row is ignored (just as fully NaN rows are already ignored).
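As a rough illustration of the "warning instead of raise" option, the check could look something like the sketch below. `should_skip_slice` and its `min_valid` parameter are hypothetical names for this example, not existing xarray API, and the warning category is just one possible choice.

```python
import warnings

import numpy as np


def should_skip_slice(y, min_valid=2):
    """Return True when a 1-D slice should be left untouched.

    Hypothetical helper: all-NaN slices are skipped silently, while
    slices with too few valid values are skipped with a warning.
    """
    n_valid = int(np.count_nonzero(~np.isnan(y)))
    if n_valid == 0:
        # Fully NaN: nothing to interpolate from, skip without noise.
        return True
    if n_valid < min_valid:
        warnings.warn(
            f"only {n_valid} valid value(s) found; at least {min_valid} "
            "are needed to interpolate, leaving this slice unchanged",
            RuntimeWarning,
            stacklevel=2,
        )
        return True
    return False


row = np.array([10, np.nan, np.nan, np.nan, np.nan])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    skipped = should_skip_slice(row)
print(skipped, len(caught))
```

With this shape, the caller decides whether to interpolate, and the user still learns that a row was left as-is.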

See the code below, which is part of xarray. It is called when using interpolate_na with method="nearest" (at least), and it calls scipy's interp1d.

In xarray/core/missing.py:

```python
def func_interpolate_na(interpolator, y, x, **kwargs):
    """helper function to apply interpolation along 1 dimension"""
    # reversed arguments are so that attrs are preserved from da, not index
    # it would be nice if this wasn't necessary, works around:
    # "ValueError: assignment destination is read-only" in assignment below
    out = y.copy()

    nans = pd.isnull(y)
    nonans = ~nans

    # fast track for no-nans and all-nans cases
    n_nans = nans.sum()

    if n_nans == 0 or n_nans == len(y):
        return y

    f = interpolator(x[nonans], y[nonans], **kwargs)
    out[nans] = f(x[nans])
    return out
```

The part that interests me is `if n_nans == 0 or n_nans == len(y): return y`. What if the condition were replaced with `if n_nans == 0 or n_nans >= len(y) - 1:`?
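A minimal, self-contained sketch of how the proposed `n_nans >= len(y) - 1` check would behave is given below. It does not use xarray or scipy: `nearest_interpolator` is a hypothetical NumPy stand-in that only mimics the "at least 2 entries" requirement, and `interpolate_na_1d` is an illustrative simplification of the helper quoted above.

```python
import numpy as np


def nearest_interpolator(x_known, y_known):
    """Hypothetical stand-in for a 1-D nearest-neighbour interpolator."""
    if len(x_known) < 2:
        raise ValueError("x and y arrays must have at least 2 entries")

    def f(x_new):
        # For each requested x, pick the index of the closest known x.
        dist = np.abs(np.asarray(x_new)[:, None] - x_known[None, :])
        return y_known[dist.argmin(axis=1)]

    return f


def interpolate_na_1d(y, x, interpolator=nearest_interpolator):
    """Fill NaNs along one dimension, using the proposed fast-track check."""
    out = y.copy()
    nans = np.isnan(y)
    n_nans = nans.sum()

    # Fast track: no NaNs, all NaNs, or (proposed) a single valid value.
    if n_nans == 0 or n_nans >= len(y) - 1:
        return y

    f = interpolator(x[~nans], y[~nans])
    out[nans] = f(x[nans])
    return out


x = np.arange(5, dtype=float)
single = interpolate_na_1d(np.array([10, np.nan, np.nan, np.nan, np.nan]), x)
filled = interpolate_na_1d(np.array([5, 3, np.nan, np.nan, 5]), x)
print(single)
print(filled)
```

Under these assumptions, the single-value row is returned unchanged instead of raising, while rows with two or more valid values are still interpolated.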

Environment:

Output of `xr.show_versions()`:

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.10 (default, Jun 4 2021, 14:48:32) [GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.15.0-142-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: ('fr_FR', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 0.19.0
pandas: 1.2.5
numpy: 1.20.0
scipy: 1.7.1
netCDF4: 1.5.7
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: None
cftime: 1.5.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.0
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.09.1
distributed: 2021.09.1
matplotlib: 3.1.3
cartopy: 0.18.0
seaborn: None
numbagg: None
pint: None
setuptools: 52.0.0.post20210125
pip: 21.1.3
conda: 4.10.3
pytest: None
IPython: 7.28.0
sphinx: None

reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5994/reactions",
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 1
}
state_reason: completed
repo: 13221727
type: issue