Issue #6782: xr.apply_ufunc: apply_variable_ufunc can get called again because of existing reference

State: closed · opened by user 2129135 (author association: NONE) · 2 comments
Created: 2022-07-13T15:58:09Z · Closed: 2022-07-13T16:03:18Z · Updated: 2022-07-13T16:17:42Z

What happened?

I wrapped np.fft.fft with xr.apply_ufunc and wanted to add coordinates to the result. For that I used np.fft.fftfreq, passing it a sample spacing derived from the original coordinates of the array that went into xr.apply_ufunc. This derived value, which I expected to be a plain float, was in fact still an xarray object (a 0-dimensional DataArray).

When I then used this value together with the result of the xr.apply_ufunc call, apply_variable_ufunc somehow got called again and raised the ValueError shown in the log output below.
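A minimal sketch of what I think triggers this, separate from my fft wrapper (the names below are illustrative, not from my actual code): np.fft.fftfreq internally multiplies its integer index array by the reciprocal spacing, and when that spacing is a 0-dimensional DataArray the multiplication dispatches to DataArray.__array_ufunc__, which re-enters apply_ufunc. With the xarray version reported below this should raise the same ValueError as in the traceback:

```Python
import numpy as np
import xarray as xr

# A 0-dimensional DataArray, e.g. the mean coordinate spacing along "time".
coord = xr.DataArray(np.arange(10.0), dims=["time"])
delta = coord.diff("time").mean()

# Inside fftfreq, NumPy evaluates `results * val`, where `val` is derived from
# `delta`; because `delta` is still a DataArray, the multiply ufunc dispatches
# to DataArray.__array_ufunc__ and re-enters apply_ufunc / apply_variable_ufunc.
freqs = np.fft.fftfreq(100, delta)
```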

What did you expect to happen?

It should not call apply_variable_ufunc again.

Minimal Complete Verifiable Example

```Python
import numpy as np
import xarray as xr
# Needed for the dask code path below; the import was missing from the
# original example (only reached for chunked input).
import dask.array


def fft(
    a: xr.DataArray,
    dim: str,
    newdim: str,
    n: int = None,
    norm: str = None,
    keep_attrs=None,
) -> xr.DataArray:

    with_dask = a.chunks is not None
    func = dask.array.fft.fft if with_dask else np.fft.fft

    kwargs = {
        "n": n,
    }

    if with_dask and norm is not None:
        raise ValueError("norm is not supported with dask arrays")
    else:
        kwargs["norm"] = norm

    result = xr.apply_ufunc(
        func,
        a,
        kwargs=kwargs,
        input_core_dims=((dim,),),
        exclude_dims=set([dim]),
        output_core_dims=((newdim,),),
        vectorize=False,
        # dask="allowed" if with_dask else "forbidden",
        # output_dtypes=[np.complex128],
        # keep_attrs=_keep_attrs(keep_attrs),
    )

    # DFT size. _get_length is a helper from my library (not shown here);
    # it is not reached in this example because n is given.
    n = n if n is not None else _get_length(a, dim)

    # Coordinate spacing along `dim`. This is still a 0-dimensional DataArray,
    # not a plain float -- converting it with float() fixes the issue!
    delta = a.coords[dim].diff(dim=dim).mean(dim=dim)
    # print(delta)

    # Coordinates for `newdim`; passing the DataArray `delta` to np.fft.fftfreq
    # is what triggers the second call to apply_variable_ufunc.
    result[newdim] = np.fft.fftfreq(n, delta)
    return result


duration = 10.0
fs = 8000.0
nsamples = int(fs * duration)
f = 100.0
A = 2.0
n = 100
dim = "time"
newdim = "frequency"
signal = A * np.sin(2.0 * np.pi * f * np.arange(nsamples) / fs)
signal = xr.DataArray(signal, dims=[dim], coords={dim: np.arange(nsamples)}).expand_dims("channel")
result = fft(signal, n=n, dim=dim, newdim=newdim)
```
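As the "fix the issue" comment in the example above hints, the failure goes away if the spacing is turned into a plain Python scalar before it reaches NumPy. A sketch of that workaround, with illustrative names rather than my actual code:

```Python
import numpy as np
import xarray as xr

coord = xr.DataArray(np.arange(10.0), dims=["time"])

# Extract a plain Python float with float() (or .item()) so np.fft.fftfreq
# never sees an xarray object and no second dispatch through apply_ufunc
# can happen.
delta = float(coord.diff("time").mean())
freqs = np.fft.fftfreq(100, delta)  # plain numpy array, no ValueError
```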

MVCE confirmation

  • [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [ ] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [ ] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [ ] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python
================================================================ FAILURES =================================================================
________________________________________________________________ test_fft ________________________________________________________________

def test_fft():
    duration = 10.0
    fs = 8000.0
    nsamples = int(fs * duration)
    f = 100.0
    A = 2.0
    n = 100
    dim = "time"
    newdim = "frequency"
    signal = A * np.sin(2.0 * np.pi * f * np.arange(nsamples) / fs)
    signal = xr.DataArray(signal, dims=[dim], coords={dim: np.arange(nsamples)}).expand_dims("channel")
  result = xarray_scipy.signal.fft(signal, n=n, dim=dim, newdim=newdim)

tests/test_signal.py:323:


xarray_scipy/signal.py:187: in fft
    result[newdim] = np.fft.fftfreq(n, delta)
/nix/store/9mr36pjhxnlvdy00njp8dj22rd7n3yg3-python3-3.10.5-env/lib/python3.10/site-packages/numpy/fft/helper.py:169: in fftfreq
    return results * val
../xarray/xarray/core/arithmetic.py:70: in __array_ufunc__
    return apply_ufunc(
../xarray/xarray/core/computation.py:1132: in apply_ufunc
    return apply_dataarray_vfunc(
../xarray/xarray/core/computation.py:271: in apply_dataarray_vfunc
    result_var = func(*data_vars)


func = <ufunc 'multiply'>, signature = _UFuncSignature([(), ()], [()]), exclude_dims = frozenset(), dask = 'allowed', output_dtypes = None

def apply_variable_ufunc(
    func,
    *args,
    signature,
    exclude_dims=frozenset(),
    dask="forbidden",
    output_dtypes=None,
    vectorize=False,
    keep_attrs=False,
    dask_gufunc_kwargs=None,
):
    """Apply a ndarray level function over Variable and/or ndarray objects."""
    from .variable import Variable, as_compatible_data

    first_obj = _first_of_type(args, Variable)

    dim_sizes = unified_dim_sizes(
        (a for a in args if hasattr(a, "dims")), exclude_dims=exclude_dims
    )
    broadcast_dims = tuple(
        dim for dim in dim_sizes if dim not in signature.all_core_dims
    )
    output_dims = [broadcast_dims + out for out in signature.output_core_dims]

    input_data = [
        broadcast_compat_data(arg, broadcast_dims, core_dims)
        if isinstance(arg, Variable)
        else arg
        for arg, core_dims in zip(args, signature.input_core_dims)
    ]

    if any(is_duck_dask_array(array) for array in input_data):
        if dask == "forbidden":
            raise ValueError(
                "apply_ufunc encountered a dask array on an "
                "argument, but handling for dask arrays has not "
                "been enabled. Either set the ``dask`` argument "
                "or load your data into memory first with "
                "``.load()`` or ``.compute()``"
            )
        elif dask == "parallelized":
            numpy_func = func

            if dask_gufunc_kwargs is None:
                dask_gufunc_kwargs = {}
            else:
                dask_gufunc_kwargs = dask_gufunc_kwargs.copy()

            allow_rechunk = dask_gufunc_kwargs.get("allow_rechunk", None)
            if allow_rechunk is None:
                for n, (data, core_dims) in enumerate(
                    zip(input_data, signature.input_core_dims)
                ):
                    if is_duck_dask_array(data):
                        # core dimensions cannot span multiple chunks
                        for axis, dim in enumerate(core_dims, start=-len(core_dims)):
                            if len(data.chunks[axis]) != 1:
                                raise ValueError(
                                    f"dimension {dim} on {n}th function argument to "
                                    "apply_ufunc with dask='parallelized' consists of "
                                    "multiple chunks, but is also a core dimension. To "
                                    "fix, either rechunk into a single dask array chunk along "
                                    f"this dimension, i.e., ``.chunk({dim}: -1)``, or "
                                    "pass ``allow_rechunk=True`` in ``dask_gufunc_kwargs`` "
                                    "but beware that this may significantly increase memory usage."
                                )
                dask_gufunc_kwargs["allow_rechunk"] = True

            output_sizes = dask_gufunc_kwargs.pop("output_sizes", {})
            if output_sizes:
                output_sizes_renamed = {}
                for key, value in output_sizes.items():
                    if key not in signature.all_output_core_dims:
                        raise ValueError(
                            f"dimension '{key}' in 'output_sizes' must correspond to output_core_dims"
                        )
                    output_sizes_renamed[signature.dims_map[key]] = value
                dask_gufunc_kwargs["output_sizes"] = output_sizes_renamed

            for key in signature.all_output_core_dims:
                if key not in signature.all_input_core_dims and key not in output_sizes:
                    raise ValueError(
                        f"dimension '{key}' in 'output_core_dims' needs corresponding (dim, size) in 'output_sizes'"
                    )

            def func(*arrays):
                import dask.array as da

                res = da.apply_gufunc(
                    numpy_func,
                    signature.to_gufunc_string(exclude_dims),
                    *arrays,
                    vectorize=vectorize,
                    output_dtypes=output_dtypes,
                    **dask_gufunc_kwargs,
                )

                # todo: covers for https://github.com/dask/dask/pull/6207
                #  remove when minimal dask version >= 2.17.0
                from dask import __version__ as dask_version

                if LooseVersion(dask_version) < LooseVersion("2.17.0"):
                    if signature.num_outputs > 1:
                        res = tuple(res)

                return res

        elif dask == "allowed":
            pass
        else:
            raise ValueError(
                "unknown setting for dask array handling in "
                "apply_ufunc: {}".format(dask)
            )
    else:
        if vectorize:
            func = _vectorize(
                func, signature, output_dtypes=output_dtypes, exclude_dims=exclude_dims
            )

    result_data = func(*input_data)

    if signature.num_outputs == 1:
        result_data = (result_data,)
    elif (
        not isinstance(result_data, tuple) or len(result_data) != signature.num_outputs
    ):
        raise ValueError(
            "applied function does not have the number of "
            "outputs specified in the ufunc signature. "
            "Result is not a tuple of {} elements: {!r}".format(
                signature.num_outputs, result_data
            )
        )

    output = []
    for dims, data in zip(output_dims, result_data):
        data = as_compatible_data(data)
        if data.ndim != len(dims):
          raise ValueError(
                "applied function returned data with unexpected "
                f"number of dimensions. Received {data.ndim} dimension(s) but "
                f"expected {len(dims)} dimensions with names: {dims!r}"
            )

E ValueError: applied function returned data with unexpected number of dimensions. Received 1 dimension(s) but expected 0 dimensions with names: ()

../xarray/xarray/core/computation.py:746: ValueError
```

Anything else we need to know?

No response

Environment

In [2]: xr.show_versions()
/nix/store/9mr36pjhxnlvdy00njp8dj22rd7n3yg3-python3-3.10.5-env/lib/python3.10/site-packages/_distutils_hack/__init__.py:30: UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.5 (main, Jun 6 2022, 12:05:50) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 5.15.43
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None

xarray: 2022.3.0
pandas: 1.4.2
numpy: 1.21.6
scipy: 1.8.0
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.05.2
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.5.0
cupy: None
pint: None
sparse: None
setuptools: 61.2.0.post0
pip: None
conda: None
pytest: 7.1.1
IPython: 8.4.0
sphinx: None
