html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3574#issuecomment-567082163,https://api.github.com/repos/pydata/xarray/issues/3574,567082163,MDEyOklzc3VlQ29tbWVudDU2NzA4MjE2Mw==,941907,2019-12-18T15:32:38Z,2019-12-18T15:32:38Z,NONE,"> `meta = np.ndarray if vectorize is True else None` if the user doesn't explicitly provide `meta`.
Yes, sorry, written this way I now see what you meant and that will likely work indeed.
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-567077240,https://api.github.com/repos/pydata/xarray/issues/3574,567077240,MDEyOklzc3VlQ29tbWVudDU2NzA3NzI0MA==,2448579,2019-12-18T15:21:19Z,2019-12-18T15:21:19Z,MEMBER,Right the xarray solution is to set `meta = np.ndarray if vectorize is True else None` if the user doesn't explicitly provide `meta`. Or am I missing something? ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-566938638,https://api.github.com/repos/pydata/xarray/issues/3574,566938638,MDEyOklzc3VlQ29tbWVudDU2NjkzODYzOA==,941907,2019-12-18T08:55:29Z,2019-12-18T08:55:29Z,NONE,"> `meta` should be passed to `blockwise` through `_apply_blockwise` with default `None` (I think) and `np.ndarray` if `vectorize is True`. You'll have to pass the `vectorize` kwarg down to this level I think.
I'm afraid that passing `meta=None` will not help as explained in
https://github.com/dask/dask/issues/5642 and seen around [this line](https://github.com/dask/dask/blob/3960c6518318f2417658c2fc47cd5b5ece726f8b/dask/array/blockwise.py#L230) because in that case `compute_meta` will be called which might fail with a `np.vectorize`-wrapped function.
I believe a better solution would be to address https://github.com/dask/dask/issues/5642 so that meta isn't computed even though we already provide an output `dtype`.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-566640524,https://api.github.com/repos/pydata/xarray/issues/3574,566640524,MDEyOklzc3VlQ29tbWVudDU2NjY0MDUyNA==,2448579,2019-12-17T16:29:35Z,2019-12-17T16:29:35Z,MEMBER,"`meta` should be passed to `blockwise` through `_apply_blockwise` with default `None` (I think) and `np.ndarray` if `vectorize is True`. You'll have to pass the `vectorize` kwarg down to this level I think.
https://github.com/pydata/xarray/blob/6ad59b93f814b48053b1a9eea61d7c43517105cb/xarray/core/computation.py#L579-L593","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-566637471,https://api.github.com/repos/pydata/xarray/issues/3574,566637471,MDEyOklzc3VlQ29tbWVudDU2NjYzNzQ3MQ==,14314623,2019-12-17T16:22:35Z,2019-12-17T16:22:35Z,CONTRIBUTOR,"I can give it a shot if you could point me to the appropriate place, since I have never messed with the dask internals of xarray. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-565194778,https://api.github.com/repos/pydata/xarray/issues/3574,565194778,MDEyOklzc3VlQ29tbWVudDU2NTE5NDc3OA==,2448579,2019-12-12T21:28:39Z,2019-12-12T21:28:39Z,MEMBER,@shoyer's option 1 should be a relatively simple xarray PR if one of you is up for it.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-565186199,https://api.github.com/repos/pydata/xarray/issues/3574,565186199,MDEyOklzc3VlQ29tbWVudDU2NTE4NjE5OQ==,941907,2019-12-12T21:04:33Z,2019-12-12T21:04:33Z,NONE,"> The problem is that Dask, as of version 2.0, calls functions applied to dask arrays with size zero inputs, to figure out the output array type, e.g., is the output a dense numpy.ndarray or a sparse array?
Yes, now I recall that this was the issue, yeah. It doesn't even depend on your actual data really.
A possible option 3 is to address https://github.com/dask/dask/issues/5642 directly (I haven't found time to do a PR yet). Essentially, from the code described in that issue, I have the feeling that if a `dtype` is passed (as `apply_ufunc` does), then `meta` should not need to be calculated.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-565107345,https://api.github.com/repos/pydata/xarray/issues/3574,565107345,MDEyOklzc3VlQ29tbWVudDU2NTEwNzM0NQ==,1217238,2019-12-12T17:33:43Z,2019-12-12T17:33:43Z,MEMBER,"The problem is that Dask, as of version 2.0, calls functions applied to dask arrays with size zero inputs, to figure out the output array type, e.g., is the output a dense numpy.ndarray or a sparse array?
Unfortunately, `numpy.vectorize` doesn't know how large of a size 0 array to make, because it doesn't have anything like the `output_sizes` argument.
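The failure can be reproduced with NumPy alone. This is a minimal sketch, assuming a hypothetical `_pair` function and the signature `(t),(t)->(p)` as stand-ins for what `apply_ufunc` builds when `vectorize=True` and `output_dtypes` is given:

```python
import numpy as np


def _pair(aa, bb):
    # Hypothetical stand-in for the user function: it reduces the core
    # dimension t and returns a new core dimension p of length 2.
    return np.array([aa.sum(), bb.sum()])


vfunc = np.vectorize(_pair, otypes=[float], signature='(t),(t)->(p)')

# With real data the size of p is learned from the first call to _pair.
out = vfunc(np.ones((3, 13)), np.ones((3, 13)))
print(out.shape)

# With size 0 inputs _pair is never called, so the size of p stays unknown.
message = None
try:
    vfunc(np.empty((0, 0)), np.empty((0, 0)))
except ValueError as err:
    message = str(err)
print(message)
```

The second call is the kind of call Dask makes when it tries to infer `meta` from size 0 inputs.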
For xarray, we have a couple of options:
1. we can safely assume that if the applied function is a `np.vectorize`, then it should pass `meta=np.ndarray` into the relevant dask functions (e.g., `dask.array.blockwise`). This should avoid the need to evaluate with size 0 arrays.
2. we could add an `output_sizes` argument to `np.vectorize` either upstream in NumPy or into a wrapper in Xarray.
(1) is probably easiest here.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-565057853,https://api.github.com/repos/pydata/xarray/issues/3574,565057853,MDEyOklzc3VlQ29tbWVudDU2NTA1Nzg1Mw==,14314623,2019-12-12T15:35:10Z,2019-12-12T15:35:10Z,CONTRIBUTOR,"This is the chunk setup

Might this be a problem resulting from `numpy.vectorize`?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-564934693,https://api.github.com/repos/pydata/xarray/issues/3574,564934693,MDEyOklzc3VlQ29tbWVudDU2NDkzNDY5Mw==,941907,2019-12-12T09:57:18Z,2019-12-12T09:57:28Z,NONE,Sounds similar. But I'm not sure why you get the 0d issue when even your chunks don't (from a quick reading) seem to have a 0 size in any of the dimensions. Could you please show us the resulting chunk setup?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-564843368,https://api.github.com/repos/pydata/xarray/issues/3574,564843368,MDEyOklzc3VlQ29tbWVudDU2NDg0MzM2OA==,14314623,2019-12-12T04:22:02Z,2019-12-12T05:32:14Z,CONTRIBUTOR,"I am having a similar problem. This impacts some of my [frequently used code to compute correlations](https://github.com/jbusecke/xarrayutils/blob/7b09a2bdc70f035e290e75419c2d025b7267adf4/xarrayutils/utils.py#L52).
Here is a simplified example that used to work with older dependencies:
```
import xarray as xr
import numpy as np
from scipy.stats import linregress
def _ufunc(aa, bb):
    # Fit a line along the core dimension and return slope and intercept.
    out = linregress(aa, bb)
    return np.array([out.slope, out.intercept])

def wrapper(a, b, dim='time'):
    # Apply _ufunc along `dim`, adding a new length-2 ""parameter"" dimension.
    return xr.apply_ufunc(
        _ufunc, a, b,
        input_core_dims=[[dim], [dim]],
        output_core_dims=[[""parameter""]],
        vectorize=True,
        dask=""parallelized"",
        output_dtypes=[a.dtype],
        output_sizes={""parameter"": 2},
    )
```
This works when passing numpy arrays:
```
a = xr.DataArray(np.random.rand(3, 13, 5), dims=['x', 'time', 'y'])
b = xr.DataArray(np.random.rand(3, 5, 13), dims=['x','y', 'time'])
wrapper(a,b)
```
array([[[ 0.09958247, 0.36831431],
[-0.54445474, 0.66997513],
[-0.22894182, 0.65433402],
[ 0.38536482, 0.20656073],
[ 0.25083224, 0.46955618]],
[[-0.21684891, 0.55521932],
[ 0.51621616, 0.20869272],
[-0.1502755 , 0.55526262],
[-0.25452988, 0.60823538],
[-0.20571622, 0.56950115]],
[[-0.22810421, 0.50423622],
[ 0.33002345, 0.36121484],
[ 0.37744774, 0.33081058],
[-0.10825559, 0.53772493],
[-0.12576656, 0.51722167]]])
Dimensions without coordinates: x, y, parameter
But when I convert both arrays to dask arrays, I get the same error as @smartass101.
```
wrapper(a.chunk({'x':2, 'time':-1}),b.chunk({'x':2, 'time':-1}))
```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in
1 a = xr.DataArray(np.random.rand(3, 13, 5), dims=['x', 'time', 'y'])
2 b = xr.DataArray(np.random.rand(3, 5, 13), dims=['x','y', 'time'])
----> 3 wrapper(a.chunk({'x':2, 'time':-1}),b.chunk({'x':2, 'time':-1}))
in wrapper(a, b, dim)
16 dask=""parallelized"",
17 output_dtypes=[a.dtype],
---> 18 output_sizes={""parameter"": 2},)
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, *args)
1042 join=join,
1043 exclude_dims=exclude_dims,
-> 1044 keep_attrs=keep_attrs
1045 )
1046 elif any(isinstance(a, Variable) for a in args):
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
232
233 data_vars = [getattr(a, ""variable"", a) for a in args]
--> 234 result_var = func(*data_vars)
235
236 if signature.num_outputs > 1:
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, output_sizes, keep_attrs, *args)
601 ""apply_ufunc: {}"".format(dask)
602 )
--> 603 result_data = func(*input_data)
604
605 if signature.num_outputs == 1:
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in func(*arrays)
591 signature,
592 output_dtypes,
--> 593 output_sizes,
594 )
595
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in _apply_blockwise(func, args, input_dims, output_dims, signature, output_dtypes, output_sizes)
721 dtype=dtype,
722 concatenate=True,
--> 723 new_axes=output_sizes
724 )
725
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/dask/array/blockwise.py in blockwise(func, out_ind, name, token, dtype, adjust_chunks, new_axes, align_arrays, concatenate, meta, *args, **kwargs)
231 from .utils import compute_meta
232
--> 233 meta = compute_meta(func, dtype, *args[::2], **kwargs)
234 if meta is not None:
235 return Array(graph, out, chunks, meta=meta)
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/dask/array/utils.py in compute_meta(func, _dtype, *args, **kwargs)
119 # with np.vectorize, such as dask.array.routines._isnonzero_vec().
120 if isinstance(func, np.vectorize):
--> 121 meta = func(*args_meta)
122 else:
123 try:
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in __call__(self, *args, **kwargs)
2089 vargs.extend([kwargs[_n] for _n in names])
2090
-> 2091 return self._vectorize_call(func=func, args=vargs)
2092
2093 def _get_ufunc_and_otypes(self, func, args):
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call(self, func, args)
2155 """"""Vectorized call to `func` over positional `args`.""""""
2156 if self.signature is not None:
-> 2157 res = self._vectorize_call_with_signature(func, args)
2158 elif not args:
2159 res = func()
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call_with_signature(self, func, args)
2229 for dims in output_core_dims
2230 for dim in dims):
-> 2231 raise ValueError('cannot call `vectorize` with a signature '
2232 'including new output dimensions on size 0 '
2233 'inputs')
ValueError: cannot call `vectorize` with a signature including new output dimensions on size 0 inputs
This used to work like a charm. However, I was sloppy in testing this functionality (a good reminder to always write tests immediately 🙄), and I have not been able to determine a combination of dependencies that works. I am still experimenting and will report back.
Could this behaviour be a bug introduced in dask at some point (as indicated by @smartass101 above)? cc'ing @dcherian @shoyer @mrocklin
EDIT: I can confirm that it seems to be a dask issue. If I restrict my dask version to `<2.0`, my tests (very similar to the above example) work.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910
https://github.com/pydata/xarray/issues/3574#issuecomment-558616375,https://api.github.com/repos/pydata/xarray/issues/3574,558616375,MDEyOklzc3VlQ29tbWVudDU1ODYxNjM3NQ==,941907,2019-11-26T12:56:47Z,2019-11-26T12:56:47Z,NONE,"Another approach would be to bypass `compute_meta` in `dask.blockwise` if `dtype` is provided which seems to be hinted at here
https://github.com/dask/dask/blob/3960c6518318f2417658c2fc47cd5b5ece726f8b/dask/array/blockwise.py#L234
Perhaps this is an oversight in `dask`, what do you think?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,528701910