issue_comments
3 rows where issue = 528701910 and user = 14314623 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
566637471 | https://github.com/pydata/xarray/issues/3574#issuecomment-566637471 | https://api.github.com/repos/pydata/xarray/issues/3574 | MDEyOklzc3VlQ29tbWVudDU2NjYzNzQ3MQ== | jbusecke 14314623 | 2019-12-17T16:22:35Z | 2019-12-17T16:22:35Z | CONTRIBUTOR | I can give it a shot if you could point me to the appropriate place, since I have never messed with the dask internals of xarray. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910 | |
565057853 | https://github.com/pydata/xarray/issues/3574#issuecomment-565057853 | https://api.github.com/repos/pydata/xarray/issues/3574 | MDEyOklzc3VlQ29tbWVudDU2NTA1Nzg1Mw== | jbusecke 14314623 | 2019-12-12T15:35:10Z | 2019-12-12T15:35:10Z | CONTRIBUTOR | This is the chunk setup
Might this be a problem resulting from |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910 | |
564843368 | https://github.com/pydata/xarray/issues/3574#issuecomment-564843368 | https://api.github.com/repos/pydata/xarray/issues/3574 | MDEyOklzc3VlQ29tbWVudDU2NDg0MzM2OA== | jbusecke 14314623 | 2019-12-12T04:22:02Z | 2019-12-12T05:32:14Z | CONTRIBUTOR | I am having a similar problem. This impacts some of my frequently used code to compute correlations. Here is a simplified example that used to work with older dependencies:

```
import xarray as xr
import numpy as np
from scipy.stats import linregress

def _ufunc(aa, bb):
    out = linregress(aa, bb)
    return np.array([out.slope, out.intercept])

def wrapper(a, b, dim='time'):
    return xr.apply_ufunc(
        _ufunc, a, b,
        input_core_dims=[[dim], [dim]],
        output_core_dims=[["parameter"]],
        vectorize=True,
        dask="parallelized",
        output_dtypes=[a.dtype],
        output_sizes={"parameter": 2},
    )
```

This works when passing numpy arrays:
```
<xarray.DataArray (x: 3, y: 5, parameter: 2)>
array([[[ 0.09958247, 0.36831431],
[-0.54445474, 0.66997513],
[-0.22894182, 0.65433402],
[ 0.38536482, 0.20656073],
[ 0.25083224, 0.46955618]],
[[-0.21684891, 0.55521932],
[ 0.51621616, 0.20869272],
[-0.1502755 , 0.55526262],
[-0.25452988, 0.60823538],
[-0.20571622, 0.56950115]],
[[-0.22810421, 0.50423622],
[ 0.33002345, 0.36121484],
[ 0.37744774, 0.33081058],
[-0.10825559, 0.53772493],
[-0.12576656, 0.51722167]]])
Dimensions without coordinates: x, y, parameter
```
But when I convert both arrays to dask arrays, I get the same error as @smartass101.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-303b400356e2> in <module>
1 a = xr.DataArray(np.random.rand(3, 13, 5), dims=['x', 'time', 'y'])
2 b = xr.DataArray(np.random.rand(3, 5, 13), dims=['x','y', 'time'])
----> 3 wrapper(a.chunk({'x':2, 'time':-1}),b.chunk({'x':2, 'time':-1}))
<ipython-input-1-4094fd485c95> in wrapper(a, b, dim)
16 dask="parallelized",
17 output_dtypes=[a.dtype],
---> 18 output_sizes={"parameter": 2},)
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, *args)
1042 join=join,
1043 exclude_dims=exclude_dims,
-> 1044 keep_attrs=keep_attrs
1045 )
1046 elif any(isinstance(a, Variable) for a in args):
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
232
233 data_vars = [getattr(a, "variable", a) for a in args]
--> 234 result_var = func(*data_vars)
235
236 if signature.num_outputs > 1:
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, output_sizes, keep_attrs, *args)
601 "apply_ufunc: {}".format(dask)
602 )
--> 603 result_data = func(*input_data)
604
605 if signature.num_outputs == 1:
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in func(*arrays)
591 signature,
592 output_dtypes,
--> 593 output_sizes,
594 )
595
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in _apply_blockwise(func, args, input_dims, output_dims, signature, output_dtypes, output_sizes)
721 dtype=dtype,
722 concatenate=True,
--> 723 new_axes=output_sizes
724 )
725
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/dask/array/blockwise.py in blockwise(func, out_ind, name, token, dtype, adjust_chunks, new_axes, align_arrays, concatenate, meta, *args, **kwargs)
231 from .utils import compute_meta
232
--> 233 meta = compute_meta(func, dtype, *args[::2], **kwargs)
234 if meta is not None:
235 return Array(graph, out, chunks, meta=meta)
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/dask/array/utils.py in compute_meta(func, _dtype, *args, **kwargs)
119 # with np.vectorize, such as dask.array.routines._isnonzero_vec().
120 if isinstance(func, np.vectorize):
--> 121 meta = func(*args_meta)
122 else:
123 try:
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in __call__(self, *args, **kwargs)
2089 vargs.extend([kwargs[_n] for _n in names])
2090
-> 2091 return self._vectorize_call(func=func, args=vargs)
2092
2093 def _get_ufunc_and_otypes(self, func, args):
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call(self, func, args)
2155 """Vectorized call to `func` over positional `args`."""
2156 if self.signature is not None:
-> 2157 res = self._vectorize_call_with_signature(func, args)
2158 elif not args:
2159 res = func()
~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call_with_signature(self, func, args)
2229 for dims in output_core_dims
2230 for dim in dims):
-> 2231 raise ValueError('cannot call `vectorize` with a signature '
2232 'including new output dimensions on size 0 '
2233 'inputs')
ValueError: cannot call `vectorize` with a signature including new output dimensions on size 0 inputs
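For what it's worth, the failure at the bottom of the traceback can be reproduced without dask or xarray at all. This is a minimal sketch (the `_ufunc`/`signature` names are illustrative): the 0-sized array stands in for the empty meta arrays dask's `compute_meta` passes in, the loop dimension has size 0 so the function is never actually called, and `np.vectorize` cannot determine the size of the new output core dim `p`:

```python
import numpy as np

def _ufunc(aa, bb):
    # toy gufunc: consumes the "t" core dim, adds a new "p" dim of size 2
    return np.array([aa.sum(), bb.sum()])

# mirrors what apply_ufunc builds with vectorize=True and output_dtypes set
vec = np.vectorize(_ufunc, signature="(t),(t)->(p)", otypes=[np.float64])

# a 0-sized input, like the meta arrays dask passes to compute_meta
meta = np.empty((0, 0))
try:
    vec(meta, meta)
except ValueError as e:
    print(e)
```

This prints the same "cannot call `vectorize` with a signature including new output dimensions on size 0 inputs" message as above, which suggests the bug is in how `compute_meta` probes `np.vectorize` objects rather than in xarray itself.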
This used to work like a charm... I was sloppy in testing this functionality, however (a good reminder to always write tests immediately 🙄), and I have not been able to determine a combination of dependencies that works. I am still experimenting and will report back. Could this behaviour be a bug introduced in dask at some point (as indicated by @smartass101 above)? cc'ing @dcherian @shoyer @mrocklin EDIT: I can confirm that it seems to be a dask issue. If I restrict my dask version to |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910 |
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
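The schema can be exercised with Python's built-in sqlite3 module. This is a sketch, not the real dataset: the `REFERENCES` clauses are dropped (the `users` and `issues` tables are not shown here) and a single illustrative row is inserted, but the query is the same filter/sort that produced this page ("rows where issue = 528701910 and user = 14314623 sorted by updated_at descending"):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT, [issue_url] TEXT, [id] INTEGER PRIMARY KEY, [node_id] TEXT,
   [user] INTEGER, [created_at] TEXT, [updated_at] TEXT, [author_association] TEXT,
   [body] TEXT, [reactions] TEXT, [performed_via_github_app] TEXT, [issue] INTEGER
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
""")

# one sample row, taken from the first comment in the table above
con.execute(
    "INSERT INTO issue_comments (id, [user], issue, updated_at) VALUES (?, ?, ?, ?)",
    (566637471, 14314623, 528701910, "2019-12-17T16:22:35Z"),
)

rows = con.execute(
    "SELECT id FROM issue_comments WHERE issue = ? AND [user] = ? "
    "ORDER BY updated_at DESC",
    (528701910, 14314623),
).fetchall()
print(rows)  # [(566637471,)]
```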