
issue_comments

3 rows where issue = 528701910 and user = 14314623 sorted by updated_at descending

id: 566637471
html_url: https://github.com/pydata/xarray/issues/3574#issuecomment-566637471
issue_url: https://api.github.com/repos/pydata/xarray/issues/3574
node_id: MDEyOklzc3VlQ29tbWVudDU2NjYzNzQ3MQ==
user: jbusecke (14314623)
created_at: 2019-12-17T16:22:35Z
updated_at: 2019-12-17T16:22:35Z
author_association: CONTRIBUTOR
body:

I can give it a shot if you could point me to the appropriate place, since I have never messed with the dask internals of xarray.

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
issue: apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta (528701910)

id: 565057853
html_url: https://github.com/pydata/xarray/issues/3574#issuecomment-565057853
issue_url: https://api.github.com/repos/pydata/xarray/issues/3574
node_id: MDEyOklzc3VlQ29tbWVudDU2NTA1Nzg1Mw==
user: jbusecke (14314623)
created_at: 2019-12-12T15:35:10Z
updated_at: 2019-12-12T15:35:10Z
author_association: CONTRIBUTOR
body:

This is the chunk setup:

[chunk-layout screenshot from the original comment; the image is not preserved in this export]

Might this be a problem resulting from numpy.vectorize?
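
As an illustration of the suspicion above (a sketch added here, not part of the original comment): numpy.vectorize with a signature that introduces a new output dimension cannot be called on size-0 inputs, which is exactly the error reported later in this thread.

```python
# Minimal reproduction of the suspected numpy.vectorize behaviour
# (illustrative sketch, not code from the comment).
import numpy as np

# A gufunc-style signature whose output dimension "p" does not appear in the
# inputs, similar to what apply_ufunc builds when vectorize=True.
vec = np.vectorize(lambda x: np.array([x.min(), x.max()]),
                   signature="(t)->(p)")

print(vec(np.random.rand(3, 5)).shape)  # (3, 2): works on real data

try:
    vec(np.empty((0, 0)))  # zero-size input, like dask's meta arrays
except ValueError as err:
    print(err)
    # -> cannot call `vectorize` with a signature including new output
    #    dimensions on size 0 inputs
```

The traceback quoted in the older comment below shows dask's compute_meta doing exactly this: it calls the np.vectorize-wrapped function on zero-size meta arrays to infer output metadata, so the vectorized function hits this branch.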

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
issue: apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta (528701910)

id: 564843368
html_url: https://github.com/pydata/xarray/issues/3574#issuecomment-564843368
issue_url: https://api.github.com/repos/pydata/xarray/issues/3574
node_id: MDEyOklzc3VlQ29tbWVudDU2NDg0MzM2OA==
user: jbusecke (14314623)
created_at: 2019-12-12T04:22:02Z
updated_at: 2019-12-12T05:32:14Z
author_association: CONTRIBUTOR
body:

I am having a similar problem. This impacts some of my frequently used code to compute correlations.

Here is a simplified example that used to work with older dependencies:

```python
import xarray as xr
import numpy as np
from scipy.stats import linregress


def _ufunc(aa, bb):
    out = linregress(aa, bb)
    return np.array([out.slope, out.intercept])


def wrapper(a, b, dim='time'):
    return xr.apply_ufunc(
        _ufunc, a, b,
        input_core_dims=[[dim], [dim]],
        output_core_dims=[["parameter"]],
        vectorize=True,
        dask="parallelized",
        output_dtypes=[a.dtype],
        output_sizes={"parameter": 2},
    )
```

This works when passing numpy arrays:

```python
a = xr.DataArray(np.random.rand(3, 13, 5), dims=['x', 'time', 'y'])
b = xr.DataArray(np.random.rand(3, 5, 13), dims=['x', 'y', 'time'])
wrapper(a, b)
```

```
<xarray.DataArray (x: 3, y: 5, parameter: 2)>
array([[[ 0.09958247,  0.36831431],
        [-0.54445474,  0.66997513],
        [-0.22894182,  0.65433402],
        [ 0.38536482,  0.20656073],
        [ 0.25083224,  0.46955618]],

       [[-0.21684891,  0.55521932],
        [ 0.51621616,  0.20869272],
        [-0.1502755 ,  0.55526262],
        [-0.25452988,  0.60823538],
        [-0.20571622,  0.56950115]],

       [[-0.22810421,  0.50423622],
        [ 0.33002345,  0.36121484],
        [ 0.37744774,  0.33081058],
        [-0.10825559,  0.53772493],
        [-0.12576656,  0.51722167]]])
Dimensions without coordinates: x, y, parameter
```

But when I convert both arrays to dask arrays, I get the same error as @smartass101.

```python
wrapper(a.chunk({'x': 2, 'time': -1}), b.chunk({'x': 2, 'time': -1}))
```

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-303b400356e2> in <module>
      1 a = xr.DataArray(np.random.rand(3, 13, 5), dims=['x', 'time', 'y'])
      2 b = xr.DataArray(np.random.rand(3, 5, 13), dims=['x','y', 'time'])
----> 3 wrapper(a.chunk({'x':2, 'time':-1}),b.chunk({'x':2, 'time':-1}))

<ipython-input-1-4094fd485c95> in wrapper(a, b, dim)
     16         dask="parallelized",
     17         output_dtypes=[a.dtype],
---> 18         output_sizes={"parameter": 2},)

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, *args)
   1042             join=join,
   1043             exclude_dims=exclude_dims,
-> 1044             keep_attrs=keep_attrs
   1045         )
   1046     elif any(isinstance(a, Variable) for a in args):

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, *args)
    232
    233     data_vars = [getattr(a, "variable", a) for a in args]
--> 234     result_var = func(*data_vars)
    235
    236     if signature.num_outputs > 1:

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, output_sizes, keep_attrs, *args)
    601                 "apply_ufunc: {}".format(dask)
    602             )
--> 603     result_data = func(*input_data)
    604
    605     if signature.num_outputs == 1:

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in func(*arrays)
    591                 signature,
    592                 output_dtypes,
--> 593                 output_sizes,
    594             )
    595

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in _apply_blockwise(func, args, input_dims, output_dims, signature, output_dtypes, output_sizes)
    721         dtype=dtype,
    722         concatenate=True,
--> 723         new_axes=output_sizes
    724     )
    725

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/dask/array/blockwise.py in blockwise(func, out_ind, name, token, dtype, adjust_chunks, new_axes, align_arrays, concatenate, meta, *args, **kwargs)
    231         from .utils import compute_meta
    232
--> 233         meta = compute_meta(func, dtype, *args[::2], **kwargs)
    234     if meta is not None:
    235         return Array(graph, out, chunks, meta=meta)

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/dask/array/utils.py in compute_meta(func, _dtype, *args, **kwargs)
    119         # with np.vectorize, such as dask.array.routines._isnonzero_vec().
    120         if isinstance(func, np.vectorize):
--> 121             meta = func(*args_meta)
    122         else:
    123             try:

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in __call__(self, *args, **kwargs)
   2089             vargs.extend([kwargs[_n] for _n in names])
   2090
-> 2091         return self._vectorize_call(func=func, args=vargs)
   2092
   2093     def _get_ufunc_and_otypes(self, func, args):

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call(self, func, args)
   2155         """Vectorized call to `func` over positional `args`."""
   2156         if self.signature is not None:
-> 2157             res = self._vectorize_call_with_signature(func, args)
   2158         elif not args:
   2159             res = func()

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call_with_signature(self, func, args)
   2229                    for dims in output_core_dims
   2230                    for dim in dims):
-> 2231             raise ValueError('cannot call `vectorize` with a signature '
   2232                              'including new output dimensions on size 0 '
   2233                              'inputs')

ValueError: cannot call `vectorize` with a signature including new output dimensions on size 0 inputs
```

This used to work like a charm. However, I was sloppy about testing this functionality (a good reminder to always write tests immediately 🙄), and I have not been able to determine a combination of dependencies that works. I am still experimenting and will report back.

Could this behaviour be a bug introduced in dask at some point (as indicated by @smartass101 above)? cc'ing @dcherian @shoyer @mrocklin

EDIT: I can confirm that it seems to be a dask issue. If I restrict my dask version to <2.0, my tests (very similar to the above example) work.
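
A regression test of the kind alluded to above might look like the sketch below (added here for illustration, not code from xarray or the original author; it assumes the `wrapper` function from the earlier snippet is in scope, e.g. imported from the module that defines it). On the dask versions affected by this issue, the wrapper call itself raises the ValueError from compute_meta.

```python
# Hypothetical pytest-style regression test for the chunked + vectorized case.
import numpy as np
import xarray as xr


def test_wrapper_with_dask_and_vectorize():
    a = xr.DataArray(np.random.rand(3, 13, 5), dims=['x', 'time', 'y'])
    b = xr.DataArray(np.random.rand(3, 5, 13), dims=['x', 'y', 'time'])

    # Chunk only the broadcast dimension; the core dim 'time' stays whole.
    # On affected dask releases this call raises the compute_meta ValueError.
    result = wrapper(a.chunk({'x': 2}), b.chunk({'x': 2}))

    assert result.dims == ('x', 'y', 'parameter')
    assert result.shape == (3, 5, 2)
    result.compute()  # make sure the lazy result actually evaluates
```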

reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
issue: apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta (528701910)


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
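
For reference, the row selection described at the top of this page (issue = 528701910 and user = 14314623, ordered by updated_at descending) maps directly onto this schema. Below is a minimal sketch using Python's sqlite3, assuming a local copy of the Datasette database saved as github.db (the filename is an assumption, not something stated on this page):

```python
# Sketch of reproducing this page's row selection against a local copy of the
# database (assumed to be saved as "github.db").
import sqlite3

conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT id, created_at, updated_at, author_association, body
    FROM issue_comments
    WHERE issue = ? AND user = ?
    ORDER BY updated_at DESC
    """,
    (528701910, 14314623),
).fetchall()

for comment_id, created, updated, association, body in rows:
    # Print a one-line summary per comment.
    print(comment_id, updated, association, body[:60].replace("\n", " "))
```
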
Powered by Datasette · About: xarray-datasette