Issue #1995: apply_ufunc support for chunks on input_core_dims
State: open · Comments: 13 · Created: 2018-03-15 · Updated: 2021-05-17 · Author association: MEMBER

I am trying to optimize the following function:

c = (a * b).sum('x', skipna=False)

where a and b are xarray.DataArray objects, both with dimension x and both backed by dask.
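
For concreteness, here is a minimal sketch of the kind of inputs I mean (the shapes and chunk sizes below are made up purely for illustration):

import numpy as np
import xarray

# Two dask-backed DataArrays sharing dimension x, chunked along x
a = xarray.DataArray(np.random.rand(4, 100000), dims=['y', 'x']).chunk({'x': 25000})
b = xarray.DataArray(np.random.rand(4, 100000), dims=['y', 'x']).chunk({'x': 25000})

# The computation I'm trying to optimize
c = (a * b).sum('x', skipna=False)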

I successfully obtained a 5.5x speedup with the following:

import numba
import xarray

# Numba kernel: multiply-and-sum over the core dimension, producing one scalar per call
@numba.guvectorize(['void(float64[:], float64[:], float64[:])'], '(n),(n)->()', nopython=True, cache=True)
def mulsum(a, b, res):
    acc = 0
    for i in range(a.size):
        acc += a[i] * b[i]
    res.flat[0] = acc

c = xarray.apply_ufunc(
    mulsum, a, b,
    input_core_dims=[['x'], ['x']],
    dask='parallelized', output_dtypes=[float])

The problem is that this introduces a constraint, quite problematic in my case, that a and b can't be chunked on dimension x. That constraint is theoretically avoidable as long as the kernel function doesn't need any interaction between x[i] and x[j] (so it couldn't be lifted for, e.g., an interpolator, which would need to rely on dask ghosting).
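
One workaround today is to collapse x into a single chunk before calling apply_ufunc, which reintroduces exactly the memory pressure that chunking along x was meant to avoid. A sketch of that forced workaround (reusing a, b and mulsum from above):

# Merge all chunks along the core dimension before applying the kernel
a2 = a.chunk({'x': -1})
b2 = b.chunk({'x': -1})

c = xarray.apply_ufunc(
    mulsum, a2, b2,
    input_core_dims=[['x'], ['x']],
    dask='parallelized', output_dtypes=[float])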

Proposal

Add a new parameter to apply_ufunc, reduce_func=None. reduce_func is a function that takes two arguments, each of which is an output of func applied to one chunk, and combines them into a single result. apply_ufunc will invoke it whenever there is chunking on an input_core_dim.

e.g. my use case above would simply become:

c = xarray.apply_ufunc(
    mulsum, a, b,
    input_core_dims=[['x'], ['x']],
    dask='parallelized', output_dtypes=[float], reduce_func=operator.add)

So if I have 2 chunks in a and b on dimension x, apply_ufunc will internally do

c1 = mulsum(a1, b1)
c2 = mulsum(a2, b2)
c = operator.add(c1, c2)

Note that reduce_func will be invoked only with dask='parallelized' and only when there is chunking on one or more of the input_core_dims. If reduce_func is left as None, apply_ufunc will keep raising an error, as it does now.
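
Until something like this exists, the behaviour can be emulated by hand: slice the inputs along x at the chunk boundaries, call apply_ufunc on each slice (each slice is then a single chunk along x), and fold the partial results with the reduction. A rough sketch of that idea; apply_with_reduce is a hypothetical helper, not an existing xarray function:

import operator
from functools import reduce
import numpy as np

def apply_with_reduce(func, a, b, dim, reduce_func):
    # Cut at the chunk boundaries of `a` along `dim`
    bounds = np.cumsum((0,) + a.chunks[a.dims.index(dim)])
    partials = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        sel = {dim: slice(start, stop)}
        partials.append(xarray.apply_ufunc(
            func, a.isel(**sel), b.isel(**sel),
            input_core_dims=[[dim], [dim]],
            dask='parallelized', output_dtypes=[float]))
    # Fold the per-chunk results together
    return reduce(reduce_func, partials)

c = apply_with_reduce(mulsum, a, b, 'x', operator.add)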
