id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
305757822,MDU6SXNzdWUzMDU3NTc4MjI=,1995,apply_ufunc support for chunks on input_core_dims,6213168,open,0,,,13,2018-03-15T23:50:22Z,2021-05-17T18:59:18Z,,MEMBER,,,,"I am trying to optimize the following function:

```python
c = (a * b).sum('x', skipna=False)
```

where a and b are xarray.DataArrays, both with dimension x and both with a dask backend. I obtained a 5.5x speedup with the following:

```python
@numba.guvectorize(
    ['void(float64[:], float64[:], float64[:])'],
    '(n),(n)->()', nopython=True, cache=True)
def mulsum(a, b, res):
    acc = 0
    for i in range(a.size):
        acc += a[i] * b[i]
    res.flat[0] = acc

c = xarray.apply_ufunc(
    mulsum, a, b,
    input_core_dims=[['x'], ['x']],
    dask='parallelized', output_dtypes=[float])
```

The problem is that this introduces a (quite problematic, in my case) constraint: a and b can't be chunked on dimension x. That constraint is theoretically avoidable as long as the kernel function doesn't need interaction between x[i] and x[j] (so it can't work, e.g., for an interpolator, which would need to rely on dask ghosting).

# Proposal

Add a parameter to apply_ufunc, ``reduce_func=None``. reduce_func is a function that takes two parameters, a and b, each being an output of func; apply_ufunc will invoke it whenever there's chunking on an input_core_dim. For example, my use case above would simply become:

```python
c = xarray.apply_ufunc(
    mulsum, a, b,
    input_core_dims=[['x'], ['x']],
    dask='parallelized', output_dtypes=[float],
    reduce_func=operator.add)
```

So if a and b each have 2 chunks on dimension x, apply_ufunc will internally do:

```python
c1 = mulsum(a1, b1)
c2 = mulsum(a2, b2)
c = operator.add(c1, c2)
```

Note that reduce_func will be invoked exclusively in the presence of dask='parallelized' and when there's chunking on one or more of the input_core_dims. If reduce_func is left as None, apply_ufunc will keep raising an error like it does now.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1995/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
523438384,MDExOlB1bGxSZXF1ZXN0MzQxNDQyMTI4,3537,Numpy 1.18 support,6213168,closed,0,,,13,2019-11-15T12:17:32Z,2019-11-19T14:06:50Z,2019-11-19T14:06:46Z,MEMBER,,0,pydata/xarray/pulls/3537,"Fix mean() and nanmean() for datetime64 arrays on the numpy backend when upgrading from numpy 1.17 to 1.18. All other nan-reductions on datetime64s were broken before and remain broken. mean() on datetime64 with dask was broken before and remains broken.

- [x] Closes #3409
- [x] Passes `black . && mypy . && flake8`
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3537/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
297631403,MDExOlB1bGxSZXF1ZXN0MTY5NTEyMjU1,1915,h5netcdf new API support,6213168,closed,0,,,13,2018-02-15T23:15:55Z,2018-05-11T23:49:00Z,2018-05-08T02:25:40Z,MEMBER,,0,pydata/xarray/pulls/1915,"Closes #1536

Support arbitrary compression plugins through the h5netcdf new API.
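A minimal usage sketch of what this enables (illustrative only: the file name, the toy dataset, and the choice of the LZF filter are my own; `compression` is the encoding key exposed through the new API):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"x": ("t", np.arange(1000.0))})

# Route an arbitrary HDF5 compression filter through the h5netcdf
# backend via encoding. LZF ships with h5py but is not available
# through the plain netCDF4 backend.
ds.to_netcdf(
    "compressed.nc",  # hypothetical output path
    engine="h5netcdf",
    encoding={"x": {"compression": "lzf"}},
)
```

Note that reading such a file requires the same filter on the reading side (h5py bundles LZF; the netCDF-C library does not).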
Done:
- public API and docstrings (untested)
- implementation
- unit tests
- What's New
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1915/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull