id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 473000845,MDU6SXNzdWU0NzMwMDA4NDU=,3161,.reduce() on a DataArray with Dask distributed immediately executes the preceding portions of the computational graph,4762711,closed,0,,,4,2019-07-25T18:09:22Z,2019-11-13T15:48:46Z,2019-11-13T15:48:46Z,NONE,,,,"#### MCVE Code Sample `.mean()` on a `DataArray` pointing to a Dask array returns a Dask array-containing `DataArray` as expected: ```python da = xr.DataArray(np.zeros((500, 500, 500))).chunk((100, 100, 100)).mean('dim_0') da ``` ``` dask.array Dimensions without coordinates: dim_1, dim_2 ``` Calling `.compute()` on this result produces the expected result: ```python da = xr.DataArray(np.zeros((500, 500, 500))).chunk((100, 100, 100)).mean('dim_0').compute() ``` ``` array([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]]) Dimensions without coordinates: dim_1, dim_2 ``` The `.reduce()` method immediately executes all of the previously queued computations leading up to the new reduce method before even calling the supplied function. ```python def func(x, axis=None): return x da = xr.DataArray(np.zeros((500, 500, 500))).chunk((100, 100, 100)).mean('dim_0').reduce(func) ``` ``` array([[0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], ..., [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.], [0., 0., 0., ..., 0., 0., 0.]]) Dimensions without coordinates: dim_1, dim_2 ``` #### Expected Output A Dask array when `.reduce(func)` isn't followed up by `.compute()`. #### Problem Description When using Dask distributed, the computational graph you are constructing is immediately executed if you call `.reduce()` instead of adding that function as another node in the DAG. This graph execution happens before the function you pass to reduce is called. #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 21:52:21) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-693.21.1.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_US.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.12.3 pandas: 0.25.0 numpy: 1.16.4 scipy: 1.3.0 netCDF4: 1.5.1.2 pydap: None h5netcdf: None h5py: 2.9.0 Nio: None zarr: None cftime: 1.0.3.4 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: 2.1.0 distributed: 2.1.0 matplotlib: 3.1.1 cartopy: None seaborn: None numbagg: None setuptools: 41.0.1 pip: 19.2.1 conda: None pytest: 5.0.1 IPython: 7.6.1 sphinx: 2.1.2
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3161/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue