Comments by user 1796208 on pydata/xarray issue #2714 (https://github.com/pydata/xarray/issues/2714), listed newest first.

---

https://github.com/pydata/xarray/issues/2714#issuecomment-457800642 (2019-01-26T04:22:42Z):

Unfortunately, neither of your suggestions works.

With the second, I get the error:

* ValueError: parameter 'value': expected array with shape (10000, 100), got (10000, 245)

With the first:

* ValueError: operands could not be broadcast together with shapes (5000,100,245) (100,)

It's okay. I have something that works. And it's deterministic :D

---

https://github.com/pydata/xarray/issues/2714#issuecomment-457798552 (2019-01-26T03:47:08Z):

> The behavior is definitely deterministic, if hard to understand!

Phew!

---

https://github.com/pydata/xarray/issues/2714#issuecomment-457798514 (2019-01-26T03:46:36Z):

> Maybe it would help to describe what you were trying to do here.

Sure, thanks! I have a long dataset; the sample shown below is 200k rows, but the full dataset will be much larger. I'm interested in pairwise distances, but not for all rows: just the distances for a few thousand rows with respect to the full 200k. Here's how I hack this together.

My starting array:

```python
df_array = xr.DataArray(df)
df_array = df_array.rename({PIVOT: 'all_sites'})
df_array
```
```
array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
Coordinates:
  * all_sites  (all_sites) object '0.gravatar.com||gprofiles.js||Gravatar.init' ... 'курсы.1сентября.рф||store.js||store.set'
  * dim_1      (dim_1) object 'AnalyserNode.connect' ... 'HTMLCanvasElement.previousSibling'
```

My slice of the array:

```python
sites_of_interest = [...]  # sub-list of all sites
df_dye_array = xr.DataArray(df.loc[sites_of_interest])
df_dye_array = df_dye_array.rename({PIVOT: 'dye_sites'})
```

Chunk:

```python
df_array_c = df_array.chunk({'all_sites': 10_000})
df_dye_array_c = df_dye_array.chunk({'dye_sites': 100})
```

Get distances:

```python
def get_chebyshev_distances_xarray_ufunc(df_array, df_dye_array):
    chebyshev = lambda x: np.abs(df_array[:, 0, :] - x).max(axis=1)
    result = np.apply_along_axis(chebyshev, 1, df_dye_array).T
    return result

distance_array = xr.apply_ufunc(
    get_chebyshev_distances_xarray_ufunc,
    df_array_c,
    df_dye_array_c,
    dask='parallelized',
    output_dtypes=[float],
    input_core_dims=[['dim_1'], ['dim_1']],
)
```

What I get out is an array with the length of my original array and the width of my sites of interest, where each number is the Chebyshev distance between the corresponding rows of the original dataset (which are 245 long).
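For comparison, the result this aims for can be computed densely with plain NumPy broadcasting. The sketch below is only an illustrative cross-check on a tiny made-up array (`all_rows` and `query_rows` are hypothetical stand-ins for `df_array` and `df_dye_array`, not names from the thread); at the real 200k-row size this dense version would be far too memory-hungry, which is the reason for the chunked `apply_ufunc` approach above.

```python
import numpy as np

# Tiny hypothetical stand-ins: 50 "sites" x 245 features, and 3 sites of interest.
rng = np.random.default_rng(0)
all_rows = rng.random((50, 245))
query_rows = all_rows[[3, 10, 20]]

# Pairwise Chebyshev distance via broadcasting:
# (50, 1, 245) - (3, 245) -> (50, 3, 245), then max over the feature axis.
dense = np.abs(all_rows[:, None, :] - query_rows).max(axis=-1)
print(dense.shape)  # (50, 3): length of the full array x width of the sites of interest
```

`scipy.spatial.distance.cdist(all_rows, query_rows, 'chebyshev')` should produce the same matrix, which makes it a convenient sanity check for the chunked result on a small subset.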
---

https://github.com/pydata/xarray/issues/2714#issuecomment-457798029 (2019-01-26T03:38:31Z):

Can you clarify one thing in your note?

>> unlabeled versions of da and db are given "broadcastable" shapes (1, 1000, 100) and (1000, 100)

Is it `(1000, 1, 100)`, as my code seems to return, or `(1, 1000, 100)`, as you said? Is it deterministic?

---

https://github.com/pydata/xarray/issues/2714#issuecomment-457797658 (2019-01-26T03:32:10Z):

Hi, I will have to think about your response a lot more to see if I can wrap my head around it. In the meantime, I'm not sure I have my `input_core_dims` correct. I chunk along `row_a` and `row_b` and output a new array with the dims `[row_a, row_b]`. By trial and error, the configuration above is the only one I could find that gave me the dims I was expecting without raising an error.

---

https://github.com/pydata/xarray/issues/2714#issuecomment-457777423 (2019-01-26T00:09:24Z):

I should add: if I pass in plain NumPy arrays, I do not have this problem. But ultimately I want to pass in a chunked DataArray, as described here: http://xarray.pydata.org/en/stable/dask.html#automatic-parallelization (this is my whole reason for using xarray). The workaround is easy, I just use `da[:, 0, :]`, but it's odd!
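As a footnote to the shape questions above, here is a minimal sketch of how xarray's labeled broadcasting combines two arrays that share only the feature dimension. The dimension names and sizes are made up, it sidesteps `apply_ufunc` entirely, and it is not necessarily the suggestion discussed upthread; at the real data size the broadcast intermediate (all_sites x dye_sites x 245) is large, although with dask it stays lazy and chunked.

```python
import numpy as np
import xarray as xr

# Hypothetical miniature stand-ins for the arrays in the thread
# (requires dask for .chunk(); drop the .chunk() calls to stay purely in NumPy).
full = xr.DataArray(np.random.rand(8, 5), dims=('all_sites', 'dim_1')).chunk({'all_sites': 4})
subset = xr.DataArray(np.random.rand(3, 5), dims=('dye_sites', 'dim_1')).chunk({'dye_sites': 3})

# Because the dims are labeled, xarray aligns 'dim_1' and combines the two
# row dimensions, so the subtraction is pairwise by construction.
pairwise = abs(full - subset).max(dim='dim_1')
print(pairwise.dims, pairwise.shape)  # expected: ('all_sites', 'dye_sites') (8, 3)
print(pairwise.compute())
```

Because the broadcasting is driven by dimension names rather than axis positions, there is no ambiguity here about where a length-1 axis gets inserted, which is the part of the raw `apply_ufunc` path that the thread found hard to pin down.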