issue_comments: 544210342

html_url: https://github.com/pydata/xarray/issues/3413#issuecomment-544210342
issue_url: https://api.github.com/repos/pydata/xarray/issues/3413
id: 544210342
node_id: MDEyOklzc3VlQ29tbWVudDU0NDIxMDM0Mg==
user: 6213168
created_at: 2019-10-20T01:06:27Z
updated_at: 2019-10-20T01:06:27Z
author_association: MEMBER
body:

It's working as intended.

`apply_ufunc` verifies that indices are aligned. Note the optional parameter `join='exact'`. You have two implicit `pd.RangeIndex` on dimension `t` that have a different number of elements, which means they are not aligned. Hence, when `apply_ufunc` internally calls `xarray.align(Xda, yda, join="exact")`, it falls over.
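The failure can be reproduced without `apply_ufunc` at all. The snippet below is a minimal sketch: `Xda` and `yda` reuse the names from this thread, but their shapes and values are assumptions, and `aligns_exactly` is a hypothetical helper written only for this illustration.

```python
import numpy as np
import xarray as xr

# Assumed stand-ins for the arrays in the issue: both have an unlabelled
# dimension "t", but with different lengths (4 vs 6).
Xda = xr.DataArray(np.arange(8.0).reshape(2, 4), dims=["s", "t"])
yda = xr.DataArray(np.arange(6.0), dims=["t"])
y_same = xr.DataArray(np.arange(4.0), dims=["t"])  # control: same length as Xda


def aligns_exactly(a, b):
    """Hypothetical helper: True if xr.align(..., join="exact") succeeds."""
    try:
        xr.align(a, b, join="exact")
    except ValueError:
        return False
    return True


mismatch_ok = aligns_exactly(Xda, yda)    # "t" has 4 vs 6 elements
control_ok = aligns_exactly(Xda, y_same)  # "t" has 4 elements in both
```

Only the pair whose `t` lengths match is expected to align; the mismatched pair raises `ValueError`.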

You have two options:

1. Add `join='outer'` to the `apply_ufunc` call, which will cause the shorter of the two variables to be padded with NaNs. You'll also need to replace `mean()` with `nanmean()` in your kernel. This, however, is horribly inefficient.
2. Rename one of the two dimensions to tell `apply_ufunc` that they aren't meant to be aligned:

```python
out = xr.apply_ufunc(
    diff_mean,
    Xda,
    yda.rename({"t": "t2"}),
    dask="parallelized",
    input_core_dims=[['t'], ['t2']],
    output_core_dims=[[]],
    output_dtypes=[float],  # np.float was removed in NumPy 1.24; the builtin float is equivalent
)
```

While on the topic, note that `vectorize=True` is asking xarray to slice the numpy array, do a for loop in pure python applying your kernel multiple times, and then concatenate the output back together - that is, horribly slow. If you can avoid it, you should, and your kernel definitely can be changed to process arbitrary unknown dimensions:

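```python
def diff_mean(X, y):
    # The two inputs are expected to differ in length along the last axis
    assert X.shape[-1] != y.shape[-1]
    # Reduce over the last axis only, so any leading dimensions are preserved
    return X.mean(axis=-1) - y.mean(axis=-1)
```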
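To see the two suggestions working together, here is a small eager sketch (no dask, no `vectorize=True`). The array shapes and values are assumptions chosen for illustration, and the kernel omits the `assert` for brevity. The result is compared against an explicit Python loop over the rows of `Xda`.

```python
import numpy as np
import xarray as xr


def diff_mean(X, y):
    # Reduce over the last axis only; leading axes pass through untouched
    return X.mean(axis=-1) - y.mean(axis=-1)


# Assumed example data: "t" has 4 elements in Xda and 6 in yda
Xda = xr.DataArray(np.arange(8.0).reshape(2, 4), dims=["s", "t"])
yda = xr.DataArray(np.arange(6.0), dims=["t"])

out = xr.apply_ufunc(
    diff_mean,
    Xda,
    yda.rename({"t": "t2"}),  # distinct name, so no alignment is attempted
    input_core_dims=[["t"], ["t2"]],
    output_core_dims=[[]],
)

# Reference result computed row by row in pure Python
expected = np.array([row.mean() - yda.values.mean() for row in Xda.values])
```

Because core dimensions are moved to the last axis before the kernel is called, reducing over `axis=-1` is enough for the kernel to handle any number of leading dimensions.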

reactions:
{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
issue: 508743579