github: issue_comments: 12 rows where issue = 528701910 sorted by updated

12 rows where issue = 528701910 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at	author_association	body	reactions	issue
567082163	https://github.com/pydata/xarray/issues/3574#issuecomment-567082163	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NzA4MjE2Mw==	smartass101 941907	2019-12-18T15:32:38Z	2019-12-18T15:32:38Z	NONE	`meta = np.ndarray if vectorize is True else None` if the user doesn't explicitly provide `meta`. Yes, sorry, written this way I now see what you meant and that will likely work indeed.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
567077240	https://github.com/pydata/xarray/issues/3574#issuecomment-567077240	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NzA3NzI0MA==	dcherian 2448579	2019-12-18T15:21:19Z	2019-12-18T15:21:19Z	MEMBER	Right the xarray solution is to set `meta = np.ndarray if vectorize is True else None` if the user doesn't explicitly provide `meta`. Or am I missing something?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
566938638	https://github.com/pydata/xarray/issues/3574#issuecomment-566938638	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NjkzODYzOA==	smartass101 941907	2019-12-18T08:55:29Z	2019-12-18T08:55:29Z	NONE	`meta` should be passed to `blockwise` through `_apply_blockwise` with default `None` (I think) and `np.ndarray` if `vectorize is True`. You'll have to pass the `vectorize` kwarg down to this level I think. I'm afraid that passing `meta=None` will not help as explained in https://github.com/dask/dask/issues/5642 and seen around this line because in that case `compute_meta` will be called which might fail with a `np.vectorize`-wrapped function. I believe a better solution would be to address https://github.com/dask/dask/issues/5642 so that meta isn't computed even though we already provide an output `dtype`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
566640524	https://github.com/pydata/xarray/issues/3574#issuecomment-566640524	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NjY0MDUyNA==	dcherian 2448579	2019-12-17T16:29:35Z	2019-12-17T16:29:35Z	MEMBER	`meta` should be passed to `blockwise` through `_apply_blockwise` with default `None` (I think) and `np.ndarray` if `vectorize is True`. You'll have to pass the `vectorize` kwarg down to this level I think. https://github.com/pydata/xarray/blob/6ad59b93f814b48053b1a9eea61d7c43517105cb/xarray/core/computation.py#L579-L593	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
566637471	https://github.com/pydata/xarray/issues/3574#issuecomment-566637471	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NjYzNzQ3MQ==	jbusecke 14314623	2019-12-17T16:22:35Z	2019-12-17T16:22:35Z	CONTRIBUTOR	I can give it a shot if you could point me to the appropriate place, since I have never messed with the dask internals of xarray.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
565194778	https://github.com/pydata/xarray/issues/3574#issuecomment-565194778	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NTE5NDc3OA==	dcherian 2448579	2019-12-12T21:28:39Z	2019-12-12T21:28:39Z	MEMBER	@shoyer's option 1 should be a relatively simple xarray PR if one of you is up for it.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
565186199	https://github.com/pydata/xarray/issues/3574#issuecomment-565186199	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NTE4NjE5OQ==	smartass101 941907	2019-12-12T21:04:33Z	2019-12-12T21:04:33Z	NONE	The problem is that Dask, as of version 2.0, calls functions applied to dask arrays with size zero inputs, to figure out the output array type, e.g., is the output a dense numpy.ndarray or a sparse array? Yes, now I recall that this was the issue, yeah. It doesn't even depend on your actual data really. Possible option 3. is to address https://github.com/dask/dask/issues/5642 directly (haven't found time to do a PR yet). Essentially from the code described in that issue I have the feeling that if a `dtype` is passed (as `apply_ufunc` does), then `meta` should not need to be calculated.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
565107345	https://github.com/pydata/xarray/issues/3574#issuecomment-565107345	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NTEwNzM0NQ==	shoyer 1217238	2019-12-12T17:33:43Z	2019-12-12T17:33:43Z	MEMBER	The problem is that Dask, as of version 2.0, calls functions applied to dask arrays with size zero inputs, to figure out the output array type, e.g., is the output a dense numpy.ndarray or a sparse array? Unfortunately, `numpy.vectorize` doesn't know how large of a size 0 array to make, because it doesn't have anything like the `output_sizes` argument. For xarray, we have a couple of options: 1. we can safely assume that if the applied function is a `np.vectorize`, then it should pass `meta=np.ndarray` into the relevant dask functions (e.g., `dask.array.blockwise`). This should avoid the need to evaluate with size 0 arrays. 2. we could add an `output_sizes` argument to `np.vectorize` either upstream in NumPy or into a wrapper in Xarray. (1) is probably easiest here.	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
565057853	https://github.com/pydata/xarray/issues/3574#issuecomment-565057853	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NTA1Nzg1Mw==	jbusecke 14314623	2019-12-12T15:35:10Z	2019-12-12T15:35:10Z	CONTRIBUTOR	This is the chunk setup Might this be a problem resulting from `numpy.vectorize`?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
564934693	https://github.com/pydata/xarray/issues/3574#issuecomment-564934693	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NDkzNDY5Mw==	smartass101 941907	2019-12-12T09:57:18Z	2019-12-12T09:57:28Z	NONE	Sounds similar. But I'm not sure why you get the 0d issue when even your chunks don't (from a quick reading) seem to have a 0 size in any of the dimensions. Could you please show us what is the resulting chunk setup?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
564843368	https://github.com/pydata/xarray/issues/3574#issuecomment-564843368	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU2NDg0MzM2OA==	jbusecke 14314623	2019-12-12T04:22:02Z	2019-12-12T05:32:14Z	CONTRIBUTOR	I am having a similar problem. This impacts some of my frequently used code to compute correlations. Here is a simplified example that used to work with older dependencies:

```
import xarray as xr
import numpy as np
from scipy.stats import linregress

def _ufunc(aa,bb):
    out = linregress(aa,bb)
    return np.array([out.slope, out.intercept])

def wrapper(a, b, dim='time'):
    return xr.apply_ufunc(
        _ufunc,a,b,
        input_core_dims=[[dim], [dim]],
        output_core_dims=[["parameter"]],
        vectorize=True,
        dask="parallelized",
        output_dtypes=[a.dtype],
        output_sizes={"parameter": 2},)
```

This works when passing numpy arrays:

`a = xr.DataArray(np.random.rand(3, 13, 5), dims=['x', 'time', 'y'])
b = xr.DataArray(np.random.rand(3, 5, 13), dims=['x','y', 'time'])
wrapper(a,b)`

<xarray.DataArray (x: 3, y: 5, parameter: 2)>
array([[[ 0.09958247, 0.36831431],
        [-0.54445474, 0.66997513],
        [-0.22894182, 0.65433402],
        [ 0.38536482, 0.20656073],
        [ 0.25083224, 0.46955618]],

       [[-0.21684891, 0.55521932],
        [ 0.51621616, 0.20869272],
        [-0.1502755 , 0.55526262],
        [-0.25452988, 0.60823538],
        [-0.20571622, 0.56950115]],

       [[-0.22810421, 0.50423622],
        [ 0.33002345, 0.36121484],
        [ 0.37744774, 0.33081058],
        [-0.10825559, 0.53772493],
        [-0.12576656, 0.51722167]]])
Dimensions without coordinates: x, y, parameter

But when I convert both arrays to dask arrays, I get the same error as @smartass101.

`wrapper(a.chunk({'x':2, 'time':-1}),b.chunk({'x':2, 'time':-1}))`

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-303b400356e2> in <module>
      1 a = xr.DataArray(np.random.rand(3, 13, 5), dims=['x', 'time', 'y'])
      2 b = xr.DataArray(np.random.rand(3, 5, 13), dims=['x','y', 'time'])
----> 3 wrapper(a.chunk({'x':2, 'time':-1}),b.chunk({'x':2, 'time':-1}))

<ipython-input-1-4094fd485c95> in wrapper(a, b, dim)
     16 dask="parallelized",
     17 output_dtypes=[a.dtype],
---> 18 output_sizes={"parameter": 2},)

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, args)
   1042 join=join,
   1043 exclude_dims=exclude_dims,
-> 1044 keep_attrs=keep_attrs
   1045 )
   1046 elif any(isinstance(a, Variable) for a in args):

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_dataarray_vfunc(func, signature, join, exclude_dims, keep_attrs, args)
    232
    233 data_vars = [getattr(a, "variable", a) for a in args]
--> 234 result_var = func(data_vars)
    235
    236 if signature.num_outputs > 1:

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, output_sizes, keep_attrs, args)
    601 "apply_ufunc: {}".format(dask)
    602 )
--> 603 result_data = func(input_data)
    604
    605 if signature.num_outputs == 1:

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in func(arrays)
    591 signature,
    592 output_dtypes,
--> 593 output_sizes,
    594 )
    595

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/xarray/core/computation.py in _apply_blockwise(func, args, input_dims, output_dims, signature, output_dtypes, output_sizes)
    721 dtype=dtype,
    722 concatenate=True,
--> 723 new_axes=output_sizes
    724 )
    725

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/dask/array/blockwise.py in blockwise(func, out_ind, name, token, dtype, adjust_chunks, new_axes, align_arrays, concatenate, meta, args, kwargs)
    231 from .utils import compute_meta
    232
--> 233 meta = compute_meta(func, dtype, args[::2], *kwargs)
    234 if meta is not None:
    235 return Array(graph, out, chunks, meta=meta)

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/dask/array/utils.py in compute_meta(func, _dtype, args, *kwargs)
    119 # with np.vectorize, such as dask.array.routines._isnonzero_vec().
    120 if isinstance(func, np.vectorize):
--> 121 meta = func(args_meta)
    122 else:
    123 try:

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in __call__(self, args, *kwargs)
   2089 vargs.extend([kwargs[_n] for _n in names])
   2090
-> 2091 return self._vectorize_call(func=func, args=vargs)
   2092
   2093 def _get_ufunc_and_otypes(self, func, args):

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call(self, func, args)
   2155 """Vectorized call to `func` over positional `args`."""
   2156 if self.signature is not None:
-> 2157 res = self._vectorize_call_with_signature(func, args)
   2158 elif not args:
   2159 res = func()

~/miniconda/envs/euc_dynamics/lib/python3.7/site-packages/numpy/lib/function_base.py in _vectorize_call_with_signature(self, func, args)
   2229 for dims in output_core_dims
   2230 for dim in dims):
-> 2231 raise ValueError('cannot call `vectorize` with a signature '
   2232 'including new output dimensions on size 0 '
   2233 'inputs')

ValueError: cannot call `vectorize` with a signature including new output dimensions on size 0 inputs

This used to work like a charm...I however was sloppy in testing this functionality (a good reminder always to write tests immediately 🙄 ), and I was not able to determine a combination of dependencies that would work.

I am still experimenting and will report back

Could this behaviour be a bug introduced in dask at some point (as indicated by @smartass101 above)? cc'ing @dcherian @shoyer @mrocklin

EDIT: I can confirm that it seems to be a dask issue. If I restrict my dask version to `<2.0`, my tests (very similar to the above example) work.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
558616375	https://github.com/pydata/xarray/issues/3574#issuecomment-558616375	https://api.github.com/repos/pydata/xarray/issues/3574	MDEyOklzc3VlQ29tbWVudDU1ODYxNjM3NQ==	smartass101 941907	2019-11-26T12:56:47Z	2019-11-26T12:56:47Z	NONE	Another approach would be to bypass `compute_meta` in `dask.blockwise` if `dtype` is provided which seems to be hinted at here https://github.com/dask/dask/blob/3960c6518318f2417658c2fc47cd5b5ece726f8b/dask/array/blockwise.py#L234 Perhaps this is an oversight in `dask`, what do you think?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	apply_ufunc with dask='parallelized' and vectorize=True fails on compute_meta 528701910
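
The failure discussed in the thread can be reproduced without xarray or dask. The following is a minimal sketch, assuming only NumPy: `pair_stats` is a hypothetical stand-in for the `linregress`-based `_ufunc`, and `choose_meta` is a hypothetical helper that mirrors the proposed default (`meta = np.ndarray if vectorize is True else None`); neither name exists in xarray or dask.

```python
import numpy as np


def pair_stats(aa, bb):
    # Hypothetical stand-in for the linregress-based _ufunc in the thread:
    # it maps two 1-D inputs to a new length-2 output dimension.
    return np.array([aa.mean(), bb.mean()])


# A signature with a new output dimension "p", similar in spirit to
# output_core_dims=[["parameter"]] in the apply_ufunc call.
vfunc = np.vectorize(pair_stats, otypes=[float], signature="(n),(n)->(p)")

# Non-empty inputs: the size of "p" is inferred from the first call.
ok = vfunc(np.ones((3, 5)), np.ones((3, 5)))

# Size-0 inputs (what a meta computation feeds in): the wrapped function is
# never called, so the size of "p" cannot be inferred and NumPy raises.
try:
    vfunc(np.empty((0, 5)), np.empty((0, 5)))
    error_message = None
except ValueError as err:
    error_message = str(err)


def choose_meta(vectorize, meta=None):
    # Hypothetical helper mirroring the proposal in the thread: respect an
    # explicit meta, otherwise use np.ndarray when vectorize is True.
    if meta is not None:
        return meta
    return np.ndarray if vectorize is True else None
```

Passing a non-`None` `meta` is what lets the size-0 evaluation be skipped, which is why the thread settles on defaulting to `np.ndarray` when `vectorize=True`.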

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
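
The view at the top of this page (rows where issue = 528701910, sorted by updated_at descending) can be sketched against this schema with Python's built-in `sqlite3` module. This is an illustrative sketch, not the exact query Datasette runs: it uses an in-memory database and inserts only three of the rows shown above, with most columns left empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE [issue_comments] (
       [html_url] TEXT,
       [issue_url] TEXT,
       [id] INTEGER PRIMARY KEY,
       [node_id] TEXT,
       [user] INTEGER REFERENCES [users]([id]),
       [created_at] TEXT,
       [updated_at] TEXT,
       [author_association] TEXT,
       [body] TEXT,
       [reactions] TEXT,
       [performed_via_github_app] TEXT,
       [issue] INTEGER REFERENCES [issues]([id])
    );
    CREATE INDEX [idx_issue_comments_issue]
        ON [issue_comments] ([issue]);
    """
)

# Three rows taken from the table above (id, user, created_at, updated_at,
# author_association, issue); the remaining columns are left NULL.
rows = [
    (558616375, 941907, "2019-11-26T12:56:47Z", "2019-11-26T12:56:47Z", "NONE", 528701910),
    (567082163, 941907, "2019-12-18T15:32:38Z", "2019-12-18T15:32:38Z", "NONE", 528701910),
    (565107345, 1217238, "2019-12-12T17:33:43Z", "2019-12-12T17:33:43Z", "MEMBER", 528701910),
]
conn.executemany(
    "INSERT INTO issue_comments "
    "(id, [user], created_at, updated_at, author_association, issue) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps stored as TEXT sort correctly as strings.
ordered_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM issue_comments WHERE issue = ? ORDER BY updated_at DESC",
        (528701910,),
    )
]
```

The `idx_issue_comments_issue` index is what keeps the `WHERE issue = ?` filter cheap on the full table.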