html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1815#issuecomment-628135082,https://api.github.com/repos/pydata/xarray/issues/1815,628135082,MDEyOklzc3VlQ29tbWVudDYyODEzNTA4Mg==,8881170,2020-05-13T17:27:06Z,2020-05-13T17:27:06Z,CONTRIBUTOR,"> > So would you be re-doing the same computation by running .compute() separately on these objects?
>
> Yes, but you can do `dask.compute(xarray_obj1, xarray_obj2, ...)` or combine those objects appropriately into a Dataset and then call compute on that.
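
A minimal sketch of that suggestion (the array contents and names here are hypothetical, just to show that one `dask.compute` call evaluates the shared graph once):

```python
import dask
import dask.array as da
import xarray as xr

# Two lazy results that share one chunked source array.
source = xr.DataArray(da.ones((4, 4), chunks=2), dims=('x', 'y'))
mean_x = source.mean('x')  # lazy; shares the task graph of `source`
total = source.sum()       # lazy; shares the same graph

# One dask.compute call walks the shared task graph once and
# returns both results as in-memory xarray objects.
mean_x, total = dask.compute(mean_x, total)
```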
Good call. I figured there was a workaround.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,287223508
https://github.com/pydata/xarray/issues/1815#issuecomment-628070696,https://api.github.com/repos/pydata/xarray/issues/1815,628070696,MDEyOklzc3VlQ29tbWVudDYyODA3MDY5Ng==,8881170,2020-05-13T15:33:56Z,2020-05-13T15:33:56Z,CONTRIBUTOR,"One issue I see is that this would return multiple dask objects, correct? So to get the results from them, you'd have to run `.compute()` on each separately. It seems reasonable to expect that the multiple output objects would share much of the same computational pipeline. So would you be re-doing the same computation by running `.compute()` separately on these objects?
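
If the outputs are instead stacked along a single dimension, you only pay for the pipeline once; splitting afterwards is a set of cheap selections. A minimal sketch (the `parameter` dimension and the values are hypothetical stand-ins for what a single `apply_ufunc` call might return):

```python
import numpy as np
import xarray as xr

# Hypothetical stacked result: several outputs along one 'parameter'
# dimension, as a single apply_ufunc call might return them.
result = xr.DataArray(np.arange(10).reshape(5, 2),
                      dims=('parameter', 'point'))

# After one compute, splitting is a set of cheap positional selections:
slope = result.isel(parameter=0)
intercept = result.isel(parameter=1)
```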
The code snippets mentioned earlier provide a nice path forward, since you can just run compute on one object and then split its `result` (or however you name it) dimension into multiple individual objects. Thoughts?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,287223508
https://github.com/pydata/xarray/issues/1815#issuecomment-614244205,https://api.github.com/repos/pydata/xarray/issues/1815,614244205,MDEyOklzc3VlQ29tbWVudDYxNDI0NDIwNQ==,8881170,2020-04-15T19:45:50Z,2020-04-15T19:45:50Z,CONTRIBUTOR,"I think ideally it would be nice to return multiple DataArrays or a Dataset of variables. But I'm really happy with this solution. I'm using it on a 600GB dataset of particle trajectories and was able to write a ufunc to go through and return each particle's x, y, z location when it met a certain condition.
I think having something simple like the Stack Overflow snippet I posted would be a great `apply_ufunc` example for the docs. I'd be happy to lead this if folks think it's a good idea.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,287223508
https://github.com/pydata/xarray/issues/1815#issuecomment-614216243,https://api.github.com/repos/pydata/xarray/issues/1815,614216243,MDEyOklzc3VlQ29tbWVudDYxNDIxNjI0Mw==,8881170,2020-04-15T18:49:51Z,2020-04-15T18:49:51Z,CONTRIBUTOR,"This looks essentially the same as @stefraynaud's answer, but I came across this Stack Overflow answer: https://stackoverflow.com/questions/52094320/with-xarray-how-to-parallelize-1d-operations-on-a-multidimensional-dataset.
@andersy005, I imagine you're far past this now, and this might have been related to discussions with Genevieve and me anyway.
```python
import numpy as np
import xarray as xr
from scipy import stats

def new_linregress(x, y):
    # Wrapper around scipy.stats.linregress to use in apply_ufunc
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
    return np.array([slope, intercept, r_value, p_value, std_err])

# return a new DataArray (named `results` so it doesn't shadow scipy.stats)
results = xr.apply_ufunc(new_linregress, ds[x], ds[y],
                         input_core_dims=[['year'], ['year']],
                         output_core_dims=[[""parameter""]],
                         vectorize=True,
                         dask=""parallelized"",
                         output_dtypes=['float64'],
                         output_sizes={""parameter"": 5},
                         )
```","{""total_count"": 3, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 3, ""rocket"": 0, ""eyes"": 0}",,287223508
https://github.com/pydata/xarray/issues/1815#issuecomment-538993551,https://api.github.com/repos/pydata/xarray/issues/1815,538993551,MDEyOklzc3VlQ29tbWVudDUzODk5MzU1MQ==,1941408,2019-10-07T12:48:01Z,2019-10-07T12:48:01Z,CONTRIBUTOR,"@andersy005 here is a small demo of linear regression using `lstsq` (not `linregress`), keeping only the slope and intercept. Here it is applied to an array of sea surface temperature.
I hope it helps.
```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

ds = xr.open_dataset('sst_2D.nc', chunks={'X': 30, 'Y': 30})

def ulinregress(x, y):
    # The universal function: least-squares fit of y against x along time
    ny, nx, nt = y.shape
    y = np.moveaxis(y, -1, 0).reshape((nt, -1))  # (nt, ny*nx)
    return np.linalg.lstsq(np.vstack([x, np.ones(nt)]).T, y, rcond=None)[0].T.reshape(ny, nx, 2)

time = (ds['time'] - np.datetime64('1950-01-01')) / np.timedelta64(1, 'D')
ab = xr.apply_ufunc(ulinregress, time, ds['sst'], dask='parallelized',
                    input_core_dims=[['time'], ['time']],
                    output_dtypes=['d'], output_sizes={'coef': 2},
                    output_core_dims=[['coef']])

series = ds['sst'][:, 0, 0].load()
line = series.copy()
line[:] = ab[0, 0, 0] * time + ab[0, 0, 1]
series.plot(label='Original')
line.plot(label='Linear regression')
plt.legend()
```","{""total_count"": 1, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 1, ""rocket"": 0, ""eyes"": 0}",,287223508