
issue_comments

Table actions
  • GraphQL API for issue_comments

17 rows where issue = 287223508 sorted by updated_at descending

user 9

  • bradyrx 4
  • andersy005 3
  • rabernat 2
  • shoyer 2
  • dcherian 2
  • mrocklin 1
  • stefraynaud 1
  • jhamman 1
  • kmuehlbauer 1

author_association 2

  • MEMBER 12
  • CONTRIBUTOR 5

issue 1

  • apply_ufunc(dask='parallelized') with multiple outputs · 17
id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
628135082 https://github.com/pydata/xarray/issues/1815#issuecomment-628135082 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYyODEzNTA4Mg== bradyrx 8881170 2020-05-13T17:27:06Z 2020-05-13T17:27:06Z CONTRIBUTOR

So would you be re-doing the same computation by running .compute() separately on these objects?

Yes, but you can do dask.compute(xarray_obj1, xarray_obj2, ...) or combine those objects appropriately into a Dataset and then call compute on that.

Good call. I figured there was a workaround.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
628088800 https://github.com/pydata/xarray/issues/1815#issuecomment-628088800 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYyODA4ODgwMA== dcherian 2448579 2020-05-13T16:04:20Z 2020-05-13T16:04:20Z MEMBER

So would you be re-doing the same computation by running .compute() separately on these objects?

Yes, but you can do dask.compute(xarray_obj1, xarray_obj2, ...) or combine those objects appropriately into a Dataset and then call compute on that.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
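
A minimal sketch of the dask.compute(...) pattern described in the comment above, assuming two lazy xarray objects that share most of their task graph (the array, dimension names, and chunk sizes below are arbitrary placeholders, not part of the thread):

```python
import dask
import numpy as np
import xarray as xr

# two lazy results built from the same chunked array, so they share
# most of their underlying task graph
arr = xr.DataArray(np.random.rand(100, 100), dims=("x", "y")).chunk({"x": 25})
mean = arr.mean("y")
anom = arr - mean

# a single dask.compute evaluates the shared graph once, instead of
# repeating the common work for each separate .compute() call
mean, anom = dask.compute(mean, anom)
```
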
628070696 https://github.com/pydata/xarray/issues/1815#issuecomment-628070696 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYyODA3MDY5Ng== bradyrx 8881170 2020-05-13T15:33:56Z 2020-05-13T15:33:56Z CONTRIBUTOR

One issue I see is that this would return multiple dask objects, correct? So to get the results from them, you'd have to run .compute() on each separately. I think it's a valid assumption to expect that the multiple output objects would share a lot of the same computational pipeline. So would you be re-doing the same computation by running .compute() separately on these objects?

The earlier mentioned code snippets provide a nice path forward, since you can just run compute on one object, and then split its result (or however you name it) dimension into multiple individual objects. Thoughts?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
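
A minimal sketch of the "compute once, then split" idea proposed in the comment above, assuming the stacked output dimension is called "parameter" and holds two illustrative entries; the array and names below are placeholders, not part of the thread:

```python
import numpy as np
import xarray as xr

# hypothetical stand-in for the single stacked output of apply_ufunc:
# a dask-backed DataArray with a "parameter" core dimension of length 2
out = xr.DataArray(
    np.random.rand(4, 5, 2), dims=("x", "y", "parameter")
).chunk({"x": 2})

# one compute, then split the stacked dimension into separate objects
out = out.assign_coords(parameter=["slope", "intercept"]).compute()
results = out.to_dataset(dim="parameter")  # Dataset with "slope" and "intercept"
slope, intercept = results["slope"], results["intercept"]
```
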
628050521 https://github.com/pydata/xarray/issues/1815#issuecomment-628050521 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYyODA1MDUyMQ== dcherian 2448579 2020-05-13T15:02:01Z 2020-05-13T15:02:01Z MEMBER

Still needs to be implemented. Stephan's comment suggests a path forward (https://github.com/pydata/xarray/issues/1815#issuecomment-440089606)

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
627869278 https://github.com/pydata/xarray/issues/1815#issuecomment-627869278 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYyNzg2OTI3OA== kmuehlbauer 5821660 2020-05-13T09:33:59Z 2020-05-13T09:33:59Z MEMBER

I think ideally it would be nice to return multiple DataArrays or a Dataset of variables.

What's the current status of this? I have similar requirements: a single DataArray as input, multiple DataArrays as output.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
614244205 https://github.com/pydata/xarray/issues/1815#issuecomment-614244205 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYxNDI0NDIwNQ== bradyrx 8881170 2020-04-15T19:45:50Z 2020-04-15T19:45:50Z CONTRIBUTOR

I think ideally it would be nice to return multiple DataArrays or a Dataset of variables. But I'm really happy with this solution. I'm using it on a 600GB dataset of particle trajectories and was able to write a ufunc to go through and return each particle's x, y, z location when it met a certain condition.

I think having something simple like the stackoverflow snippet I posted would be great for the docs as an apply_ufunc example. I'd be happy to lead this if folks think it's a good idea.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
614221197 https://github.com/pydata/xarray/issues/1815#issuecomment-614221197 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYxNDIyMTE5Nw== andersy005 13301940 2020-04-15T18:59:23Z 2020-04-15T18:59:23Z MEMBER

I imagine you're far past this now. And this might have been related to discussions with Genevieve and me anyway.

Thank you for the update, @bradyrx! Yes, it was related to discussions with @gelsworth

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
614216243 https://github.com/pydata/xarray/issues/1815#issuecomment-614216243 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDYxNDIxNjI0Mw== bradyrx 8881170 2020-04-15T18:49:51Z 2020-04-15T18:49:51Z CONTRIBUTOR

This looks essentially the same as @stefraynaud's answer, but I came across this Stack Overflow response: https://stackoverflow.com/questions/52094320/with-xarray-how-to-parallelize-1d-operations-on-a-multidimensional-dataset.

@andersy005, I imagine you're far past this now. And this might have been related to discussions with Genevieve and me anyway.

```python
def new_linregress(x, y):
    # Wrapper around scipy linregress to use in apply_ufunc
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
    return np.array([slope, intercept, r_value, p_value, std_err])

# return a new DataArray
stats = xr.apply_ufunc(
    new_linregress, ds[x], ds[y],
    input_core_dims=[['year'], ['year']],
    output_core_dims=[["parameter"]],
    vectorize=True,
    dask="parallelized",
    output_dtypes=['float64'],
    output_sizes={"parameter": 5},
)
```

{
    "total_count": 3,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 3,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
539134912 https://github.com/pydata/xarray/issues/1815#issuecomment-539134912 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDUzOTEzNDkxMg== rabernat 1197350 2019-10-07T18:06:03Z 2019-10-07T18:06:03Z MEMBER

I definitely don't have bandwidth! I'm happy to see you working on it.

On Mon, Oct 7, 2019 at 2:01 PM Anderson Banihirwe notifications@github.com wrote:

is what you are working on related to #3349 https://github.com/pydata/xarray/issues/3349?

@rabernat https://github.com/rabernat, indeed! Let me know if you have bandwidth to take on the polyfit implementation in xarray. Otherwise, I'd be happy to help out/work on it late October.


{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
539133301 https://github.com/pydata/xarray/issues/1815#issuecomment-539133301 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDUzOTEzMzMwMQ== andersy005 13301940 2019-10-07T18:01:55Z 2019-10-07T18:01:55Z MEMBER

is what you are working on related to #3349?

@rabernat, indeed! Let me know if you have bandwidth to take on the polyfit implementation in xarray. Otherwise, I'd be happy to help out/work on it late October.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
539025502 https://github.com/pydata/xarray/issues/1815#issuecomment-539025502 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDUzOTAyNTUwMg== rabernat 1197350 2019-10-07T14:00:01Z 2019-10-07T14:00:01Z MEMBER

@andersy005 - is what you are working on related to #3349?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
538993551 https://github.com/pydata/xarray/issues/1815#issuecomment-538993551 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDUzODk5MzU1MQ== stefraynaud 1941408 2019-10-07T12:48:01Z 2019-10-07T12:48:01Z CONTRIBUTOR

@andersy005 here is a small demo of linear regression using lstsq (not linregress) in which only the slope and intercept are kept. Here it is applied to an array of sea surface temperature. I hope it can help.

```python
ds = xr.open_dataset('sst_2D.nc', chunks={'X': 30, 'Y': 30})

def ulinregress(x, y):  # the universal function
    ny, nx, nt = y.shape
    y = np.moveaxis(y, -1, 0).reshape((nt, -1))  # nt, ny*nx
    return np.linalg.lstsq(np.vstack([x, np.ones(nt)]).T, y)[0].T.reshape(ny, nx, 2)

time = (ds['time'] - np.datetime64("1950-01-01")) / np.timedelta64(1, 'D')
ab = xr.apply_ufunc(ulinregress, time, ds['sst'],
                    dask='parallelized',
                    input_core_dims=[['time'], ['time']],
                    output_dtypes=['d'],
                    output_sizes={'coef': 2},
                    output_core_dims=[['coef']])

series = ds['sst'][:, 0, 0].load()
line = series.copy()
line[:] = ab[0, 0, 0] * time + ab[0, 0, 1]
series.plot(label='Original')
line.plot(label='Linear regression')
plt.legend();
```

{
    "total_count": 1,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 1,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
536847249 https://github.com/pydata/xarray/issues/1815#issuecomment-536847249 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDUzNjg0NzI0OQ== andersy005 13301940 2019-10-01T03:39:15Z 2019-10-01T03:39:15Z MEMBER

Any updates or progress here? I’m trying to use xarray.apply_ufunc with scipy.stats.linregress which returns:

  • slope
  • intercept
  • rvalue
  • pvalue
  • stderr

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
440089606 https://github.com/pydata/xarray/issues/1815#issuecomment-440089606 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDQ0MDA4OTYwNg== shoyer 1217238 2018-11-20T00:16:31Z 2018-11-20T00:16:31Z MEMBER

I think we can do this inside the existing xarray.apply_ufunc, simply by using apply_gufunc instead of atop for the case where signature.num_outputs > 1 (which currently raises NotImplementedError): https://github.com/pydata/xarray/blob/master/xarray/core/computation.py#L601

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
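
For context, a minimal sketch of the multiple-output support that dask.array.apply_gufunc provides, which is what the comment above proposes wiring into apply_ufunc; the function and array below are arbitrary placeholders:

```python
import dask.array as da

def moments(x):
    # reduce the trailing core dimension and return two outputs
    return x.mean(axis=-1), x.std(axis=-1)

arr = da.random.random((1000, 50), chunks=(100, 50))

# "(i)->(),()" declares one core input dimension and two scalar outputs,
# so apply_gufunc returns a tuple of two lazy dask arrays
mean, std = da.apply_gufunc(moments, "(i)->(),()", arr,
                            output_dtypes=(float, float))
print(mean.compute().shape)  # (1000,)
```
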
440064660 https://github.com/pydata/xarray/issues/1815#issuecomment-440064660 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDQ0MDA2NDY2MA== mrocklin 306380 2018-11-19T22:27:31Z 2018-11-19T22:27:31Z MEMBER

FYI @magonser

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
440061760 https://github.com/pydata/xarray/issues/1815#issuecomment-440061760 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDQ0MDA2MTc2MA== jhamman 2443309 2018-11-19T22:17:00Z 2018-11-19T22:17:39Z MEMBER

@shoyer - dask now has an apply_gufunc. Is this something we should try to include in xr.apply_ufunc, or add as a new function xr.apply_gufunc?

xref: https://github.com/dask/dask/pull/3109

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508
356410108 https://github.com/pydata/xarray/issues/1815#issuecomment-356410108 https://api.github.com/repos/pydata/xarray/issues/1815 MDEyOklzc3VlQ29tbWVudDM1NjQxMDEwOA== shoyer 1217238 2018-01-09T20:51:18Z 2018-01-09T20:51:18Z MEMBER

We need atop to support multiple output arguments (https://github.com/dask/dask/issues/702), or potentially a specialized wrapper for generalized ufuncs in dask (https://github.com/dask/dask/issues/1176).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  apply_ufunc(dask='parallelized') with multiple outputs 287223508

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
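
For reference, a sketch of reproducing this page's query against a local copy of the underlying SQLite database; the github.db filename is an assumption, not something stated on this page:

```python
import sqlite3

# assumes a local copy of the database behind this Datasette instance
conn = sqlite3.connect("github.db")
rows = conn.execute(
    """
    SELECT id, user, created_at, updated_at, body
    FROM issue_comments
    WHERE issue = 287223508
    ORDER BY updated_at DESC
    """
).fetchall()
print(len(rows))  # 17, matching the row count shown above
conn.close()
```
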