issue_comments
10 rows where issue = 316618290 sorted by updated_at descending
id | html_url | issue_url | node_id | user | created_at | updated_at | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
385116430 | https://github.com/pydata/xarray/issues/2074#issuecomment-385116430 | https://api.github.com/repos/pydata/xarray/issues/2074 | MDEyOklzc3VlQ29tbWVudDM4NTExNjQzMA== | crusaderky 6213168 | 2018-04-27T23:13:20Z | 2018-04-27T23:13:20Z | MEMBER | Done the work - but we'll need to wait for dask 0.17.3 to integrate it |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.dot() dask problems 316618290 | |
383817119 | https://github.com/pydata/xarray/issues/2074#issuecomment-383817119 | https://api.github.com/repos/pydata/xarray/issues/2074 | MDEyOklzc3VlQ29tbWVudDM4MzgxNzExOQ== | mrocklin 306380 | 2018-04-24T06:22:39Z | 2018-04-24T06:22:39Z | MEMBER | When doing benchmarks with things that might call BLAS operations in multiple threads, I recommend setting the OMP_NUM_THREADS environment variable to 1. This will avoid oversubscription. On Mon, Apr 23, 2018 at 7:32 PM, Keisuke Fujii notifications@github.com wrote:
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.dot() dask problems 316618290 | |
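A minimal sketch of the benchmark setup suggested in the comment above, assuming an OpenMP-based BLAS; the array sizes and the extra MKL_NUM_THREADS variable are illustrative assumptions, not taken from the thread:

```
# Sketch only: the environment variables must be set before NumPy (and its
# BLAS) is first imported, otherwise the BLAS thread pools already exist.
import os
os.environ["OMP_NUM_THREADS"] = "1"   # OpenMP-based BLAS (e.g. OpenBLAS)
os.environ["MKL_NUM_THREADS"] = "1"   # assumption: also cover Intel MKL builds

import numpy as np

# Illustrative operands, not the arrays from the benchmark in this thread.
a = np.random.rand(500, 500)
b = np.random.rand(500, 500)

# With BLAS pinned to one thread, timings reflect dask's own parallelism
# rather than BLAS threads oversubscribing the cores.
out = np.einsum("ij,jk->ik", a, b)
```

The same pinning applies when timing the dask-backed cases, where dask's worker threads and BLAS threads would otherwise compete for cores.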
383754980 | https://github.com/pydata/xarray/issues/2074#issuecomment-383754980 | https://api.github.com/repos/pydata/xarray/issues/2074 | MDEyOklzc3VlQ29tbWVudDM4Mzc1NDk4MA== | fujiisoup 6815844 | 2018-04-23T23:32:33Z | 2018-04-23T23:32:33Z | MEMBER | @crusaderky, thanks for the detailed benchmarking. Further note:
In your example,
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.dot() dask problems 316618290 | |
383754250 | https://github.com/pydata/xarray/issues/2074#issuecomment-383754250 | https://api.github.com/repos/pydata/xarray/issues/2074 | MDEyOklzc3VlQ29tbWVudDM4Mzc1NDI1MA== | shoyer 1217238 | 2018-04-23T23:28:27Z | 2018-04-23T23:28:27Z | MEMBER | +1 for using dask.array.einsum in xarray.dot. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.dot() dask problems 316618290 | |
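For reference, a small sketch of calling dask.array.einsum directly, which is the routine the +1 above endorses for xarray.dot; the shapes and chunk sizes are made-up values for illustration:

```
import dask.array as da
import numpy as np

# Hypothetical operands; chunk sizes are illustrative only.
a = da.from_array(np.random.rand(1000, 400), chunks=(250, 200))
b = da.from_array(np.random.rand(1000, 400), chunks=(250, 200))

# dask.array.einsum mirrors numpy.einsum but builds a lazy task graph, and it
# handles a summed dimension that spans multiple chunks (here 'j').
out = da.einsum("ij,ij->i", a, b)
print(out.compute().shape)  # (1000,)
```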
383724765 | https://github.com/pydata/xarray/issues/2074#issuecomment-383724765 | https://api.github.com/repos/pydata/xarray/issues/2074 | MDEyOklzc3VlQ29tbWVudDM4MzcyNDc2NQ== | crusaderky 6213168 | 2018-04-23T21:12:04Z | 2018-04-23T21:12:14Z | MEMBER |
See blob in the opening post
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.dot() dask problems 316618290 | |
383723159 | https://github.com/pydata/xarray/issues/2074#issuecomment-383723159 | https://api.github.com/repos/pydata/xarray/issues/2074 | MDEyOklzc3VlQ29tbWVudDM4MzcyMzE1OQ== | jakirkham 3019665 | 2018-04-23T21:06:42Z | 2018-04-23T21:06:42Z | NONE |
Basically, the question is whether the performance keeps up with that formulation. It sounds like chunking causes some problems right now, IIUC. However, things like
What are the arrays used as input for this case?
Having a little trouble following this. |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.dot() dask problems 316618290 | |
383711323 | https://github.com/pydata/xarray/issues/2074#issuecomment-383711323 | https://api.github.com/repos/pydata/xarray/issues/2074 | MDEyOklzc3VlQ29tbWVudDM4MzcxMTMyMw== | crusaderky 6213168 | 2018-04-23T20:26:59Z | 2018-04-23T20:26:59Z | MEMBER | @jakirkham from what I understand […]

Ok this is funny. I ran a few more benchmarks, and apparently:

```
def bench(...):
    ...
    if not dims:
        print("a * b (numpy backend):")
        %timeit a.compute() * b.compute()
        print("a * b (dask backend):")
        %timeit (a * b).compute()

bench(100, False, [], '...i,...i->...i')
bench( 20, False, [], '...i,...i->...i')
bench(100, True, [], '...i,...i->...i')
bench( 20, True, [], '...i,...i->...i')
```

Output:

```
bench(100, False, [], ...i,...i->...i)
xarray.dot(numpy backend):
291 ms ± 5.15 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
numpy.einsum:
296 ms ± 10 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
xarray.dot(dask backend):
dimension 's' on 0th function argument to apply_ufunc with dask='parallelized' consists of multiple chunks, but is also a core dimension. To fix, rechunk into a single dask array chunk along this dimension, i.e.,

bench(20, False, [], ...i,...i->...i)
xarray.dot(numpy backend):
345 ms ± 6.02 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
numpy.einsum:
342 ms ± 4.96 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
xarray.dot(dask backend):
dimension 's' on 0th function argument to apply_ufunc with dask='parallelized' consists of multiple chunks, but is also a core dimension. To fix, rechunk into a single dask array chunk along this dimension, i.e.,

bench(100, True, [], ...i,...i->...i)
xarray.dot(numpy backend):
477 ms ± 8.29 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
numpy.einsum:
514 ms ± 35.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
xarray.dot(dask backend):
241 ms ± 8.47 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
dask.array.einsum:
497 ms ± 21.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
a * b (numpy backend)
439 ms ± 27.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
a * b (dask backend)
517 ms ± 41.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

bench(20, True, [], ...i,...i->...i)
xarray.dot(numpy backend):
572 ms ± 13.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
numpy.einsum:
563 ms ± 10.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
xarray.dot(dask backend):
268 ms ± 14.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
dask.array.einsum:
563 ms ± 5.11 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
a * b (numpy backend)
501 ms ± 5.46 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
a * b (dask backend)
922 ms ± 93.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

This particular bit is shocking and I can't wrap my head around it?!?

```
bench(100, True, [], ...i,...i->...i)
xarray.dot(dask backend):
241 ms ± 8.47 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
a * b (dask backend)
517 ms ± 41.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

bench(20, True, [], ...i,...i->...i)
xarray.dot(dask backend):
268 ms ± 14.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
a * b (dask backend)
922 ms ± 93.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.dot() dask problems 316618290 | |
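The failures in the dask-backend runs above come from apply_ufunc's dask='parallelized' mode, which requires each core dimension to sit in a single chunk. Below is a minimal illustration of the rechunking workaround the error message points to; the DataArray, its dims, and its sizes are invented for the example:

```
import dask.array as da
import numpy as np
import xarray as xr

# Invented example: core dimension 's' split across several chunks.
arr = xr.DataArray(
    da.from_array(np.random.rand(10, 100_000), chunks=(10, 25_000)),
    dims=["t", "s"],
)

# dask='parallelized' refuses to apply a function over a chunked core
# dimension; rechunking 's' into one chunk (size -1) is the fix the message
# suggests, at the cost of holding that whole dimension in memory per block.
arr_one_chunk = arr.chunk({"s": -1})
print(arr_one_chunk.chunks)  # chunks along 's' collapse to a single block
```

Dispatching to dask.array.einsum, as discussed elsewhere in the thread, avoids this restriction altogether.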
383651390 | https://github.com/pydata/xarray/issues/2074#issuecomment-383651390 | https://api.github.com/repos/pydata/xarray/issues/2074 | MDEyOklzc3VlQ29tbWVudDM4MzY1MTM5MA== | mrocklin 306380 | 2018-04-23T17:12:04Z | 2018-04-23T17:12:04Z | MEMBER | |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.dot() dask problems 316618290 | |
383637379 | https://github.com/pydata/xarray/issues/2074#issuecomment-383637379 | https://api.github.com/repos/pydata/xarray/issues/2074 | MDEyOklzc3VlQ29tbWVudDM4MzYzNzM3OQ== | jakirkham 3019665 | 2018-04-23T16:26:51Z | 2018-04-23T16:26:51Z | NONE | Might be worth revisiting how cc @mrocklin |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.dot() dask problems 316618290 | |
383419435 | https://github.com/pydata/xarray/issues/2074#issuecomment-383419435 | https://api.github.com/repos/pydata/xarray/issues/2074 | MDEyOklzc3VlQ29tbWVudDM4MzQxOTQzNQ== | fujiisoup 6815844 | 2018-04-22T23:05:39Z | 2018-04-22T23:06:05Z | MEMBER |
Agreed.
I think the reimplementation would be easy:
https://github.com/pydata/xarray/blob/99b457ce5859bd949cfea4671db5150c7297843a/xarray/core/computation.py#L1039-L1043
|
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
xarray.dot() dask problems 316618290 |
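For illustration, a rough sketch of the kind of dispatch the linked lines in computation.py could move to, i.e. routing to dask.array.einsum instead of pushing numpy.einsum through dask='parallelized'; the helper name and the type check are hypothetical, not the actual patch:

```
import numpy as np

def einsum_dispatch(subscripts, *operands):
    # Hypothetical helper: use dask.array.einsum when any operand is a dask
    # array, and fall back to numpy.einsum otherwise.
    try:
        import dask.array as da
        has_dask = any(isinstance(op, da.Array) for op in operands)
    except ImportError:
        has_dask = False
    if has_dask:
        return da.einsum(subscripts, *operands)
    return np.einsum(subscripts, *operands)
```

In xarray itself such a dispatch would sit behind apply_ufunc with dask='allowed', so that chunked core dimensions no longer trigger the error seen in the benchmarks above.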
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);
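As a usage sketch for the schema above, a query equivalent to the filter that generated this page (issue = 316618290, sorted by updated_at descending); the database file name is an assumption:

```
import sqlite3

# Assumed file name for the SQLite database behind this page.
conn = sqlite3.connect("github.db")

# The idx_issue_comments_issue index turns this filter into an index lookup
# rather than a full table scan.
rows = conn.execute(
    """
    SELECT id, user, created_at, updated_at, body
    FROM issue_comments
    WHERE issue = ?
    ORDER BY updated_at DESC
    """,
    (316618290,),
).fetchall()

for comment_id, user_id, created, updated, body in rows:
    print(comment_id, user_id, updated)
```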