home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

2 rows where issue = 316618290 and user = 3019665 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 1

  • jakirkham · 2 ✖

issue 1

  • xarray.dot() dask problems · 2 ✖

author_association 1

  • NONE 2
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
383723159 https://github.com/pydata/xarray/issues/2074#issuecomment-383723159 https://api.github.com/repos/pydata/xarray/issues/2074 MDEyOklzc3VlQ29tbWVudDM4MzcyMzE1OQ== jakirkham 3019665 2018-04-23T21:06:42Z 2018-04-23T21:06:42Z NONE

from what I understand da.dot implements... a limited special case of da.einsum?

Basically dot is an inner product. Certainly inner products can be formulated using Einstein notation (i.e. calling with einsum).

The question is whether the performance keeps up with that formulation. Currently it sounds like chunking causes some problems right now IIUC. However things like dot and tensordot dispatch through optimized BLAS routines. In theory einsum should do the same ( https://github.com/numpy/numpy/pull/9425 ), but the experimental data still shows a few warts. For example, matmul is implemented with einsum, but is slower than dot. ( https://github.com/numpy/numpy/issues/7569 ) ( https://github.com/numpy/numpy/issues/8957 ) Pure einsum implementations seem to perform similarly.

I ran a few more benchmarks...

What are the arrays used as input for this case?

...apparently xarray.dot on a dask backend is situationally faster than all other implementations when you are not reducing on any dimensions...

Having a little trouble following this. dot reduces one dimension from each input. Excepting if one of the inputs is 0-D (i.e. a scalar), then it is just multiplying a single scalar through an array. Is that what you are referring?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.dot() dask problems 316618290
383637379 https://github.com/pydata/xarray/issues/2074#issuecomment-383637379 https://api.github.com/repos/pydata/xarray/issues/2074 MDEyOklzc3VlQ29tbWVudDM4MzYzNzM3OQ== jakirkham 3019665 2018-04-23T16:26:51Z 2018-04-23T16:26:51Z NONE

Might be worth revisiting how da.dot is implemented as well. That would be the least amount of rewriting for you and would generally be nice for Dask users. If you have not already, @crusaderky, it would be nice to raise an issue over at Dask with a straight Dask benchmark comparing Dask Array's dot and einsum.

cc @mrocklin

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  xarray.dot() dask problems 316618290

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 2642.583ms · About: xarray-datasette