
issue_comments


1 row where author_association = "NONE", issue = 520815068 and user = 12912489 sorted by updated_at descending

id: 552756428
html_url: https://github.com/pydata/xarray/issues/3509#issuecomment-552756428
issue_url: https://api.github.com/repos/pydata/xarray/issues/3509
node_id: MDEyOklzc3VlQ29tbWVudDU1Mjc1NjQyOA==
user: SimonHeybrock (12912489)
created_at: 2019-11-12T06:37:20Z
updated_at: 2022-09-09T13:08:45Z
author_association: NONE
issue: NEP 18, physical units, uncertainties, and the scipp library? (520815068)

body:

@jthielen Thanks for your reply! I am not familiar with pint or uncertainties, so I cannot go into much detail there; the following is just generally speaking:

Units

I do not see any advantage in using scipp here. The current unit system in scipp is based on boost::units, which is very powerful (supporting custom units, heterogeneous systems, ...), but it is unfortunately a compile-time library (EDIT 2022: this no longer applies, since we have long since switched to a runtime units library). I imagine we would need to wrap another library to become more flexible (we could even consider wrapping something like pint's unit implementation).

Uncertainties

There are two routes to take here:

1. Store a single array of value/variance pairs

  • Propagation of uncertainties is "fast by default".
  • Probably harder to vectorize (SIMD), since the data layout implies interleaved values (see the layout sketch below). In practice this is unlikely to matter: many workloads are limited by memory bandwidth and cache sizes, so in my experience vectorization is not crucial.
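A minimal sketch of this interleaved (array-of-structs) layout, using a NumPy structured dtype; the value/variance field names are just illustrative, not anyone's actual API:

import numpy as np

# Array-of-structs layout: each element stores its value and its
# variance side by side in memory.
vv = np.dtype([("value", np.float64), ("variance", np.float64)])

a = np.zeros(4, dtype=vv)
a["value"] = [1.0, 2.0, 3.0, 4.0]
a["variance"] = [0.1, 0.1, 0.2, 0.2]

# A single pass over `a` touches each value and its variance together
# ("fast by default"), but a["value"] is a strided view (stride of two
# doubles), which is what hinders SIMD vectorization.
print(a["value"], a["variance"])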

2. Store two arrays (values array and uncertainties array)

  • This is what scipp does.
  • Special care must be taken when implementing propagation of uncertainties: a naive implementation that operates on whole arrays leads to a massive performance loss (I have seen 10x or more) for operations like multiplication (there is no penalty for addition and subtraction).
  • In practice this is not hard to avoid: we simply compute the result's values and variances in a single loop instead of in two steps (a minimal sketch follows this list). This avoids allocating temporaries and loading/storing from memory multiple times.
  • Scipp does this, and does not sacrifice any performance.
  • Saves 2x in performance when operating only with values, even if variances are present.
  • Variances can be added/removed independently, e.g., when no longer needed, avoiding copies.
  • Existing numpy code can be used to operate directly on the values and variances arrays (this could probably also be done in case 1., with a stride, losing some efficiency).
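To make the single-loop point concrete, here is a minimal sketch (not scipp's actual code) of multiplication with uncertainty propagation, assuming independent operands; the naive two-step NumPy version is contrasted with a fused loop, using Numba here purely for illustration:

import numpy as np
import numba

def mul_naive(av, avar, bv, bvar):
    # Two-step propagation: each line allocates a temporary array and
    # streams the inputs through memory again.
    val = av * bv
    var = avar * bv**2 + bvar * av**2  # assumes independent operands
    return val, var

@numba.njit
def mul_fused(av, avar, bv, bvar, out_val, out_var):
    # Single fused loop: each input element is loaded once, no
    # temporaries are allocated, and value/variance are written together.
    for i in range(av.size):
        x, y = av[i], bv[i]
        out_val[i] = x * y
        out_var[i] = avar[i] * y * y + bvar[i] * x * x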

Other aspects

Scipp supports a generic transform-type operation that can apply an arbitrary lambda to variables (units + values array + variances array).

  • This is done at compile time and is therefore static. It does, however, allow very quick addition of new compound operations that propagate units and uncertainties.
  • For example, we could generate an operation sqrt(a*a + b*b) that:
      • is automatically written as a single loop => fast
      • gives the correct output units
      • propagates uncertainties
      • does all the broadcasting and transposing
  • Not using expression templates, in case anyone asks.
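Written out by hand in NumPy, the sqrt(a*a + b*b) example amounts to the following fused computation (a sketch rather than scipp's generated code; first-order propagation for independent a and b, units handling omitted):

import numpy as np

def hypot_with_variances(av, avar, bv, bvar):
    # f = sqrt(a^2 + b^2); first-order error propagation for
    # independent a, b:
    #   var(f) = (df/da)^2 var(a) + (df/db)^2 var(b)
    #          = (a^2 var(a) + b^2 var(b)) / f^2
    f2 = av * av + bv * bv
    return np.sqrt(f2), (av * av * avar + bv * bv * bvar) / f2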

Other

  • scipp.Variable includes the dimension labels, and operations can do broadcasting and transposition while maintaining good performance. I am not sure whether this is an advantage or a drawback in this case? I would need to look more into the inner workings of xarray and the __array_function__ protocol (a toy sketch of the protocol follows after this list).

  • Scipp is written in C++ with performance in mind. That said, it is not terribly difficult to achieve good performance in these cases, since many workloads are bound by memory bandwidth (and probably dozens of other libraries have done so).
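For reference, a toy sketch of the NEP 18 hook itself: a scipp-like class opts in by defining __array_function__ and dispatching NumPy functions to its own handlers. The class and helper names here are made up for illustration and are not scipp's API:

import numpy as np

HANDLED = {}

def implements(np_func):
    # Register a handler for a NumPy function in the dispatch table.
    def decorator(func):
        HANDLED[np_func] = func
        return func
    return decorator

class VariableLike:
    # Toy stand-in for a scipp-style array with values + variances.
    def __init__(self, values, variances=None):
        self.values = np.asarray(values)
        self.variances = None if variances is None else np.asarray(variances)

    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED:
            return NotImplemented
        return HANDLED[func](*args, **kwargs)

@implements(np.mean)
def _mean(x):
    # var(mean of n independent samples) = sum(var) / n^2
    n = x.values.size
    var = None if x.variances is None else x.variances.sum() / n**2
    return VariableLike(x.values.mean(), var)

v = VariableLike([1.0, 2.0, 3.0], variances=[0.1, 0.1, 0.1])
m = np.mean(v)  # NumPy dispatches to _mean via __array_function__
print(m.values, m.variances)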

Questions

  • What is pint's approach to uncertainties?
  • Have you looked at the performance? Is performance relevant for you in these cases?

Table schema
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);