issue_comments

5 rows where issue = 437765416 and user = 7441788 sorted by updated_at descending

Columns: id, html_url, issue_url, node_id, user, created_at, updated_at (sorted descending), author_association, body, reactions, performed_via_github_app, issue
601885539 https://github.com/pydata/xarray/pull/2922#issuecomment-601885539 https://api.github.com/repos/pydata/xarray/issues/2922 MDEyOklzc3VlQ29tbWVudDYwMTg4NTUzOQ== seth-p 7441788 2020-03-20T19:57:54Z 2020-03-20T20:00:20Z CONTRIBUTOR

All good points:

> What could be done, though, is to only do da = da.fillna(0.0) if da contains NaNs.

Good idea, though I don't know what the performance hit of the extra check would be (in the case where da does contain NaNs, the check is for naught).
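
For illustration, a minimal sketch of the conditional fill being discussed (the example array is invented; note that the check itself is a full pass over da):

import numpy as np
import xarray as xr

da = xr.DataArray([1.0, np.nan, 3.0], dims="x")  # toy array containing a NaN

# Only pay for fillna when da actually contains NaNs; isnull().any()
# itself scans all of da, which is the performance question above.
if da.isnull().any():
    da = da.fillna(0.0)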

> I assume so. I don't know what kind of temporary variables np.einsum creates. Also np.einsum is wrapped in xr.apply_ufunc, so all kinds of magic is going on.

Well, (da * weights) will be at least as large as da. I'm not certain, but I don't think np.einsum creates huge temporary arrays.
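
A small NumPy-only illustration of the temporary under discussion (shapes invented):

import numpy as np

da = np.random.rand(1000, 1000)
weights = np.random.rand(1000)

# The elementwise product materializes a full (1000, 1000) temporary
# before the reduction...
s1 = (da * weights).sum(axis=1)

# ...whereas einsum performs the contraction in a single pass, without
# building the intermediate product array.
s2 = np.einsum("ij,j->i", da, weights)

assert np.allclose(s1, s2)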

> Do you want to leave it out for performance reasons? Because it was a deliberate decision to not support NaNs in the weights, and I don't think this is going to change.

Yes. You can continue not supporting NaNs in the weights, yet skip the explicit check that there are none (optionally, when the caller assures you there are no NaNs).

> None of your suggested functions support NaNs, so they won't work.

Correct. These have nothing to do with the NaNs issue.

For profiling memory usage, I use psutil.Process(os.getpid()).memory_info().rss for current usage and resource.getrusage(resource.RUSAGE_SELF).ru_maxrss for peak usage (on Linux).
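
As a concrete sketch of those two measurements (note that the units of ru_maxrss are platform-dependent: kilobytes on Linux, bytes on macOS):

import os
import resource

import psutil

# Current resident set size of this process, in bytes.
current_rss = psutil.Process(os.getpid()).memory_info().rss

# Peak resident set size; on Linux ru_maxrss is reported in kilobytes.
peak_rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

print(f"current: {current_rss / 2**20:.1f} MiB, peak: {peak_rss_kb / 2**10:.1f} MiB")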

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/weighted 437765416
601709733 https://github.com/pydata/xarray/pull/2922#issuecomment-601709733 https://api.github.com/repos/pydata/xarray/issues/2922 MDEyOklzc3VlQ29tbWVudDYwMTcwOTczMw== seth-p 7441788 2020-03-20T13:47:39Z 2020-03-20T16:31:14Z CONTRIBUTOR

@mathause, have you considered using these functions?

  • np.average() to calculate weighted mean().
  • np.cov() to calculate weighted cov(), var(), and std().
  • sp.stats.cumfreq() to calculate weighted median() (I haven't thought this through).
  • sp.spatial.distance.correlation() to calculate weighted corrcoef(). (Of course one could also calculate this from the weighted cov() above, but the two arrays first need to be masked simultaneously.)
  • sklearn.utils.extmath.weighted_mode() to calculate weighted mode().
  • gmisclib.weighted_percentile.{wp,wtd_median}() to calculate weighted quantile() and median().
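
For concreteness, a minimal sketch of the first two suggestions (data and weights are invented):

import numpy as np

values = np.array([1.0, 2.0, 3.0])
weights = np.array([0.2, 0.3, 0.5])

# np.average computes a weighted mean directly.
wmean = np.average(values, weights=weights)   # 0.2*1 + 0.3*2 + 0.5*3 = 2.3

# np.cov accepts observation weights via aweights (and counts via fweights);
# the weighted variances sit on the diagonal of the result.
data = np.random.rand(2, 100)                 # two variables, 100 observations
wcov = np.cov(data, aweights=np.random.rand(100))
wvar = np.diag(wcov)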

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/weighted 437765416
601708110 https://github.com/pydata/xarray/pull/2922#issuecomment-601708110 https://api.github.com/repos/pydata/xarray/issues/2922 MDEyOklzc3VlQ29tbWVudDYwMTcwODExMA== seth-p 7441788 2020-03-20T13:44:03Z 2020-03-20T13:52:06Z CONTRIBUTOR

@mathause, ideally dot() would support skipna, so you could eliminate the da = da.fillna(0.0) and pass the skipna down the line. But alas it doesn't...

(da * weights).sum(dim=dim, skipna=skipna) would likely make things worse, I think, as it would necessarily create a temporary array at least as large as da, no?

Either way, this only addresses the da = da.fillna(0.0), not the mask = da.notnull().

Also, perhaps the test if weights.isnull().any() in Weighted.__init__() should be optional?
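
A hypothetical sketch of that opt-out; skip_nan_check is an invented name, not part of xarray's API, and the class body is heavily abridged:

class Weighted:
    def __init__(self, obj, weights, skip_nan_check=False):
        # Callers who can guarantee NaN-free weights could skip this scan,
        # which costs a full pass over a potentially huge weights array.
        if not skip_nan_check and weights.isnull().any():
            raise ValueError("`weights` cannot contain missing values")
        self.obj = obj
        self.weights = weights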

Maybe I'm more sensitive to this than others, but I regularly deal with 10-100GB arrays.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/weighted 437765416
601699091 https://github.com/pydata/xarray/pull/2922#issuecomment-601699091 https://api.github.com/repos/pydata/xarray/issues/2922 MDEyOklzc3VlQ29tbWVudDYwMTY5OTA5MQ== seth-p 7441788 2020-03-20T13:25:21Z 2020-03-20T13:25:21Z CONTRIBUTOR

@max-sixty, I wish I could, but I'm afraid that I cannot submit code due to employer limitations.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/weighted 437765416
601496897 https://github.com/pydata/xarray/pull/2922#issuecomment-601496897 https://api.github.com/repos/pydata/xarray/issues/2922 MDEyOklzc3VlQ29tbWVudDYwMTQ5Njg5Nw== seth-p 7441788 2020-03-20T02:11:53Z 2020-03-20T02:12:24Z CONTRIBUTOR

I realize this is a bit late, but I'm still concerned about memory usage, specifically in https://github.com/pydata/xarray/blob/master/xarray/core/weighted.py#L130 and https://github.com/pydata/xarray/blob/master/xarray/core/weighted.py#L143. If da.sizes = {'dim_0': 100000, 'dim_1': 100000}, the two lines above will cause da.weighted(weights).mean('dim_0') to create two simultaneous temporary 100000x100000 arrays, which could be problematic.

I would have implemented this using apply_ufunc, so that these temporary variables are created only on as small an array as absolutely necessary; in this case just of size sizes['dim_0'] = 100000. (Much as I would like to, I'm afraid I'm not able to contribute code.) Of course this won't help in the case where one is summing over all dimensions, but one might as well minimize memory usage in some cases even if not in all.
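
A minimal sketch of that apply_ufunc approach, with invented, much smaller shapes; np.einsum contracts over the reduced dimension without materializing the full da * weights product:

import numpy as np
import xarray as xr

def _weighted_sum(values, w):
    # apply_ufunc moves the core dim to the last axis; einsum then
    # contracts over it without building the elementwise product.
    return np.einsum("...i,i->...", values, w)

da = xr.DataArray(np.random.rand(200, 300), dims=("dim_0", "dim_1"))
weights = xr.DataArray(np.random.rand(200), dims=("dim_0",))

weighted_sum = xr.apply_ufunc(
    _weighted_sum, da, weights,
    input_core_dims=[["dim_0"], ["dim_0"]],
)
weighted_mean = weighted_sum / weights.sum()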

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Feature/weighted 437765416

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);