github: issue_comments: 3 rows where issue = 479942077 and user = 1634164 sorted by updated

3 rows where issue = 479942077 and user = 1634164 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at	author_association	body	reactions	issue
1534695467	https://github.com/pydata/xarray/issues/3213#issuecomment-1534695467	https://api.github.com/repos/pydata/xarray/issues/3213	IC_kwDOAMm_X85beZgr	khaeru 1634164	2023-05-04T12:31:22Z	2023-05-04T12:31:22Z	NONE	That's a totally valid scope limitation for the sparse package, and I understand the motivation. I'm just saying that the principle of least astonishment is not being followed: the user cannot at the moment read either the xarray or sparse docs and know which portions of the xarray API will work when giving `…, sparse=True`, and which instead require a deliberate choice to densify, or see examples of how best to mix the two. It would be helpful to clarify—that's all.	{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	How should xarray use/support sparse arrays? 479942077
1534231523	https://github.com/pydata/xarray/issues/3213#issuecomment-1534231523	https://api.github.com/repos/pydata/xarray/issues/3213	IC_kwDOAMm_X85bcoPj	khaeru 1634164	2023-05-04T07:40:26Z	2023-05-04T07:40:26Z	NONE	@jbbutler please also see this comment et seq. https://github.com/pydata/sparse/issues/1#issuecomment-792342987 and related pydata/sparse#438. To add to @rabernat's point about sparse support being "not well documented", I suspect (but don't know, as I'm just a user of xarray, not a developer) that it's also not thoroughly tested. I expected to be able to use e.g. `DataArray.cumprod` when the underlying data was sparse, but could not. IMHO, I/O to/from sparse-backed objects is less valuable if only a small subset of xarray functionality is available on those objects. Perhaps explicitly testing/confirming which parts of the API do/do not currently work with sparse would support the improvements to the docs that Ryan mentioned, and reveal the work remaining to provide full(er) support.	{ "total_count": 2, "+1": 2, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	How should xarray use/support sparse arrays? 479942077
520741706	https://github.com/pydata/xarray/issues/3213#issuecomment-520741706	https://api.github.com/repos/pydata/xarray/issues/3213	MDEyOklzc3VlQ29tbWVudDUyMDc0MTcwNg==	khaeru 1634164	2019-08-13T08:31:30Z	2019-08-13T08:31:30Z	NONE	This is very exciting! In energy-economic research (unlike, e.g., earth systems research), data are almost always sparse, so first-class sparse support will be broadly useful. I'm leaving a comment here (since this seems to be a meta-issue; please link from wherever else, if needed) with two example use-cases. For the moment, #3206 seems to cover them, so I can't name any specific additional features. 1. MESSAGEix is an energy systems optimization model framework, formulated as a linear program. Some variables have many dimensions; for instance, the input coefficient for a technology has the dimensions `(node_loc, technology, year_vintage, year_active, mode, node_origin, commodity, level, time, time_origin)`. In the global version of our model, the `technology` dimension has over 400 labels. Often two or more dimensions are tied, e.g., `technology='coal power plant'` will only take input from `(commodity='coal', level='primary energy')`; all other combinations of `(commodity, level)` are empty for this `technology`. So, these data are inherently sparse. For modeling research, specifying quantities in this way is a good design because (a) it is intuitive to researchers in this domain, and (b) the optimization model is solved using various LP solvers via GAMS, which automatically prune zero rows in the resulting matrices. When we were developing a dask/DAG-based system for model results post-processing, we wanted to use xarray, but had some quantities with tens of millions of elements that were less than 1% full. Here is some test code that triggered MemoryErrors using xarray. We chose to fall back on using a pd.Series subclass that mocks xarray methods. 2. In transportation research, stock models of vehicle fleets are often used. These models always have at least two time dimensions: `cohort` (the time period in which a vehicle was sold) and `period`(s) in which it is used (and thus consumes fuel, etc.). Since a vehicle sold in 2020 can't be used in 2015, these data are always triangular w.r.t. these two dimensions. (The dimensions `year_vintage` and `year_active` in example #1 above have the same relationship.) Once multiplied by other dimensions (technology; fuel; size or shape or market segment; embodied materials; different variables; model runs across various scenarios or input assumptions) the overhead of dense arrays can become problematic.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	How should xarray use/support sparse arrays? 479942077
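
The triangular `cohort`/`period` relationship described in the 2019 comment can be illustrated with a minimal sketch that uses only the Python standard library. It does not use xarray or sparse, and it is not taken from the comment: the year range, the function names `triangular_fill` and `to_coordinate_dict`, and the placeholder value are all assumptions made for illustration.

```python
from itertools import product


def triangular_fill(years):
    """Return (valid cells, total cells) for a cohort x period grid.

    A cell is valid only when period >= cohort: a vehicle can be used
    only in or after the period in which it was sold.
    """
    total = len(years) ** 2
    valid = sum(1 for cohort, period in product(years, years) if period >= cohort)
    return valid, total


def to_coordinate_dict(years, value=1.0):
    """Store only the valid cells, keyed by (cohort, period) coordinates."""
    return {
        (cohort, period): value  # placeholder value, hypothetical
        for cohort, period in product(years, years)
        if period >= cohort
    }


# Hypothetical 5-year periods from 2015 to 2050 (not from the source).
years = list(range(2015, 2051, 5))
valid, total = triangular_fill(years)
stored = to_coordinate_dict(years)
print(f"{valid} of {total} cells can hold data ({valid / total:.0%})")
```

A dense array allocates every cell of the grid, while the coordinate dictionary holds only the cells that can contain data. With just these two dimensions the saving is modest; the comment's concern is the overhead once this grid is multiplied by further dimensions.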

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
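
As a rough sketch of how the view at the top of this page could be reproduced from the schema above, the following loads the schema into an in-memory SQLite database with Python's standard `sqlite3` module and runs the same filter and sort. The ids and timestamps of the three comment rows come from the table above; the fourth row (id `1`, user `999`) is a hypothetical comment by another user, added only to show that the filter excludes it. The exact SQL that the page itself runs is not shown in the source, so the `SELECT` here is an assumption.

```python
import sqlite3

SCHEMA = """
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# (id, user, updated_at, issue); the last row is hypothetical.
rows = [
    (1534695467, 1634164, "2023-05-04T12:31:22Z", 479942077),
    (1534231523, 1634164, "2023-05-04T07:40:26Z", 479942077),
    (520741706, 1634164, "2019-08-13T08:31:30Z", 479942077),
    (1, 999, "2020-01-01T00:00:00Z", 479942077),
]
conn.executemany(
    "INSERT INTO issue_comments ([id], [user], [updated_at], [issue]) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# ISO 8601 timestamps in the same format and timezone sort correctly as text.
result = conn.execute(
    "SELECT [id], [updated_at] FROM issue_comments "
    "WHERE [issue] = ? AND [user] = ? "
    "ORDER BY [updated_at] DESC",
    (479942077, 1634164),
).fetchall()
ids = [row[0] for row in result]
print(ids)
```

Both filter columns are covered by the indexes declared in the schema, so a query of this shape should not need a full table scan on a larger database.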