
issue_comments


10 rows where issue = 1575938277 (Dataset.where performances regression), sorted by updated_at descending

Thomas-Z (CONTRIBUTOR) · 2023-05-03T07:58:22Z · https://github.com/pydata/xarray/issues/7516#issuecomment-1532601237

Hello,

I'm not sure the performance problems were fully addressed (we're now forced to fully compute/load the selection expression), but the changes made in the last versions make this issue irrelevant, and I think we can close it.

Thank you!

Thomas-Z (CONTRIBUTOR) · 2023-03-02T11:59:47Z · https://github.com/pydata/xarray/issues/7516#issuecomment-1451754167

The .variable computation is fast, but it cannot be used directly as you suggest:

```
dsx.where(sel.variable, drop=True)

TypeError: cond argument is <xarray.Variable (num_lines: 5761870, num_pixels: 71)> ... but must be a <class 'xarray.core.dataset.Dataset'> or <class 'xarray.core.dataarray.DataArray'>
```

Doing it like this seems to work correctly (and is fast enough):

```
dsx["x"] = sel.variable.compute()
dsx.where(dsx["x"], drop=True)
```

The "_nadir" variables have the same chunks and are much faster to read than the other ones (they are a lot smaller).
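A minimal, self-contained sketch of this workaround, using a small synthetic dataset in place of the reporter's files (the `ssh`/`points` names are illustrative, not from the thread):

```python
import numpy as np
import xarray as xr

# Small stand-in for the reporter's dataset; chunking makes it
# dask-backed, as in the issue.
dsx = xr.Dataset(
    {
        "longitude": ("points", np.linspace(-180, 180, 10)),
        "ssh": ("points", np.arange(10.0)),
    }
).chunk({"points": 5})

sel = (dsx["longitude"] > 0) & (dsx["longitude"] < 100)

# Compute the boolean mask eagerly once, assign it as a plain
# variable, and pass the now numpy-backed mask to .where():
dsx["x"] = sel.variable.compute()
result = dsx.where(dsx["x"], drop=True)
```

With the ten longitudes above, only 20 and 60 fall in (0, 100), so `result` keeps two points; the key property is that the mask's dask graph runs exactly once.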

dcherian (MEMBER) · 2023-03-01T19:10:15Z · https://github.com/pydata/xarray/issues/7516#issuecomment-1450712889

Yeah, that was another change I guess. We could extract the variable using .variable:

```
.where(sel2.variable.compute(), drop=True)
```

Do your "_nadir" variables have smaller chunk sizes, or are they slower to read for some reason?

Thomas-Z (CONTRIBUTOR) · 2023-03-01T09:43:27Z · https://github.com/pydata/xarray/issues/7516#issuecomment-1449714522

```
sel = (dsx["longitude"] > 0) & (dsx["longitude"] < 100)
sel.compute()
```

This compute finishes, but it takes more than 80 seconds on both versions, with huge memory consumption (it loads the 4 coordinates and the result itself).

I know xarray has to keep more information regarding coordinates and dimensions, but doing the same thing with just dask arrays:

```
sel2 = (dsx["longitude"].data > 0) & (dsx["longitude"].data < 100)
sel2.compute()
```

takes less than 6 seconds.
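The two timings contrast a DataArray-level comparison, which propagates dims, coordinates, and alignment through every operation, with the same arithmetic on the raw dask arrays. A sketch of the two forms on a synthetic dataset (names illustrative):

```python
import numpy as np
import xarray as xr

dsx = xr.Dataset(
    {"longitude": ("points", np.linspace(-180, 180, 1000))}
).chunk({"points": 100})

# DataArray comparison: each step carries xarray metadata along.
sel = (dsx["longitude"] > 0) & (dsx["longitude"] < 100)

# Raw-array comparison: the same arithmetic on the underlying
# dask arrays, with no xarray bookkeeping.
sel2 = (dsx["longitude"].data > 0) & (dsx["longitude"].data < 100)

# Both describe the same boolean mask once computed.
same = bool((sel.values == sel2.compute()).all())
```

Both expressions build equivalent dask graphs for the mask itself; the overhead Thomas-Z measured comes from the coordinate handling layered on top, not from the mask arithmetic.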

dcherian (MEMBER) · 2023-02-28T23:30:59Z · https://github.com/pydata/xarray/issues/7516#issuecomment-1449085012

Does sel.compute() not finish?

Thomas-Z (CONTRIBUTOR) · 2023-02-28T08:54:16Z (edited 2023-02-28T11:24:11Z) · https://github.com/pydata/xarray/issues/7516#issuecomment-1447798846

Just tried it, and it does not seem identical at all to what was happening earlier.

This is the kind of dataset I'm working with.

With this selection:

```
sel = (dsx["longitude"] > 0) & (dsx["longitude"] < 100)
```

the old xarray takes a little less than 1 minute and less than 6 GB of memory. The new xarray with compute did not finish and had to be stopped before it consumed my 16 GB of memory.

dcherian (MEMBER) · 2023-02-28T04:41:03Z · https://github.com/pydata/xarray/issues/7516#issuecomment-1447565936

The old code had:

```
nonzeros = zip(clipcond.dims, np.nonzero(clipcond.values))
```

This loaded the array once and then passed numpy values to the indexing code.

Now the dask array is passed to the indexing code and is computed many times. #5873 raises an error saying boolean indexing with dask arrays is not allowed.

For this case, just do `ds.where(sel.compute(), drop=True)`. It's identical to what was happening earlier.

I think we should close this.
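The recommendation can be sketched end to end: computing the condition first hands a numpy-backed mask to the indexing machinery, mirroring what the pre-regression `np.nonzero(clipcond.values)` path did internally (synthetic dataset, names illustrative):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "longitude": ("points", np.linspace(-180, 180, 20)),
        "data": ("points", np.arange(20.0)),
    }
).chunk({"points": 5})

sel = (ds["longitude"] > 0) & (ds["longitude"] < 100)

# Compute the condition once; .where() then indexes with a concrete
# numpy boolean array instead of re-evaluating a dask graph.
subset = ds.where(sel.compute(), drop=True)
```

With 20 evenly spaced longitudes, five of them (roughly 9.5 through 85.3) fall strictly inside (0, 100), so `subset` keeps five points.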

headtr1ck (COLLABORATOR) · 2023-02-27T20:27:52Z · https://github.com/pydata/xarray/issues/7516#issuecomment-1447037080

I am a bit puzzled here... The dask graph looks identical, so it must be the way the indexers are constructed.

The major difference I can find is that the old version used np.unique, while the new version uses xarray's cond.any(...).

Maybe someone with more experience in dask can help out?

headtr1ck (COLLABORATOR) · 2023-02-26T21:16:35Z · https://github.com/pydata/xarray/issues/7516#issuecomment-1445469752

Git bisect pinpoints this to https://github.com/pydata/xarray/pull/6690, which, funnily enough, is my own PR haha. I will look into it when I find time :)

headtr1ck (COLLABORATOR) · 2023-02-26T21:07:56Z · https://github.com/pydata/xarray/issues/7516#issuecomment-1445467918

Can confirm: on my machine it went from 520 ms to 5 s.



CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
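For reference, the listing above is just a query over this table. A minimal sketch with Python's stdlib sqlite3 module, using the schema as shown (the REFERENCES clauses are dropped so the snippet is self-contained, and the inserted row is a cut-down copy of the first comment above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
""")

conn.execute(
    "INSERT INTO issue_comments (id, user, updated_at, author_association, issue)"
    " VALUES (?, ?, ?, ?, ?)",
    (1532601237, 1492047, "2023-05-03T07:58:22Z", "CONTRIBUTOR", 1575938277),
)

# The page above corresponds to this query (filter on the issue id,
# newest comment first); idx_issue_comments_issue covers the WHERE clause.
rows = conn.execute(
    "SELECT id, author_association FROM issue_comments"
    " WHERE issue = ? ORDER BY updated_at DESC",
    (1575938277,),
).fetchall()
```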
Powered by Datasette · Queries took 798.735ms · About: xarray-datasette