issue_comments

2 rows where issue = 651945063 sorted by updated_at descending

id html_url issue_url node_id user created_at updated_at author_association body reactions performed_via_github_app issue
1181905249 https://github.com/pydata/xarray/issues/4205#issuecomment-1181905249 https://api.github.com/repos/pydata/xarray/issues/4205 IC_kwDOAMm_X85GcnFh dcherian 2448579 2022-07-12T15:25:05Z 2022-07-12T15:25:05Z MEMBER

It makes sense to me that chunking along a dimension dim should not chunk variables that don't have that dimension.

@shoyer what do you think?
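
A minimal sketch of the behaviour under discussion (added here for illustration, not part of the comment), assuming the xarray version reported later in the thread (2022.3.0) with dask installed; the coordinate name cx is borrowed from the reproducer below:

```python
import dask.array
import numpy as np
import xarray as xr

# A 2D variable with a non-dimension coordinate 'cx' that only depends on 'x'.
da = xr.DataArray(
    np.ones((12, 10)),
    dims=("x", "y"),
    coords={"x": np.arange(12), "cx": ("x", np.zeros(12))},
)

# Chunk only along 'y'; 'cx' has no 'y' dimension at all.
chunked = da.chunk({"y": 1})

# In the affected versions, 'cx' is nevertheless wrapped in a dask array.
print(isinstance(chunked.cx.data, dask.array.Array))
```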

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Chunking causes unrelated non-dimension coordinate to become a dask array 651945063
1176645772 https://github.com/pydata/xarray/issues/4205#issuecomment-1176645772 https://api.github.com/repos/pydata/xarray/issues/4205 IC_kwDOAMm_X85GIjCM aulemahal 20629530 2022-07-06T20:00:09Z 2022-07-06T20:00:32Z CONTRIBUTOR

I have the same problem in xarray 2022.3.0. The issue is that this creates unnecessary dask tasks in the graph, and some operations acting on the coordinates unexpectedly trigger dask computations. "Unexpected" because the coordinates were not chunked at the beginning of the process, so computation that was expected to happen in the main thread (or not happen at all) now happens in the dask workers.

An example:

```python3
import numpy as np
import xarray as xr
from dask.diagnostics import ProgressBar

# A 2D variable
da = xr.DataArray(
    np.ones((12, 10)),
    dims=('x', 'y'),
    coords={'x': np.arange(12), 'y': np.arange(10)}
)

# A 1D variable sharing a dim with da
db = xr.DataArray(
    np.ones((12,)),
    dims=('x'),
    coords={'x': np.arange(12)}
)

# A non-dimension coordinate
cx = xr.DataArray(np.zeros((12,)), dims=('x',), coords={'x': np.arange(12)})

# Assign it to da and db
da = da.assign_coords(cx=cx)
db = db.assign_coords(cx=cx)

# We need to chunk along y
da = da.chunk({'y': 1})

# Notice how cx is now a dask array, even though it is a 1D coordinate
# and does not have 'y' as a dimension.
print(da)

# This triggers a dask computation
with ProgressBar():
    da - db
```

The reason my example triggers dask is that xarray ensures the coordinates are aligned and equal (I think?). Anyway, I didn't expect it.

Personally, I think the chunk method shouldn't apply to the coordinates at all, no matter their dimensions. They're coordinates, so we expect to be able to read them easily when aligning/comparing datasets. Dask is to be used with the "real" data only. Does this vision match that of the devs? I feel this "skip" could be easily implemented.
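
A possible workaround, sketched here for illustration rather than taken from the thread: after chunking, compute the affected coordinate back into memory so it is numpy-backed again (names follow the example above):

```python
# Continuing from the example above, where da.chunk({'y': 1}) has made 'cx'
# a dask-backed coordinate: load it back eagerly.
da = da.assign_coords(cx=da.cx.compute())

# 'cx' is eager again, so aligning or comparing against it no longer
# triggers a dask computation.
print(type(da.cx.data))  # numpy.ndarray
```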

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Chunking causes unrelated non-dimension coordinate to become a dask array 651945063

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);