issue_comments: 476678007

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/2852#issuecomment-476678007	https://api.github.com/repos/pydata/xarray/issues/2852	476678007	MDEyOklzc3VlQ29tbWVudDQ3NjY3ODAwNw==	1197350	2019-03-26T14:41:59Z	2019-03-26T14:41:59Z	MEMBER	`label (y, x) uint16 dask.array<shape=(10980, 10980), chunksize=(200, 10980)>` ... `geoms_ds.groupby('label')` It is very hard to make this sort of groupby lazy, because you are grouping over the variable `label` itself. Groupby uses a split-apply-combine paradigm to transform the data. The apply and combine steps can be lazy, but the split step cannot. Xarray uses the group variable to determine how to index the array, i.e. which items belong in which group. To do this, it needs to read the whole variable into memory. In this specific example, it sounds like what you want is to compute the histogram of labels. That could be accomplished without groupby. For example, you could use `apply_ufunc` together with `dask.array.histogram`. So my recommendation is to think of a way to accomplish what you want that does not involve groupby.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		425320466
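
The histogram suggestion in the comment above can be sketched roughly as follows. This is a minimal illustration, not the commenter's code: it calls `dask.array.histogram` directly on a dask array instead of going through `apply_ufunc`, and the small random array, the chunk sizes, and the `n_labels` bound are all assumptions standing in for the real `label` variable.

```python
import numpy as np
import dask.array as da

# Hypothetical stand-in for the `label` variable: a small uint16 array
# chunked along the first axis, similar in spirit to chunksize=(200, 10980).
rng = np.random.default_rng(0)
labels_np = rng.integers(0, 5, size=(40, 60)).astype("uint16")
labels = da.from_array(labels_np, chunks=(10, 60))

# Assumption: the number of distinct labels is known ahead of time.
n_labels = 5

# One bin per integer label: edges at -0.5, 0.5, ..., n_labels - 0.5.
bins = np.arange(n_labels + 1) - 0.5

# Building the histogram is lazy; each chunk is binned separately and the
# per-chunk counts are summed, so the full array is never loaded at once.
counts, edges = da.histogram(labels, bins=bins)

# Only this call triggers computation.
result = counts.compute()
print(result)
```

With a dataset like the one in the issue, the dask array would presumably come from something like `geoms_ds['label'].data`. That is also an assumption, since the full dataset definition is not shown here.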