issues: 1554036799
field | value
---|---
id | 1554036799
node_id | PR_kwDOAMm_X85IYHUz
number | 7472
title | Avoid in-memory broadcasting when converting to_dask_dataframe
user | 14371165
state | closed
locked | 0
assignee |
milestone |
comments | 1
created_at | 2023-01-24T00:15:01Z
updated_at | 2023-01-26T17:00:24Z
closed_at | 2023-01-26T17:00:23Z
author_association | MEMBER
active_lock_reason |
draft | 0
pull_request | pydata/xarray/pulls/7472
performed_via_github_app |
state_reason |
repo | 13221727
type | pull

body:

Turns out that there's a call to
Debugging script to reproduce the memory blow-up:
```python
import dask  # only needed if the config experiment below is uncommented
import dask.array as da
import numpy as np
import xarray as xr

chunks = 5000
# Running with these sizes exhausts memory so badly that the machine
# has to be restarted:
# dim1_sz = 100_000
# dim2_sz = 100_000
# These sizes do not crash, but RAM usage still grows by more than 5 GB:
dim1_sz = 40_000
dim2_sz = 40_000

x = da.random.random((dim1_sz, dim2_sz), chunks=chunks)
ds = xr.Dataset(
    {
        "x": xr.DataArray(
            data=x,
            dims=["dim1", "dim2"],
            coords={"dim1": np.arange(0, dim1_sz), "dim2": np.arange(0, dim2_sz)},
        )
    }
)
# with dask.config.set(**{"array.slicing.split_large_chunks": True}):
df = ds.to_dask_dataframe()
print(df)
```
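For context, here is a minimal sketch (an illustration under assumed sizes, not necessarily the code path this PR changed) of why materializing a broadcast coordinate eagerly is so expensive, and how the same broadcast can stay lazy at the dask level:

```python
import dask.array as da
import numpy as np

dim1_sz, dim2_sz = 40_000, 40_000
chunks = 5000

# Eager route: np.broadcast_to itself returns a cheap read-only view,
# but flattening it into a tabular column forces a dense copy of shape
# (dim1_sz, dim2_sz): 40_000 * 40_000 * 8 bytes ~= 12.8 GB per coordinate.
# dim1_eager = np.broadcast_to(np.arange(dim1_sz)[:, None], (dim1_sz, dim2_sz))

# Lazy route: broadcast at the dask level instead, so the expansion
# stays chunked and each (5000, 5000) block is realized only on compute.
dim1_lazy = da.broadcast_to(
    da.from_array(np.arange(dim1_sz), chunks=chunks)[:, None],
    (dim1_sz, dim2_sz),
    chunks=(chunks, chunks),
)
print(dim1_lazy)  # still lazy; no 12.8 GB allocation has happened yet
```

If `to_dask_dataframe` performs its coordinate broadcasting this way, the returned dataframe stays fully lazy and peak memory is bounded by the chunk size rather than the full broadcast shape.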
reactions:

```json
{
  "url": "https://api.github.com/repos/pydata/xarray/issues/7472/reactions",
  "total_count": 0,
  "+1": 0,
  "-1": 0,
  "laugh": 0,
  "hooray": 0,
  "confused": 0,
  "heart": 0,
  "rocket": 0,
  "eyes": 0
}
```