issue_comments: 1362507511

html_url	issue_url	id	node_id	user	created_at	updated_at	author_association	body	reactions	performed_via_github_app	issue
https://github.com/pydata/xarray/issues/7397#issuecomment-1362507511	https://api.github.com/repos/pydata/xarray/issues/7397	1362507511	IC_kwDOAMm_X85RNjb3	5821660	2022-12-22T07:33:39Z	2022-12-22T07:33:39Z	MEMBER	IIUC, the amount of memory is about what the dimensions suggest (assuming a 4-byte dtype): (280 * 200 * 277 * 754 * 4 bytes) / 1024³ = 43.57 GiB. I'm not that familiar with the data flow in `to_netcdf`, but it looks like the whole dataset is read into memory for some reason. The error happens at the backend level, so I'm assuming `engine="netcdf4"`. You might try `engine="h5netcdf"`, or consider @TomNicholas' suggestion of using `to_zarr` to possibly take the backends out of the equation. Some questions, @benoitespinola: Can you show the reprs of the single-file Datasets and the repr of the combined one? Are your final data variables of that size (time: 280, depth: 200, lat: 277, lon: 754)? Did you do any processing on the data, such as changing attributes or encoding? Is it possible to create your source data files from scratch with random data? An MCVE showing that would help. Further suggestions: If you have multiple data variables, drop all but one prior to saving. Is the behaviour consistent for each of your variables? Try to be explicit in the call to `open_mfdataset` (e.g. by adding the `chunks` keyword). Try to open the individual files and combine them with `xr.merge`/`xr.concat`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }		1506437087
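
The size estimate in the comment can be checked with a few lines of Python. This is only a sketch of the arithmetic: the shape comes from the dimensions quoted above, and the 4-byte item size is the same assumption the comment makes (for example `float32`). The variable names are illustrative.

```python
from math import prod

# Dimension sizes quoted in the comment: time, depth, lat, lon
shape = (280, 200, 277, 754)

# Assumption: a 4-byte dtype such as float32
itemsize = 4

# Total bytes needed to hold one fully loaded variable in memory
n_bytes = prod(shape) * itemsize

# Convert to GiB (1 GiB = 1024**3 bytes)
gib = n_bytes / 1024**3

print(f"{n_bytes} bytes = {gib:.2f} GiB")
```

With an 8-byte dtype such as `float64`, the same calculation would give twice that figure.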