issues: 311578894

id	node_id	number	title	user	state	locked	assignee	milestone	comments	created_at	updated_at	closed_at	author_association	active_lock_reason	draft	pull_request	body	reactions	performed_via_github_app	state_reason	repo	type
311578894	MDU6SXNzdWUzMTE1Nzg4OTQ=	2040	to_netcdf() to automatically switch to fixed-length strings for compressed variables	6213168	open	0			2	2018-04-05T11:50:16Z	2019-01-13T01:42:03Z		MEMBER				When you have fixed-length numpy arrays of unicode characters (<U...) in a dataset, and you invoke to_netcdf() without any particular encoding, they are automatically stored as variable-length strings, unless you explicitly specify `{'dtype': 'S1'}`. Is this in order to save disk space in case strings vary wildly in size? I may be able to see the point in this case. However, this approach is disastrous if variables are compressed, as any compression algorithm will reduce the zero-padding at the end of the strings to a negligible size. My test data: a dataset with \~50 variables, of which half are strings of 10\~100 English characters and the other half are floats, all on a single dimension with 12k points. Test 1: `ds.to_netcdf('uncompressed.nc')` Result: 45MB Test 2: `encoding = {k: {'gzip': True, 'shuffle': True} for k in ds.variables} ds.to_netcdf('bad-compression.nc', encoding=encoding)` Result: 42MB Test 3: `encoding = {} for k, v in ds.variables.items(): encoding[k] = {'gzip': True, 'shuffle': True} if v.dtype.kind == 'U': encoding[k]['dtype'] = 'S1' ds.to_netcdf('good-compression.nc', encoding=encoding)` Result: 5MB Proposal: in case of string variables, if no dtype is explicitly defined, to_netcdf() should dynamically assign it to S1 if compression is enabled, str if disabled.	{ "url": "https://api.github.com/repos/pydata/xarray/issues/2040/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }			13221727	issue
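
The workaround in Test 3 of the issue body can be written out as a small helper. This is only a sketch of the idea, not xarray's behaviour or API: the name `build_encoding` and its parameters are hypothetical, and the compression keys are copied as written in the issue. xarray's documentation for the netCDF4 backend uses `zlib` rather than `gzip`, so treat the key names as an assumption to check against the backend in use.

```python
def build_encoding(dtype_kinds, compress=True, compression_opts=None):
    """Build a per-variable encoding dict along the lines of the proposal.

    dtype_kinds: mapping of variable name -> numpy dtype kind character
                 (for example 'U' for unicode strings, 'f' for floats).
    compress: whether compression options should be attached at all.
    compression_opts: keys to apply when compressing. The default repeats
                      the keys as written in the issue (an assumption; the
                      valid key depends on the backend).
    """
    if compression_opts is None:
        compression_opts = {'gzip': True, 'shuffle': True}

    encoding = {}
    for name, kind in dtype_kinds.items():
        enc = {}
        if compress:
            enc.update(compression_opts)
            if kind == 'U':
                # Fixed-length characters: the trailing padding compresses well.
                enc['dtype'] = 'S1'
        # With compression disabled nothing is set, which leaves the
        # default string handling described in the issue untouched.
        encoding[name] = enc
    return encoding
```

With an xarray dataset `ds`, this would presumably be called as `ds.to_netcdf('good-compression.nc', encoding=build_encoding({k: v.dtype.kind for k, v in ds.variables.items()}))`.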

Links from other tables

1 row from issues_id in issues_labels
2 rows from issue in issue_comments