issue_comments: 511583067
html_url | issue_url | id | node_id | user | created_at | updated_at | author_association | reactions | performed_via_github_app | issue
---|---|---|---|---|---|---|---|---|---|---
https://github.com/pydata/xarray/issues/2064#issuecomment-511583067 | https://api.github.com/repos/pydata/xarray/issues/2064 | 511583067 | MDEyOklzc3VlQ29tbWVudDUxMTU4MzA2Nw== | 10638475 | 2019-07-15T21:48:15Z | 2019-07-15T21:50:50Z | NONE | { "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |  | 314764258

body:

@dcherian, I believe you are correct in principle, but there is a logical check that is expensive to evaluate. The difficult case is when two datasets have a variable with the same name, and that variable does not include the concatenation dimension. In order to align the datasets for concatenation, both variables would need to be identical, and the resulting dataset would keep just one (unchanged) instance of that variable, say from the first dataset. I think someone along the way decided this operation was too expensive. This is from concat.py, lines 302-307:

So I think some consensus needs to be reached about whether it is a good idea to load these variables into memory to check whether they are identical. Another possibility is to leave "unique" variables alone: if a variable exists only once across all the datasets being concatenated, we do not add the concatenation dimension to it. This might solve the original poster's (@xylar's) issue when opening a single dataset.
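For readers following along, here is a minimal sketch of the case described in the body above. The dataset contents and the `time`/`station_name` identifiers are invented for illustration; they are not taken from the issue or from concat.py.

```python
# Two datasets to be concatenated along "time". Both carry an identical copy of
# "station_name", a variable that does not include the concatenation dimension.
import numpy as np
import xarray as xr

ds1 = xr.Dataset(
    {
        "temperature": ("time", np.random.rand(3)),  # varies along the concat dim
        "station_name": ((), "station_A"),           # scalar: no "time" dimension
    },
    coords={"time": [0, 1, 2]},
)
ds2 = xr.Dataset(
    {
        "temperature": ("time", np.random.rand(3)),
        "station_name": ((), "station_A"),           # same copy in the second dataset
    },
    coords={"time": [3, 4, 5]},
)

# The expensive step being debated: verifying that the shared variable really is
# identical in every input (with dask-backed files this loads data into memory).
assert ds1["station_name"].identical(ds2["station_name"])

# With the default data_vars="all", the scalar variable gains the concat
# dimension (one copy per time step); with data_vars="minimal" it is left
# unchanged, at the cost of the equality check above.
print(xr.concat([ds1, ds2], dim="time")["station_name"].dims)                       # expected: ('time',)
print(xr.concat([ds1, ds2], dim="time", data_vars="minimal")["station_name"].dims)  # expected: ()
```

The exact outputs may differ between xarray versions; the point of the sketch is the trade-off under discussion, between adding the concatenation dimension to such variables and paying for the check that the copies are identical.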