Comment 444708274 on pydata/xarray pull request #2553
https://github.com/pydata/xarray/pull/2553#issuecomment-444708274
Posted by user 35968931 (MEMBER), created 2018-12-06T00:56:01Z, updated 2018-12-06T01:01:49Z

Thanks for the comments.


> What happens if you have a nested list of Dataset objects with different data variables?

This is supported. The new auto_combine() simply applies the old auto_combine() N times along N dimensions, so if the grid of results is auto-combinable along each of its dimensions separately, then the new auto_combine() will auto-magically combine it all, e.g.:

```python
from xarray import Dataset, auto_combine  # this branch's auto_combine, which accepts concat_dims
from xarray.testing import assert_identical

objs = [[Dataset({'foo': ('x', [0, 1])}), Dataset({'bar': ('x', [10, 20])})],
        [Dataset({'foo': ('x', [2, 3])}), Dataset({'bar': ('x', [30, 40])})]]
expected = Dataset({'foo': ('x', [0, 1, 2, 3]), 'bar': ('x', [10, 20, 30, 40])})

# This works
actual = auto_combine(objs, concat_dims=['x', None])
assert_identical(expected, actual)

# Also works auto-magically
actual = auto_combine(objs)
assert_identical(expected, actual)

# Proving it works symmetrically
objs = [[Dataset({'foo': ('x', [0, 1])}), Dataset({'foo': ('x', [2, 3])})],
        [Dataset({'bar': ('x', [10, 20])}), Dataset({'bar': ('x', [30, 40])})]]
actual = auto_combine(objs, concat_dims=[None, 'x'])
assert_identical(expected, actual)
```
(I'll add this example as another unit test.)


I should point out that there is one way in which this function is not exactly as general as auto_combine() applied N times: the options compat, data_vars, and coords are specified once and apply to the combining along every dimension, so you can't currently tell it to use compat='identical' along dim1 and compat='no_conflicts' along dim2; it has to be the same for both. I thought about making these kwargs accept lists too, but although that would be easy to do, it seemed like it would complicate the API for a very specific use case.
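To make that concrete, here is a minimal sketch using the first `objs` from the example above; the first call is how this branch behaves, while the commented-out list form is purely hypothetical and not implemented:

```python
# Current behaviour on this branch: a single compat value applies to the
# combining along every dimension.
actual = auto_combine(objs, concat_dims=['x', None], compat='no_conflicts')

# Hypothetical list form (NOT implemented): one compat value per entry in
# concat_dims, e.g. 'identical' along the outer dimension and 'no_conflicts'
# for the inner merge.
# actual = auto_combine(objs, concat_dims=['x', None],
#                       compat=['identical', 'no_conflicts'])
```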


> It might be better to have a separate nested_concat() function rather than to squeeze this all into auto_combine().

That was basically what I tried to do in my first attempt, but nested concatenation without merging along every dimension misses some common use cases. For example, if you wanted to auto_combine() (or open_mfdataset()) files structured like

```bash
root
├── time1
│   ├── density.nc
│   └── temperature.nc
└── time2
    ├── density.nc
    └── temperature.nc
```

then you would want to merge density.nc and temperature.nc within each directory, then concat the results along 'time'. I suppose you could have nested_auto_combine() as a separate function, but I don't think that's really necessary: apart from the substitution concat_dim -> concat_dims, this is fully backwards-compatible.
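For illustration, here is a rough sketch of how that file layout could be combined with this branch's auto_combine; the paths are placeholders for the tree above, and I'm using plain open_dataset calls rather than open_mfdataset() to keep the nested structure explicit:

```python
import xarray as xr

# Build a 2D grid of datasets: the outer list runs over the time directories,
# the inner list over the variable files that should be merged together.
datasets = [
    [xr.open_dataset('root/time1/density.nc'),
     xr.open_dataset('root/time1/temperature.nc')],
    [xr.open_dataset('root/time2/density.nc'),
     xr.open_dataset('root/time2/temperature.nc')],
]

# Merge along the inner axis (None), then concatenate the merged results
# along 'time' (this branch's auto_combine accepts a list of concat_dims).
combined = xr.auto_combine(datasets, concat_dims=['time', None])
```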


You might also find it interesting to see how I've used this fork in my own code: I create the grid of datasets here, so that I can combine them here.


I have a question actually: currently, if the concat or merge fails, the error message won't clearly tell you which dimension it was trying to combine along when it failed. Is there a way to do that easily with try... except... statements? Something like

```python
for dim in concat_dims:
    try:
        _auto_combine_along_first_dim(...)
    except (MergeError, ValueError) as err:
        raise ValueError(f"Encountered {err} while trying to combine along dimension {dim}")
```


(Also, something else is now breaking in cftime on the Python 2.7 builds on AppVeyor...)
