html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/2553#issuecomment-447048979,https://api.github.com/repos/pydata/xarray/issues/2553,447048979,MDEyOklzc3VlQ29tbWVudDQ0NzA0ODk3OQ==,1217238,2018-12-13T17:16:16Z,2018-12-13T17:16:16Z,MEMBER,thanks @TomNicholas! I'm looking forward to the follow-ups :),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379415229
https://github.com/pydata/xarray/pull/2553#issuecomment-445907024,https://api.github.com/repos/pydata/xarray/issues/2553,445907024,MDEyOklzc3VlQ29tbWVudDQ0NTkwNzAyNA==,1217238,2018-12-10T17:48:21Z,2018-12-10T17:48:21Z,MEMBER,"> Okay good. So with this API then `auto_combine()` (and therefore the default behaviour of `open_mfdatset()`) would be to [throw a warning](https://github.com/pydata/xarray/issues/2159#issuecomment-435336049) if there isn't enough info in the dataset coords to order them?

Yes, but throwing a warning should probably only be a temporary solution. Long term we should pick a default value (e.g., `combine='auto'`) and require explicitly opting in to different behavior (e.g., `combine='manual'`).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379415229
https://github.com/pydata/xarray/pull/2553#issuecomment-445892746,https://api.github.com/repos/pydata/xarray/issues/2553,445892746,MDEyOklzc3VlQ29tbWVudDQ0NTg5Mjc0Ng==,1217238,2018-12-10T17:06:59Z,2018-12-10T17:06:59Z,MEMBER,"> The only problem with that idea is that both of these functions should be options for `open_mfdataset()`, so you still have multiple possible argument structures for that function. I think that's okay though - it would be a lot easier to explain if you say ""if the keyword argument `combine='auto'` then the function `auto_combine()` is used, if `combine='manual'` then `nested_concat_and_merge()` is used, and the filepaths must be structured accordingly.""

This sounds pretty good to me.

> > If you like, we could also merge this work (which is excellent progress towards user-facing APIs) but keep the changes internal to xarray for now until we figure out the public APIs.
> 
> That would be great. Then I could start using the master branch of xarray again in [my code](https://github.com/boutproject/xBOUT/issues/22), while we redo the public API. If I set `concat_dims` back to `concat_dim` then it _should_ be completely backwards-compatible.

OK, do you want to go ahead and revert the documentation/public API for now?

I would even be OK supporting nested lists temporarily in xarray via APIs like `open_mfdataset` and `auto_concat` without official documentation/support -- using this feature would be ""at your own risk"". As long as we don't break any existing code, that is totally fine. We just shouldn't advertise an interface until we've happy with it.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379415229
https://github.com/pydata/xarray/pull/2553#issuecomment-445419876,https://api.github.com/repos/pydata/xarray/issues/2553,445419876,MDEyOklzc3VlQ29tbWVudDQ0NTQxOTg3Ng==,1217238,2018-12-08T01:49:28Z,2018-12-10T17:00:04Z,MEMBER,"> I think that would mean there might be some situations that 1D auto_combine() could deal with but nested_concat_and_merge() could not. Situations where you can only merge the variables once they have been individually concatenated. (actually I'm not sure - I would need to think about that)

I *think* multiple levels `nested_concat_and_merge()` could always replace `auto_combine()` as long as merging happens last, at the outer-most level. That's basically what `auto_combine()` already does.

There are probably some edge cases where the existing `auto_combine` would you be a little more sloppy about how you put together your lists of variables. But I don't think that's really a good idea for us to support in xarray, since it removes most of the freedom from the implementation.

> I personally think a), but I expect users who haven't read the source code for auto_combine() might disagree.

Yes, this is my concern. The way this API handles nested lists makes sense from an implementation perspective, but not really from a user perspective. For users, I think it makes sense to have:
1. The precisely defined `nested_concat_and_merge()` for cases where you already data sorted. This is pretty common (e.g., with hierarchical directory structures).
2. A magical `auto_combine()` that inspects coordinates to figure out how to put everything together. We might even rename this to `combine()` (if we think it's comprehensive enough).

The original version of `auto_combine()` that I wrote was aiming towards this second use-case, but was obviously fragile and incomplete.

If you like, we could also merge this work (which is excellent progress towards user-facing APIs) but keep the changes internal to xarray for now until we figure out the public APIs.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379415229
https://github.com/pydata/xarray/pull/2553#issuecomment-444713331,https://api.github.com/repos/pydata/xarray/issues/2553,444713331,MDEyOklzc3VlQ29tbWVudDQ0NDcxMzMzMQ==,1217238,2018-12-06T01:22:08Z,2018-12-06T01:22:17Z,MEMBER,"Do you think it would make sense to try to get rid of ""both merging and concatenating"" in favor of requiring another dimension in the grid? I guess we do sort of need this for this for `open_mfdataset` but it feels more complex than desired.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379415229
https://github.com/pydata/xarray/pull/2553#issuecomment-444712974,https://api.github.com/repos/pydata/xarray/issues/2553,444712974,MDEyOklzc3VlQ29tbWVudDQ0NDcxMjk3NA==,1217238,2018-12-06T01:20:18Z,2018-12-06T01:20:18Z,MEMBER,"> Specifying dim=None will specify (1) explicitly, but otherwise yes, that's exactly what it does.

Yes, this was a typo on my part :).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379415229
https://github.com/pydata/xarray/pull/2553#issuecomment-444712053,https://api.github.com/repos/pydata/xarray/issues/2553,444712053,MDEyOklzc3VlQ29tbWVudDQ0NDcxMjA1Mw==,1217238,2018-12-06T01:15:23Z,2018-12-06T01:19:54Z,MEMBER,"> if the grid of results is auto-combinable along all it's dimensions separately, then the new auto_combine() will auto-magically combine it all

OK, so the rule is that a grid dimension can correspond to any of:

1. merging data variables
2. concatenating along the dimension
3. both merging and concatenating

And to specify case (1) explicitly, you can write `dim=None` along that dimension?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379415229
https://github.com/pydata/xarray/pull/2553#issuecomment-444580848,https://api.github.com/repos/pydata/xarray/issues/2553,444580848,MDEyOklzc3VlQ29tbWVudDQ0NDU4MDg0OA==,1217238,2018-12-05T17:56:26Z,2018-12-05T17:56:26Z,MEMBER,"Generally I like the look of this.

What happens if you have a nested list of Dataset objects with different data variables? e.g., a 2D grid of results, with different variables saved in different files. Is this supported at all?

I ask because the current version of `auto_combine()` was (for better or worse) designed to handle this automagically, but I'm not sure this makes sense for higher dimensional data.

It might be better to have a separate `nested_concat()` function rather than to squeeze this all into `auto_combine()`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379415229
https://github.com/pydata/xarray/pull/2553#issuecomment-443446419,https://api.github.com/repos/pydata/xarray/issues/2553,443446419,MDEyOklzc3VlQ29tbWVudDQ0MzQ0NjQxOQ==,1217238,2018-12-01T18:15:30Z,2018-12-01T18:15:30Z,MEMBER,"Itertoolz is a dask.array dependency that we didn't explicitly declare in
xarray (yes, this was sloppy). We should probably just copy the function we
use into some utils file inside xarray -- it has a compatible open source
license. Or we could rewrite to not need it.
On Sat, Dec 1, 2018 at 8:18 AM Tom Nicholas <notifications@github.com>
wrote:

> I've fixed the code so it works with python 3.5 & 2.7 (tested on my local
> machine), but the tests on python 2.7 are still failing because for some
> reason it can't find the itertoolz module. I tried to fix this by adding
> toolz an explicit dependency in setup.py but that didn't work either.
>
> However, given that xarray is supposed to be dropping python 2 support at
> the end of this year (), does this particularly matter?
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <https://github.com/pydata/xarray/pull/2553#issuecomment-443437580>, or mute
> the thread
> <https://github.com/notifications/unsubscribe-auth/ABKS1gVaW7quLcUIo1l-l72xh363f7Utks5u0qvvgaJpZM4YX8wE>
> .
>
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379415229
https://github.com/pydata/xarray/pull/2553#issuecomment-437614965,https://api.github.com/repos/pydata/xarray/issues/2553,437614965,MDEyOklzc3VlQ29tbWVudDQzNzYxNDk2NQ==,1217238,2018-11-10T19:47:22Z,2018-11-10T19:47:22Z,MEMBER,"I should have suggested this before, but the internal implementation of `numpy.block()` is a nice example of this sort of logic: https://github.com/numpy/numpy/blob/e726c045c72833d5c826e8a13f889746ee0bfdf4/numpy/core/shape_base.py#L661","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379415229