html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/pull/2375#issuecomment-417430252,https://api.github.com/repos/pydata/xarray/issues/2375,417430252,MDEyOklzc3VlQ29tbWVudDQxNzQzMDI1Mg==,4806877,2018-08-30T18:58:35Z,2018-08-30T18:58:35Z,CONTRIBUTOR,Great! Thanks so much for all the feedback :),"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415647225,https://api.github.com/repos/pydata/xarray/issues/2375,415647225,MDEyOklzc3VlQ29tbWVudDQxNTY0NzIyNQ==,1217238,2018-08-24T04:24:33Z,2018-08-24T04:24:33Z,MEMBER,"It might make sense to use a list instead of a set here. On Thu, Aug 23, 2018 at 8:37 PM Keisuke Fujii wrote: > *@fujiisoup* commented on this pull request. > > Thanks. A few comments. > ------------------------------ > > In xarray/core/dataset.py > : > > > + > + missing_dims = [dim for dim in dims if dim not in self.dims] > + if missing_dims: > + raise ValueError('Dataset does not contain the dimensions: %s' > + % missing_dims) > + > + non_multi_dims = [dim for dim in dims > + if not isinstance(self.get_index(dim), pd.MultiIndex)] > + if non_multi_dims and dim_from_kwarg: > + raise ValueError('cannot unstack dimensions that do not ' > + 'have a MultiIndex: %s' % non_multi_dims) > + > + dims = dims - set(non_multi_dims) > + if len(dims) == 0: > + raise ValueError('cannot unstack an object that does not have ' > + 'MultiIndex dimensions') > > I think that we can allow to unstack an object without MultiIndex, which > just returns as is. > It would be useful if users want to remove any MultiIndexes from an object. > ------------------------------ > > In xarray/core/dataset.py > : > > > + ------- > + unstacked : Dataset > + Dataset with unstacked data. 
> + > + See also > + -------- > + Dataset.stack > + """""" > + dim_from_kwarg = dim is not None > + > + if isinstance(dim, basestring): > + dims = set([dim]) > + elif dim is None: > + dims = set(self.dims) > + else: > + dims = set(dim) > > Maybe we can use OrderedSet instead of set so that the resultant > dimension order is fixed. > > — > You are receiving this because you commented. > Reply to this email directly, view it on GitHub > , > or mute the thread > > . > ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415493314,https://api.github.com/repos/pydata/xarray/issues/2375,415493314,MDEyOklzc3VlQ29tbWVudDQxNTQ5MzMxNA==,4806877,2018-08-23T17:02:15Z,2018-08-23T17:02:15Z,CONTRIBUTOR,Thanks for the context!,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415492951,https://api.github.com/repos/pydata/xarray/issues/2375,415492951,MDEyOklzc3VlQ29tbWVudDQxNTQ5Mjk1MQ==,1217238,2018-08-23T17:01:23Z,2018-08-23T17:01:23Z,MEMBER,"Dataset.transpose accepts *args based on the design of numpy.ndarray.transpose, but that API is probably a mistake (both in NumPy and xarray). Everything else uses an axis/dim argument that can take a scalar or sequence value. On Thu, Aug 23, 2018 at 9:56 AM Julia Signell wrote: > I can change it. I guess I was looking at Dataset.transpose: > https://github.com/pydata/xarray/blob/master/xarray/core/dataset.py#L2498 > > — > You are receiving this because you commented. > Reply to this email directly, view it on GitHub > , or mute > the thread > > . 
> ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415490818,https://api.github.com/repos/pydata/xarray/issues/2375,415490818,MDEyOklzc3VlQ29tbWVudDQxNTQ5MDgxOA==,4806877,2018-08-23T16:56:07Z,2018-08-23T16:56:07Z,CONTRIBUTOR,I can change it. I guess I was looking at `Dataset.transpose`: https://github.com/pydata/xarray/blob/master/xarray/core/dataset.py#L2498,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415486919,https://api.github.com/repos/pydata/xarray/issues/2375,415486919,MDEyOklzc3VlQ29tbWVudDQxNTQ4NjkxOQ==,1217238,2018-08-23T16:46:41Z,2018-08-23T16:46:41Z,MEMBER,"> I chose to use *dims rather than a list of dims so that this change will have a very small impact on people. Most people probably do something like unstack('z') right now, and that will still work. Usually we prefer to stick to a single argument, but use isinstance checks to support both single dimensions and lists of dimensions, e.g., see how `dim` is parsed in `Dataset.reduce`: https://github.com/pydata/xarray/blob/master/xarray/core/dataset.py#L2774-L2779","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415473341,https://api.github.com/repos/pydata/xarray/issues/2375,415473341,MDEyOklzc3VlQ29tbWVudDQxNTQ3MzM0MQ==,4806877,2018-08-23T16:06:51Z,2018-08-23T16:06:51Z,CONTRIBUTOR,"I chose to use *dims rather than a list of dims so that this change will have a very small impact on people. 
Most people probably do something like `unstack('z')` right now, and that will still work.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415437261,https://api.github.com/repos/pydata/xarray/issues/2375,415437261,MDEyOklzc3VlQ29tbWVudDQxNTQzNzI2MQ==,4806877,2018-08-23T14:29:01Z,2018-08-23T14:29:01Z,CONTRIBUTOR,"Ok so in this PR I will make unstack accept multiple dims like `xr.DataFrame.unstack(*dims)`. The order of the dims will only be roundtripped if all dims are stacked into one, but I think that is reasonable. In a follow on PR I will make `xarray.label_like(array, other)`. I think that notation speaks more to what we are trying to convey, but I do think the position of the arguments isn't intuitive. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415227870,https://api.github.com/repos/pydata/xarray/issues/2375,415227870,MDEyOklzc3VlQ29tbWVudDQxNTIyNzg3MA==,1217238,2018-08-23T00:06:56Z,2018-08-23T00:06:56Z,MEMBER,"I think `unstack()` unstacking all dimensions by default would make sense. > Should we be using xr.full_like in this way? I'm not really opposed to `full_like` working this way, but it does look a little strange to my eye. The ""full"" part of the name doesn't really make sense to me. I would usually suggest using the DataArray constructor here, e.g., `xr.DataArray(output_values, flat_input.coords, flat_input.dims, flat_input.attrs)`. Maybe we can figure a better way to spell ""label these arrays like this template xarray object"" that doesn't require referencing `flat_input` multiple times. Maybe `xarray.label_like(array, source)` or `source.with_data(array)`? > Would something like xr.unstack_like be desirable? 
I'm not sure that a dedicated function `unstack_like` would make sense for xarray. This is the sort of helper function that you can write yourself in a couple of lines.","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415201313,https://api.github.com/repos/pydata/xarray/issues/2375,415201313,MDEyOklzc3VlQ29tbWVudDQxNTIwMTMxMw==,6815844,2018-08-22T22:22:48Z,2018-08-22T22:22:48Z,MEMBER,"> But maybe it is better to choose the first dim that is MultiIndex rather than the first dim. *first* dimension is not well defined in `Dataset`, as it is a union of the dims of all the dataarrays it has. For example, in the following example, `ds.unstack()['var']` and `da['var'].unstack()` will give different results. ```python In [15]: import numpy as np ...: import xarray as xr ...: ...: ds = xr.Dataset({'var': (('x', 'y', 'z', 'w'), np.random.randn(2,3,4,5))}) ...: ds = ds.stack(b=['z', 'w']).stack(a=['x', 'y']) ...: ds ...: Out[15]: Dimensions: (a: 6, b: 20) Coordinates: * b (b) MultiIndex - z (b) int64 0 0 0 0 0 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 - w (b) int64 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 0 1 2 3 4 * a (a) MultiIndex - x (a) int64 0 0 0 1 1 1 - y (a) int64 0 1 2 0 1 2 Data variables: var (b, a) float64 -1.277 -0.4031 -0.3816 ... 1.398 0.6763 -0.6735 In [16]: list(ds.dims) Out[16]: ['a', 'b'] In [17]: list(ds['var'].dims) Out[17]: ['b', 'a'] ``` > but in that case should we allow passing in multiple dims? I like this direction. `stack` accepts multiple pairs of dimensions to be stacked, like `ds.stack(a=['x', 'y'], b=['z', 'w'])`. In this method, it repeatedly calls `_stack_once` method. I think `unstack` also can have the similar logic. 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415099271,https://api.github.com/repos/pydata/xarray/issues/2375,415099271,MDEyOklzc3VlQ29tbWVudDQxNTA5OTI3MQ==,4806877,2018-08-22T16:44:47Z,2018-08-22T17:46:24Z,CONTRIBUTOR,">we have similar method reset_index. Do we also want to make dim optional? I don't have an opinion on that except to say that `reset_index` takes an iter of dims so it is at least slightly different. So to me it seems fine to only make dim optional on unstack.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-415053454,https://api.github.com/repos/pydata/xarray/issues/2375,415053454,MDEyOklzc3VlQ29tbWVudDQxNTA1MzQ1NA==,4806877,2018-08-22T14:32:51Z,2018-08-22T15:23:27Z,CONTRIBUTOR,"> what should be done if DataArray or Dataset has multiple MultiIndexes. Maybe do we unstack all the MultiIndexes? I like the idea of unstacking all the MultiIndexes, but in that case should we allow passing in multiple dims? It seems weird to do a recursive unstack in the case of no argument passed without allowing the user to specifically choose multiple dims along which to unstack. I think it is probably better to just choose a default dim to unstack like this PR does. But maybe it is better to choose the first dim that is MultiIndex rather than the first dim. That way if you do a stack().unstack() you will roundtrip your data since the stacked index gets added to the end of dims. And if you just pass an object with one MultiIndex (probably the most common scenario) unstack will do the right thing. 
And if you pass an object with multiple MultiIndexes and unstack repeatedly, you will get your original data out.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925 https://github.com/pydata/xarray/pull/2375#issuecomment-414860789,https://api.github.com/repos/pydata/xarray/issues/2375,414860789,MDEyOklzc3VlQ29tbWVudDQxNDg2MDc4OQ==,6815844,2018-08-22T00:01:45Z,2018-08-22T00:01:45Z,MEMBER,"Thanks, @jsignell. I like this idea (`unstack` without explicit dimension names), but I think we may need to decide what API would be the best. My particular concern is + what should be done if DataArray or Dataset has multiple MultiIndexes. Maybe do we unstack all the MultiIndexes? + we have similar method `reset_index`. Do we also want to make `dim` optional? For `unstack_like`, I'm not sure it is worth adding as a top level function as `xr.full_like(other, data).unstack()` is simple enough...","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,352677925