html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1317#issuecomment-337970838,https://api.github.com/repos/pydata/xarray/issues/1317,337970838,MDEyOklzc3VlQ29tbWVudDMzNzk3MDgzOA==,1386642,2017-10-19T16:56:37Z,2017-10-19T16:56:37Z,CONTRIBUTOR,Sorry. I guess I should have made my last comment in the PR. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-337959059,https://api.github.com/repos/pydata/xarray/issues/1317,337959059,MDEyOklzc3VlQ29tbWVudDMzNzk1OTA1OQ==,1217238,2017-10-19T16:14:54Z,2017-10-19T16:14:54Z,MEMBER,"> IMO, the easiest way to do this would be to change these methods into top-level functions which can take any dict or iterable of DataArrays.

:+1: for a function or class based interface if that makes sense. Can you share a few examples of what using your proposed API would look like?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-337796691,https://api.github.com/repos/pydata/xarray/issues/1317,337796691,MDEyOklzc3VlQ29tbWVudDMzNzc5NjY5MQ==,1386642,2017-10-19T04:32:03Z,2017-10-19T04:32:03Z,CONTRIBUTOR,"After using my own version of this code for the past month or so, it has occurred to me that this API probably will not support stacking arrays with different sizes along shared arrays. For instance, I need to ""stack"" humidity below an altitude of 10 km with temperature between 0 and 16 km. IMO, the easiest way to do this would be to change these methods into top-level functions which can take any dict or iterable of DataArrays.
We could leave that for a later PR of course.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-332623355,https://api.github.com/repos/pydata/xarray/issues/1317,332623355,MDEyOklzc3VlQ29tbWVudDMzMjYyMzM1NQ==,2443309,2017-09-27T19:03:14Z,2017-09-27T19:03:14Z,MEMBER,I can see the use of a Dataset to_array/stack method that does not broadcast arrays. Feel free to open a PR and we'll take a look.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-330282841,https://api.github.com/repos/pydata/xarray/issues/1317,330282841,MDEyOklzc3VlQ29tbWVudDMzMDI4Mjg0MQ==,1386642,2017-09-18T16:45:55Z,2017-09-18T16:46:37Z,CONTRIBUTOR,"@shoyer I wrote a class that does this a while ago. It is available here: [data_matrix.py](https://github.com/nbren12/gnl/blob/master/gnl/). It is used like this:

```python
# D is a dataset
# the signature for DataMatrix.__init__ is
# DataMatrix(feature_dims, sample_dims, variables)
mat = DataMatrix(['z'], ['x'], ['a', 'b'])
y = mat.dataset_to_mat(D)
x = mat.mat_to_dataset(y)
```

One of the problems I had to handle was concatenating/stacking DataArrays with different numbers of dimensions: `stack` and `unstack` combined with `to_array` can only handle the case where the desired feature variables all have the same dimensionality. At the moment my code stacks the desired dimensions for each variable and then manually calls `np.hstack` to produce the final matrix, but I bet it would be easy to create a pandas Index object which can handle this use case.
Would you be open to a PR along these lines?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-288607926,https://api.github.com/repos/pydata/xarray/issues/1317,288607926,MDEyOklzc3VlQ29tbWVudDI4ODYwNzkyNg==,1386642,2017-03-23T03:32:50Z,2017-03-23T03:40:22Z,CONTRIBUTOR,"I had the chance to play around with `stack` and `unstack`, and it appears that these actually do nearly all the work needed here, so you can disregard my last comment. The only logic which is somewhat unwieldy is code which creates a DataArray from the `eofs` dask array. This is a complete example using the air dataset:

```python
air = load_dataset(""air_temperature"").air
A = air.stack(features=['lat', 'lon']).chunk()
A -= A.mean('features')

_, _, eofs = svd_compressed(A.data, 4)

# wrap eofs in dataarray
dims = ['modes', 'features']
coords = {}

for i, dim in enumerate(dims):
    if dim in A.dims:
        coords[dim] = A[dim]
    elif dim in coords:
        pass
    else:
        coords[dim] = np.arange(eofs.shape[i])

eofs = xr.DataArray(eofs, dims=dims, coords=coords).unstack('features')
```

This is pretty compact as is, so maybe the ugly final bit could be replaced with a convenience function like `unstack_array(eofs, dims, coords)` or a method call `A.unstack_array(eofs, dims, new_coords={})`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-288590846,https://api.github.com/repos/pydata/xarray/issues/1317,288590846,MDEyOklzc3VlQ29tbWVudDI4ODU5MDg0Ng==,1386642,2017-03-23T01:32:55Z,2017-03-23T01:32:55Z,CONTRIBUTOR,"Cool! Thanks for that link.
As far as the API is concerned, I think I like the `ReshapeCoder` approach a little better because it does not require keeping track of a `feature_dims` list throughout the code, like my class does. It also could generalize beyond just creating a 2D array. To produce a dataset `B(samples,features)` from a dataset `A(x,y,z,t)`, how do you feel about a syntax like this:

```python
rs = Reshaper(dict(samples=('t',), features=('x', 'y', 'z')), coords=A.coords)

B = rs.encode(A)

_, _, eofs = svd(B.data)

# eofs is now a 2D dask array so we need to give
# it dimension information
eof_dims = ['mode', 'features']
rs.decode(eofs, eof_dims)

# to decode an xarray object we don't need to pass dimension info
rs.decode(B)
```

On the other hand, it would be nice to be able to reshape data through a syntax like `A.reshape.encode(dict(...))`.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-288577529,https://api.github.com/repos/pydata/xarray/issues/1317,288577529,MDEyOklzc3VlQ29tbWVudDI4ODU3NzUyOQ==,1217238,2017-03-23T00:06:34Z,2017-03-23T00:06:34Z,MEMBER,"I've written similar code in the past as well, so I would be pretty supportive of adding a utility class for this. Actually one of my colleagues wrote a [virtually identical class](https://github.com/tensorflow/tensorflow/blob/4e18625c55afdbe50e922a70a12df05320e387e0/tensorflow/contrib/labeled_tensor/python/ops/sugar.py#L29) for our xarray equivalent in TensorFlow; take a look at it for some possible alternative API options. For xarray, `.stack()` and `.to_array()`, or `.to_dataframe()`, can do most of the heavy lifting instead of manually reshaping.

Thanks for the pointer to xlearn, too!
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022 https://github.com/pydata/xarray/issues/1317#issuecomment-288549282,https://api.github.com/repos/pydata/xarray/issues/1317,288549282,MDEyOklzc3VlQ29tbWVudDI4ODU0OTI4Mg==,10050469,2017-03-22T21:43:12Z,2017-03-22T21:43:12Z,MEMBER,"I personally have no opinion on the subject, but maybe @ajdawson wants to chime in (as the author of the [eofs](https://github.com/ajdawson/eofs) package which includes xarray support).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022