html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1317#issuecomment-337970838,https://api.github.com/repos/pydata/xarray/issues/1317,337970838,MDEyOklzc3VlQ29tbWVudDMzNzk3MDgzOA==,1386642,2017-10-19T16:56:37Z,2017-10-19T16:56:37Z,CONTRIBUTOR,Sorry. I guess I should have made my last comment in the PR. ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-337796691,https://api.github.com/repos/pydata/xarray/issues/1317,337796691,MDEyOklzc3VlQ29tbWVudDMzNzc5NjY5MQ==,1386642,2017-10-19T04:32:03Z,2017-10-19T04:32:03Z,CONTRIBUTOR,"After using my own version of this code for the past month or so, it has occurred to me that this API probably will not support stacking arrays with different sizes along shared dimensions. For instance, I need to ""stack"" humidity below an altitude of 10 km with temperature between 0 and 16 km. IMO, the easiest way to do this would be to change these methods into top-level functions which can take any dict or iterable of DataArrays. We could leave that for a later PR, of course.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-330282841,https://api.github.com/repos/pydata/xarray/issues/1317,330282841,MDEyOklzc3VlQ29tbWVudDMzMDI4Mjg0MQ==,1386642,2017-09-18T16:45:55Z,2017-09-18T16:46:37Z,CONTRIBUTOR,"@shoyer I wrote a class that does this a while ago.
It is available here: [data_matrix.py](https://github.com/nbren12/gnl/blob/master/gnl/). It is used like this:
```python
# D is a dataset
# the signature for DataMatrix.__init__ is
# DataMatrix(feature_dims, sample_dims, variables)
mat = DataMatrix(['z'], ['x'], ['a', 'b'])
y = mat.dataset_to_mat(D)
x = mat.mat_to_dataset(y)
```
One of the problems I had to handle was concatenating/stacking DataArrays with different numbers of dimensions: `stack` and `unstack` combined with `to_array` can only handle the case where the desired feature variables all have the same dimensionality. ATM my code stacks the desired dimensions for each variable and then manually calls `np.hstack` to produce the final matrix (roughly as in the sketch below), but I bet it would be easy to create a pandas Index object which can handle this use case.
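For concreteness, the per-variable approach looks roughly like this (a minimal sketch with made-up variable and dimension names, not the actual `DataMatrix` code):
```python
import numpy as np
import xarray as xr

# toy dataset: 'a' has a feature dim 'z', 'b' is 1D along the sample dim 'x'
D = xr.Dataset({'a': (('x', 'z'), np.random.rand(10, 5)),
                'b': (('x',), np.random.rand(10))})

blocks = []
for name in ['a', 'b']:
    v = D[name]
    feature_dims = [d for d in v.dims if d != 'x']   # 'x' is the sample dim
    if feature_dims:
        block = v.stack(features=feature_dims).transpose('x', 'features').values
    else:
        block = v.values[:, None]   # 1D variable becomes a single feature column
    blocks.append(block)

mat = np.hstack(blocks)   # shape (n_samples, n_features_total)
```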
Would you be open to a PR along these lines?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-288607926,https://api.github.com/repos/pydata/xarray/issues/1317,288607926,MDEyOklzc3VlQ29tbWVudDI4ODYwNzkyNg==,1386642,2017-03-23T03:32:50Z,2017-03-23T03:40:22Z,CONTRIBUTOR,"I had the chance to play around with `stack` and `unstack`, and it appears that these actually do nearly all the work needed here, so you can disregard my last comment. The only part that is somewhat unwieldy is the code which creates a DataArray from the `eofs` dask array. This is a complete example using the air dataset:
```python
import numpy as np
import xarray as xr
from dask.array.linalg import svd_compressed
from xarray.tutorial import load_dataset

air = load_dataset('air_temperature').air
A = air.stack(features=['lat', 'lon']).chunk()
A -= A.mean('features')
_, _, eofs = svd_compressed(A.data, 4)

# wrap eofs in a DataArray
dims = ['modes', 'features']
coords = {}
for i, dim in enumerate(dims):
    if dim in A.dims:
        coords[dim] = A[dim]
    elif dim in coords:
        pass
    else:
        coords[dim] = np.arange(eofs.shape[i])
eofs = xr.DataArray(eofs, dims=dims, coords=coords).unstack('features')
```
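The final wrapping step could be factored into a small helper, for example (a rough sketch only; neither the name nor the signature exists in xarray):
```python
def unstack_array(data, dims, coords, unstack_dim='features'):
    # Hypothetical helper: wrap a bare (dask or numpy) array in a DataArray,
    # reusing any coordinates that already exist and inventing integer
    # coordinates for the rest, then unstack the stacked dimension.
    new_coords = {}
    for i, dim in enumerate(dims):
        new_coords[dim] = coords[dim] if dim in coords else np.arange(data.shape[i])
    return xr.DataArray(data, dims=dims, coords=new_coords).unstack(unstack_dim)

# e.g. eofs = unstack_array(eofs, ['modes', 'features'], A.coords)
```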
This is pretty compact as is, so maybe the ugly final bit could be replaced with a convenience function like the `unstack_array(eofs, dims, coords)` sketched above, or a method call `A.unstack_array(eofs, dims, new_coords={})`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022
https://github.com/pydata/xarray/issues/1317#issuecomment-288590846,https://api.github.com/repos/pydata/xarray/issues/1317,288590846,MDEyOklzc3VlQ29tbWVudDI4ODU5MDg0Ng==,1386642,2017-03-23T01:32:55Z,2017-03-23T01:32:55Z,CONTRIBUTOR,"Cool! Thanks for that link. As far as the API is concerned, I think I like the `ReshapeCoder` approach a little better because it does not require keeping track of a `feature_dims` list throughout the code, like my class does. It could also generalize beyond just creating a 2D array.
To produce a dataset `B(samples, features)` from a dataset `A(x, y, z, t)`, how do you feel about a syntax like this:
```python
rs = Reshaper(dict(samples=('t',), features=('x', 'y', 'z')), coords=A.coords)
B = rs.encode(A)
_, _, eofs = svd(B.data)
# eofs is now a 2D dask array so we need to give
# it dimension information
eof_dims = ['mode', 'features']
rs.decode(eofs, eof_dims)
# to decode an xarray object we don't need to pass dimension info
rs.decode(B)
```
On the other hand, it would be nice to be able to reshape data through a syntax like `A.reshape.encode(dict(...))`, as sketched below.
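Something along those lines could probably be prototyped with xarray's accessor registration; here is a very rough sketch (the `reshape` accessor and the `Reshaper` class it wraps are both hypothetical, nothing like this exists in xarray today):
```python
import xarray as xr

@xr.register_dataset_accessor('reshape')
class ReshapeAccessor(object):
    # Hypothetical accessor wrapping the Reshaper idea sketched above.
    def __init__(self, ds):
        self._ds = ds
        self._rs = None

    def encode(self, dim_map):
        # dim_map maps new dims to tuples of existing dims,
        # e.g. dict(samples=('t',), features=('x', 'y', 'z'))
        self._rs = Reshaper(dim_map, coords=self._ds.coords)
        return self._rs.encode(self._ds)

    def decode(self, data, dims=None):
        # hand a bare array (plus its dims) back to the stored Reshaper
        return self._rs.decode(data, dims) if dims is not None else self._rs.decode(data)

# usage: B = A.reshape.encode(dict(samples=('t',), features=('x', 'y', 'z')))
```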
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,216215022