github: issue_comments: 9 rows where issue = 216215022 sorted by updated

9 rows where issue = 216215022 sorted by updated_at descending

id	html_url	issue_url	node_id	user	created_at	updated_at ▲	author_association	body	reactions	issue
337970838	https://github.com/pydata/xarray/issues/1317#issuecomment-337970838	https://api.github.com/repos/pydata/xarray/issues/1317	MDEyOklzc3VlQ29tbWVudDMzNzk3MDgzOA==	nbren12 1386642	2017-10-19T16:56:37Z	2017-10-19T16:56:37Z	CONTRIBUTOR	Sorry. I guess I should have made my last comment in the PR.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
337959059	https://github.com/pydata/xarray/issues/1317#issuecomment-337959059	https://api.github.com/repos/pydata/xarray/issues/1317	MDEyOklzc3VlQ29tbWVudDMzNzk1OTA1OQ==	shoyer 1217238	2017-10-19T16:14:54Z	2017-10-19T16:14:54Z	MEMBER	"IMO, the easiest way to do this would be to change these methods into top-level functions which can take any dict or iterable of DataArrays." :+1: for a function or class based interface if that makes sense. Can you share a few examples of what using your proposed API would look like?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
337796691	https://github.com/pydata/xarray/issues/1317#issuecomment-337796691	https://api.github.com/repos/pydata/xarray/issues/1317	MDEyOklzc3VlQ29tbWVudDMzNzc5NjY5MQ==	nbren12 1386642	2017-10-19T04:32:03Z	2017-10-19T04:32:03Z	CONTRIBUTOR	After using my own version of this code for the past month or so, it has occurred to me that this API probably will not support stacking arrays with different sizes along shared dimensions. For instance, I need to "stack" humidity below an altitude of 10 km with temperature between 0 and 16 km. IMO, the easiest way to do this would be to change these methods into top-level functions which can take any dict or iterable of DataArrays. We could leave that for a later PR of course.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
332623355	https://github.com/pydata/xarray/issues/1317#issuecomment-332623355	https://api.github.com/repos/pydata/xarray/issues/1317	MDEyOklzc3VlQ29tbWVudDMzMjYyMzM1NQ==	jhamman 2443309	2017-09-27T19:03:14Z	2017-09-27T19:03:14Z	MEMBER	I can see the use of a Dataset to_array/stack method that does not broadcast arrays. Feel free to open a PR and we'll take a look.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
330282841	https://github.com/pydata/xarray/issues/1317#issuecomment-330282841	https://api.github.com/repos/pydata/xarray/issues/1317	MDEyOklzc3VlQ29tbWVudDMzMDI4Mjg0MQ==	nbren12 1386642	2017-09-18T16:45:55Z	2017-09-18T16:46:37Z	CONTRIBUTOR	@shoyer I wrote a class that does this a while ago. It is available here: data_matrix.py. It is used like this:

```python
# D is a dataset
# the signature for DataMatrix.__init__ is
# DataMatrix(feature_dims, sample_dims, variables)
mat = DataMatrix(['z'], ['x'], ['a', 'b'])
y = mat.dataset_to_mat(D)
x = mat.mat_to_dataset(y)
```

One of the problems I had to handle was with concatenating/stacking DataArrays with different numbers of dimensions: `stack` and `unstack` combined with `to_array` can only handle the case where the desired feature variables all have the same dimensionality. ATM my code stacks the desired dimensions for each variable and then manually calls `np.hstack` to produce the final matrix, but I bet it would be easy to create a pandas Index object which can handle this use case. Would you be open to a PR along these lines?	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
288607926	https://github.com/pydata/xarray/issues/1317#issuecomment-288607926	https://api.github.com/repos/pydata/xarray/issues/1317	MDEyOklzc3VlQ29tbWVudDI4ODYwNzkyNg==	nbren12 1386642	2017-03-23T03:32:50Z	2017-03-23T03:40:22Z	CONTRIBUTOR	I had the chance to play around with `stack` and `unstack`, and it appears that these actually do nearly all the work needed here, so you can disregard my last comment. The only logic which is somewhat unwieldy is code which creates a DataArray from the `eofs` dask array. This is a complete example using the air dataset:

```python
air = load_dataset("air_temperature").air
A = air.stack(features=['lat', 'lon']).chunk()
A -= A.mean('features')

_, _, eofs = svd_compressed(A.data, 4)

# wrap eofs in dataarray
dims = ['modes', 'features']
coords = {}

for i, dim in enumerate(dims):
    if dim in A.dims:
        coords[dim] = A[dim]
    elif dim in coords:
        pass
    else:
        coords[dim] = np.arange(eofs.shape[i])

eofs = xr.DataArray(eofs, dims=dims, coords=coords).unstack('features')
```

This is pretty compact as is, so maybe the ugly final bit could be replaced with a convenience function like `unstack_array(eofs, dims, coords)` or a method call `A.unstack_array(eofs, dims, new_coords={})`.	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
288590846	https://github.com/pydata/xarray/issues/1317#issuecomment-288590846	https://api.github.com/repos/pydata/xarray/issues/1317	MDEyOklzc3VlQ29tbWVudDI4ODU5MDg0Ng==	nbren12 1386642	2017-03-23T01:32:55Z	2017-03-23T01:32:55Z	CONTRIBUTOR	Cool! Thanks for that link. As far as the API is concerned, I think I like the `ReshapeCoder` approach a little better because it does not require keeping track of a `feature_dims` vector list throughout the code, like my class does. It also could generalize beyond just creating a 2D array. To produce a dataset `B(samples,features)` from a dataset `A(x,y,z,t)`, how do you feel about a syntax like this:

```python
rs = Reshaper(dict(samples=('t',), features=('x', 'y', 'z')), coords=A.coords)

B = rs.encode(A)

_, _, eofs = svd(B.data)

# eofs is now a 2D dask array so we need to give
# it dimension information
eof_dims = ['mode', 'features']
rs.decode(eofs, eof_dims)

# to decode XArray object we don't need to pass dimension info
rs.decode(B)
```

On the other hand, it would be nice to be able to reshape data through a syntax like `A.reshape.encode(dict(...))`	{ "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
288577529	https://github.com/pydata/xarray/issues/1317#issuecomment-288577529	https://api.github.com/repos/pydata/xarray/issues/1317	MDEyOklzc3VlQ29tbWVudDI4ODU3NzUyOQ==	shoyer 1217238	2017-03-23T00:06:34Z	2017-03-23T00:06:34Z	MEMBER	I've written similar code in the past as well, so I would be pretty supportive of adding a utility class for this. Actually one of my colleagues wrote a virtually identical class for our xarray equivalent in TensorFlow -- take a look at it for some possible alternative API options. For xarray, `.stack()` and `.to_array()`, or `.to_dataframe()` can do most of the heavy lifting instead of manually reshaping. Thanks for the pointer to xlearn, too!	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
288549282	https://github.com/pydata/xarray/issues/1317#issuecomment-288549282	https://api.github.com/repos/pydata/xarray/issues/1317	MDEyOklzc3VlQ29tbWVudDI4ODU0OTI4Mg==	fmaussion 10050469	2017-03-22T21:43:12Z	2017-03-22T21:43:12Z	MEMBER	I personally have no opinion on the subject, but maybe @ajdawson wants to chime in (as the author of the eofs package which includes xarray support).	{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 }	API for reshaping DataArrays as 2D "data matrices" for use in machine learning 216215022
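
The thread describes stacking each variable's feature dimensions and then calling `np.hstack` to build the final 2D matrix, including the case where variables have different numbers of features (humidity below 10 km, temperature between 0 and 16 km). The following is a minimal NumPy-only sketch of that idea, not the `DataMatrix` class from the thread and not an xarray API. The names `to_matrix`, `from_matrix`, and `layout` are hypothetical, and the example data and level counts are made up for illustration.

```python
import numpy as np


def to_matrix(variables):
    """Stack per-variable arrays into one (n_samples, n_features) matrix.

    `variables` maps a name to an array whose first axis is the sample
    axis; all remaining axes are flattened into feature columns.
    Returns the matrix and a layout recording each variable's columns.
    """
    blocks, layout, start = [], {}, 0
    for name, arr in variables.items():
        arr = np.asarray(arr)
        block = arr.reshape(arr.shape[0], -1)  # (n_samples, n_features_i)
        stop = start + block.shape[1]
        # Remember which columns belong to this variable and its feature shape.
        layout[name] = (slice(start, stop), arr.shape[1:])
        blocks.append(block)
        start = stop
    return np.hstack(blocks), layout


def from_matrix(mat, layout):
    """Invert `to_matrix` using the recorded column layout."""
    return {
        name: mat[:, cols].reshape((mat.shape[0],) + shape)
        for name, (cols, shape) in layout.items()
    }


# Hypothetical data: 5 samples, humidity on 10 levels, temperature on 16.
data = {
    "humidity": np.arange(50.0).reshape(5, 10),
    "temperature": np.arange(80.0).reshape(5, 16),
}
mat, layout = to_matrix(data)
restored = from_matrix(mat, layout)
```

Recording the column slices alongside the matrix plays the role that a shared index would play in the proposals above: it is what lets a 2D result be split back into per-variable arrays. Coordinate labels are deliberately left out of this sketch.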

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
