issue_comments


9 rows where author_association = "MEMBER" and issue = 245624267 sorted by updated_at descending


Issue: lazily load dask arrays to dask data frames by calling to_dask_dataframe (#1489)

Commenters: shoyer (6), jhamman (2), mrocklin (1)
shoyer (MEMBER) · 2017-10-28T00:21:48Z · https://github.com/pydata/xarray/pull/1489#issuecomment-340125534

@jmunroe Thanks for your help here! I'm going to merge this now and take care of my remaining clean-up requests in a follow-on PR.

shoyer (MEMBER) · 2017-10-27T07:28:02Z · https://github.com/pydata/xarray/pull/1489#issuecomment-339894999

Just pushed a couple of commits, which should resolve the failures on Windows. It was typical int32 vs int64 NumPy on Windows nonsense.
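
The underlying issue is generic: NumPy's default integer is 32-bit on Windows (the C long) but 64-bit on Linux and macOS. A minimal illustration of the usual fix, pinning the dtype explicitly (not the actual commits from this PR):

```python
import numpy as np

# np.arange with no dtype uses the platform default integer:
# int32 on Windows, int64 on Linux/macOS. Pinning the dtype
# makes comparisons and test expectations platform-independent.
idx = np.arange(5, dtype=np.int64)
print(idx.dtype)
```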

shoyer (MEMBER) · 2017-10-21T18:49:57Z · https://github.com/pydata/xarray/pull/1489#issuecomment-338424196

@mrocklin are you saying that it's easier to properly rechunk data on the xarray side (as arrays) before converting to dask dataframes? That does make sense -- we have some nice structure (as multi-dimensional arrays) that is lost once the data gets put in a DataFrame.

In this case, I suppose we really should add a keyword argument like dims_order to to_dask_dataframe() that lets the user choose how they want to order dimensions on the result.

Initially, I was concerned about the resulting dask graphs when flattening out arrays in the wrong order. Although that would have bad performance implications if you need to stream the data from disk, I see now the total number of chunks no longer blows up, thanks to @pitrou's impressive rewrite of dask.array.reshape().

mrocklin (MEMBER) · 2017-10-21T12:47:34Z · https://github.com/pydata/xarray/pull/1489#issuecomment-338392039

I think that you would want to rechunk the dask.array so that its chunks align with the output divisions of the dask.dataframe. For example, if you have a 2d array and are partitioning along the x-axis, then you will want to rechunk the array so that there is no chunking along the y-axis. In this case set_index will also be free, because your data is already aligned and you already know (I think) the division values.

shoyer (MEMBER) · 2017-10-21T06:33:27Z · https://github.com/pydata/xarray/pull/1489#issuecomment-338368158

@jcrist @mrocklin @jhamman do any of you have opinions on my latest design question above about the order of elements in dask dataframes? Is it as important as I suspect to keep chunking/divisions consistent when converting from arrays to dataframes?

jhamman (MEMBER) · 2017-10-09T22:29:45Z · https://github.com/pydata/xarray/pull/1489#issuecomment-335307599

@jmunroe - can we help move this forward? I'd like to see this get into v0.10 if possible.

jhamman (MEMBER) · 2017-09-05T20:55:25Z · https://github.com/pydata/xarray/pull/1489#issuecomment-327300551

@jmunroe -

I added the PR checklist back to the top of this issue. The most pressing to-do item is getting some documentation written for this.

  • The method will need to be added to api.rst
  • We need a note briefly describing this feature in whats-new.rst
  • We'll want to show an example of how this method can be used (either in the working with pandas or the dask doc sections)
shoyer (MEMBER) · 2017-08-10T06:22:02Z · https://github.com/pydata/xarray/pull/1489#issuecomment-321461998

@jmunroe This is great functionality -- thanks for your work on this!

One concern: if possible, I would like to avoid adding explicit dask graph building code in xarray. It looks like the canonical way to transform a list of dask/numpy arrays into a dask dataframe is to make use of dask.dataframe.from_array along with dask.dataframe.concat:

```
In [34]: import numpy as np

In [35]: import dask.dataframe as dd

In [36]: import dask.array as da

In [37]: x = da.from_array(np.arange(5), 2)

In [38]: y = da.from_array(np.linspace(-np.pi, np.pi, 5), 2)

# notice that dtype is preserved properly

In [39]: dd.concat([dd.from_array(x), dd.from_array(y)], axis=1)
Out[39]:
Dask DataFrame Structure:
                   0        1
npartitions=2
0              int64  float64
2                ...      ...
4                ...      ...
Dask Name: concat-indexed, 26 tasks
```

Can you look into refactoring your code to make use of these?

shoyer (MEMBER) · 2017-07-27T02:58:35Z · https://github.com/pydata/xarray/pull/1489#issuecomment-318244646

Given that dask dataframes don't support MultiIndexes (among many other features), I have a hard time seeing them as a drop-in replacement for pandas.DataFrame. So maybe it would make sense to make this a separate method, e.g., to_dask_dataframe()?

We could also use a new method as an opportunity to slightly change the API, by not setting an index automatically. This lets us handle N-dimensional data while side-stepping the issue of MultiIndex support -- I don't think this would be very useful when limited to 1D arrays, and dask MultiIndex support seems to be a ways away (https://github.com/dask/dask/issues/1493). Also, set_index() in dask shuffles data, so it can be somewhat expensive.


Table schema:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
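
As a check on the schema, the row listing at the top of the page corresponds to a filter on author_association and issue. A self-contained sketch using Python's sqlite3, with the foreign-key clauses dropped and a single fabricated row (illustrative values only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER,
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER
);
""")
conn.execute(
    "INSERT INTO issue_comments (id, user, author_association, issue, updated_at, body)"
    " VALUES (?, ?, ?, ?, ?, ?)",
    (340125534, 1217238, "MEMBER", 245624267, "2017-10-28T00:21:48Z", "..."),
)

# The query behind the page's row listing.
rows = conn.execute(
    "SELECT id FROM issue_comments"
    " WHERE author_association = 'MEMBER' AND issue = 245624267"
    " ORDER BY updated_at DESC"
).fetchall()
print(rows)
```
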
Powered by Datasette · About: xarray-datasette