issue_comments
14 rows where issue = 166439490 ("unstack() sorts data alphabetically"), sorted by updated_at descending
**stale[bot]** (NONE) · 2019-01-24T12:43:22Z · https://github.com/pydata/xarray/issues/906#issuecomment-457183389

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity. If this issue remains relevant, please comment here; otherwise it will be marked as closed automatically.
**shoyer** (MEMBER) · 2016-12-28T17:09:23Z · https://github.com/pydata/xarray/issues/906#issuecomment-269507466

@crusaderky can you raise the issue again on the pandas issue tracker (see my comment in https://github.com/pandas-dev/pandas/issues/14903#issuecomment-267779151)? If need be, we can change this separately, but all things being equal I would prefer to keep …
**crusaderky** (MEMBER) · 2016-12-28T13:46:19Z · https://github.com/pydata/xarray/issues/906#issuecomment-269479071

@shoyer, are you happy for me to go ahead and change unstack() to respect the order of the first found series?
**crusaderky** (MEMBER) · 2016-07-23T00:27:49Z · https://github.com/pydata/xarray/issues/906#issuecomment-234687071

Thanks, didn't know. https://gist.github.com/crusaderky/002ba64ee270164931d32ea3366dce1f
**shoyer** (MEMBER) · 2016-07-23T00:24:17Z · https://github.com/pydata/xarray/issues/906#issuecomment-234686759

@crusaderky gist.github.com will render ipynb files, which makes them much easier to view!
**crusaderky** (MEMBER) · 2016-07-23T00:20:41Z · https://github.com/pydata/xarray/issues/906#issuecomment-234686438

Fixed in attachment. The code uses the first found series as the order.
**crusaderky** (MEMBER) · 2016-07-20T16:33:15Z · https://github.com/pydata/xarray/issues/906#issuecomment-234004910

I see. I'll see if I can think of a good way to cope with your two examples. BTW, my code above is buggy, as it blindly assumes that the first dim is also the outermost.
**shoyer** (MEMBER) · 2016-07-20T15:58:15Z · https://github.com/pydata/xarray/issues/906#issuecomment-233994941

Here are two examples where we would need to do pick-by-index on the data no matter what:

1. There is no order for which one or more of the levels would be sorted:
2. Even more pathological: the multi-index doesn't even fill out every value in the cartesian product:
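The concrete examples were not captured in this export. As a hedged illustration of the second case, a hypothetical MultiIndex covering only part of the cartesian product forces unstack() to fill the missing cells, and the result also comes out sorted:

```python
import numpy as np
import pandas as pd

# Hypothetical series: only 3 of the 4 (x, y) combinations are present.
s = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.MultiIndex.from_tuples(
        [("b", "one"), ("a", "two"), ("b", "two")], names=["x", "y"]
    ),
)
df = s.unstack("y")
# The absent combination ("a", "one") becomes NaN, and the row labels
# come out sorted (["a", "b"]) even though "b" appeared first.
```
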
**crusaderky** (MEMBER) · 2016-07-20T09:52:42Z · https://github.com/pydata/xarray/issues/906#issuecomment-233904555

This preamble should be integrated inside unstack():

```python
import operator
from functools import reduce

def proper_unstack(array, dim):
    ...  # function body not captured in this export

proper_unstack(a, 'dim_0')
```
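Since the body of proper_unstack() was lost, here is a hedged sketch of the idea under discussion, unstacking while keeping labels in order of first appearance. This is a pandas-only illustration, not the original xarray implementation; the function name and signature are reused only for continuity:

```python
import pandas as pd

def proper_unstack(series, level):
    """Unstack `level` of a 2-level MultiIndex Series, keeping row and
    column labels in order of first appearance instead of sorted order.
    (Illustrative sketch, not the original xarray code.)"""
    idx = series.index
    # Index.unique() preserves order of first appearance.
    col_order = idx.get_level_values(level).unique()
    row_name = [n for n in idx.names if n != level][0]  # assumes 2 levels
    row_order = idx.get_level_values(row_name).unique()
    return series.unstack(level).reindex(index=row_order, columns=col_order)

s = pd.Series(
    [1, 2, 3, 4],
    index=pd.MultiIndex.from_tuples(
        [("y", "b"), ("y", "a"), ("x", "b"), ("x", "a")], names=["r", "c"]
    ),
)
out = proper_unstack(s, "c")
# out.index follows appearance order ["y", "x"], not the sorted ["x", "y"]
```

The reindex step is the "pick-by-index on the index" approach mentioned below: only the labels are reordered, after the usual unstack has run.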
**crusaderky** (MEMBER) · 2016-07-20T08:42:19Z · https://github.com/pydata/xarray/issues/906#issuecomment-233888081

> the order of appearance should be what dictates the output.

Not true. Using the order of appearance requires you to do a pick-by-index on the index. At the moment, you're doing a pick-by-index on the data.
**shoyer** (MEMBER) · 2016-07-19T23:29:57Z · https://github.com/pydata/xarray/issues/906#issuecomment-233797167

This is true, but in the worst case (e.g., random order for the MultiIndex) we'll have this issue no matter what rule we pick for assigning unstacked coordinates.

MultiIndex should work with dask -- we have a few tests for this. If not, a bug report would be appreciated!
**shoyer** (MEMBER) · 2016-07-19T23:26:33Z · https://github.com/pydata/xarray/issues/906#issuecomment-233796557

What behavior would you suggest as an alternative? I suppose that in principle we could assign new levels based on order of appearance (and treat …
**crusaderky** (MEMBER) · 2016-07-19T23:11:57Z · https://github.com/pydata/xarray/issues/906#issuecomment-233794061

this workaround works:

However, I think that the whole thing is incredibly convoluted. Namely, because everything looks good both if you visualize the original pandas Series/DataFrame and if you visualize the stacked DataArray, unstack() is causing an internal technicality of pandas to produce a real change in the data.

I came across this issue because I am using pandas to load a multi-index CSV from disk, and then convert it to an n-dimensional xarray. In this situation, I have no control over the multi-index, short of manually rebuilding it after the CSV load. The pandas dataframe looks right, the stacked xarray looks right, the unstacked xarray gets magically sorted :$

Also, I don't understand why you say there are no performance implications. You're basically doing a pick-by-index rebuild of the array, which does potentially random access to the whole input array, thus nullifying the benefits of the CPU cache. This is compared to a numpy.ndarray.reshape(), which has the cost of a memcpy().

I was going to add something about how doing pick-by-index with a dask array will be even worse, when I realised that a multi-index does not work at all when you chunk()... :(
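The workaround itself was not captured in this export. One way to get the same effect (an assumption, not necessarily the original workaround) is the manual rebuilding mentioned above: reconstruct the MultiIndex so each level follows order of first appearance before unstacking.

```python
import pandas as pd

# Hypothetical CSV-style data whose labels are not in alphabetical order.
s = pd.Series(
    [10, 20, 30, 40],
    index=pd.MultiIndex.from_tuples(
        [("zulu", 1), ("zulu", 2), ("alpha", 1), ("alpha", 2)],
        names=["station", "run"],
    ),
)

# Rebuild each level as a Categorical whose categories follow first
# appearance, so unstack() keeps that order instead of sorting.
s.index = pd.MultiIndex.from_arrays(
    [
        pd.Categorical(
            s.index.get_level_values(name),
            categories=s.index.get_level_values(name).unique(),
        )
        for name in s.index.names
    ],
    names=s.index.names,
)

df = s.unstack("run")
# df.index is ["zulu", "alpha"] rather than the sorted ["alpha", "zulu"]
```
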
**shoyer** (MEMBER) · 2016-07-19T21:45:33Z · https://github.com/pydata/xarray/issues/906#issuecomment-233776163

By default, pandas.MultiIndex creates each level in …
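The comment above is cut off, but given the rest of the thread it presumably ends "... in sorted order". That default is easy to demonstrate:

```python
import pandas as pd

# Levels are created in sorted order, regardless of the input order.
idx = pd.MultiIndex.from_tuples([("b", 2), ("a", 1)])
# idx.levels[0] is Index(["a", "b"]), not ["b", "a"]
```
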