issue_comments

9 rows where issue = 568968607 sorted by updated_at descending

user 4

  • max-sixty 4
  • takluyver 3
  • dcherian 1
  • crusaderky 1

issue 1

  • DataArray.unstack() leaving dimensions 'in order' 9

author_association 1

  • MEMBER 9
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
590024174 https://github.com/pydata/xarray/issues/3786#issuecomment-590024174 https://api.github.com/repos/pydata/xarray/issues/3786 MDEyOklzc3VlQ29tbWVudDU5MDAyNDE3NA== dcherian 2448579 2020-02-23T04:03:06Z 2020-02-23T04:03:06Z MEMBER

This sounds useful to me

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.unstack() leaving dimensions 'in order' 568968607
590012159 https://github.com/pydata/xarray/issues/3786#issuecomment-590012159 https://api.github.com/repos/pydata/xarray/issues/3786 MDEyOklzc3VlQ29tbWVudDU5MDAxMjE1OQ== max-sixty 5635139 2020-02-23T00:18:34Z 2020-02-23T00:18:34Z MEMBER

I think it should be pretty reasonable to do that in a few lines, for numpy-backed arrays at least! Get the underlying numpy array, sort the .strides to get the memory order, map to dimension names, and pass them to transpose?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.unstack() leaving dimensions 'in order' 568968607
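The strides-based idea in the comment above can be sketched roughly as follows. This is only an illustration for numpy-backed arrays: `transpose_to_memory_order` is a hypothetical helper name, not an existing xarray method, and the sketch ignores edge cases such as length-1 dimensions, whose strides carry no layout information.

```python
import numpy as np
import xarray as xr


def transpose_to_memory_order(da: xr.DataArray) -> xr.DataArray:
    """Hypothetical helper: order dims from largest stride to smallest."""
    arr = da.data
    if not isinstance(arr, np.ndarray):
        raise TypeError("this sketch only handles numpy-backed arrays")
    # Axis positions sorted by stride magnitude, largest first.
    # sorted() is stable, so axes with equal strides keep their current order.
    order = sorted(range(arr.ndim), key=lambda i: -abs(arr.strides[i]))
    # Map axis positions to dimension names and let transpose() do the rest.
    return da.transpose(*(da.dims[i] for i in order))


# A C-contiguous buffer presented with its axes shuffled.
base = np.arange(24).reshape(2, 3, 4)
shuffled = xr.DataArray(base.transpose(2, 0, 1), dims=("c", "a", "b"))
restored = transpose_to_memory_order(shuffled)
print(restored.dims)
```

Because the result is obtained by transposing rather than copying, the dimensions end up in the order that matches the existing memory layout, which is what the "make the data contiguous" request in this thread asks for.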
590001709 https://github.com/pydata/xarray/issues/3786#issuecomment-590001709 https://api.github.com/repos/pydata/xarray/issues/3786 MDEyOklzc3VlQ29tbWVudDU5MDAwMTcwOQ== takluyver 327925 2020-02-22T21:50:46Z 2020-02-22T21:50:46Z MEMBER

If it's easier, a method for "transpose these dimensions to the order which makes the data contiguous" would meet my needs.

I'm happy in principle to work on either feature, but when I've looked into contributing to xarray before, I've been a bit overwhelmed by complexity - I think I'm currently using a pretty small fragment of what it can do.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.unstack() leaving dimensions 'in order' 568968607
589985055 https://github.com/pydata/xarray/issues/3786#issuecomment-589985055 https://api.github.com/repos/pydata/xarray/issues/3786 MDEyOklzc3VlQ29tbWVudDU4OTk4NTA1NQ== max-sixty 5635139 2020-02-22T18:31:56Z 2020-02-22T18:31:56Z MEMBER

I think I see your point @takluyver: returning the new axes at the end is both surprising and leaves the array no longer contiguous.

But then +1 re your final point about unstack being unable to return a view on missing data.

I don't think it's impossible to carefully manage the dimension order and contiguity in this case. It may take someone who can champion it, though.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.unstack() leaving dimensions 'in order' 568968607
589956473 https://github.com/pydata/xarray/issues/3786#issuecomment-589956473 https://api.github.com/repos/pydata/xarray/issues/3786 MDEyOklzc3VlQ29tbWVudDU4OTk1NjQ3Mw== takluyver 327925 2020-02-22T13:30:40Z 2020-02-22T13:30:40Z MEMBER

@max-sixty - I certainly want to avoid copying the data unless it's necessary. But I'd like to present the axes in the 'real' memory order, from the largest stride to the smallest.

I appreciate that it shouldn't matter for program logic, but it can definitely matter for performance, and I know some users are going to use axes by position rather than by name, so I do consider it an important part of the API.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.unstack() leaving dimensions 'in order' 568968607
589762619 https://github.com/pydata/xarray/issues/3786#issuecomment-589762619 https://api.github.com/repos/pydata/xarray/issues/3786 MDEyOklzc3VlQ29tbWVudDU4OTc2MjYxOQ== max-sixty 5635139 2020-02-21T17:53:51Z 2020-02-21T17:53:51Z MEMBER

Just so I understand correctly: does numpy's behavior even satisfy your requirements? https://docs.scipy.org/doc/numpy/reference/generated/numpy.swapaxes.html

"then a view of a is returned"

i.e. the array's underlying layout doesn't change, only the view...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.unstack() leaving dimensions 'in order' 568968607
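The point about numpy.swapaxes in the comment above can be checked with a small numpy-only sketch. The array shape and variable names are arbitrary choices for illustration.

```python
import numpy as np

a = np.zeros((2, 3, 4))      # freshly allocated, C-contiguous
b = np.swapaxes(a, 0, 2)     # shape (4, 3, 2)

# The swapped array shares the original buffer: nothing was copied.
is_view = np.shares_memory(a, b)

# Only the strides are permuted; the bytes in memory stay where they were.
strides_permuted = b.strides == a.strides[::-1]

# As a consequence, the swapped array is no longer C-contiguous.
still_c_contiguous = b.flags.c_contiguous

print(is_view, strides_permuted, still_c_contiguous)
```

So numpy keeps the memory layout and changes only how the axes are presented, which is why the axis order of the result stops matching the order from largest stride to smallest.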
589708365 https://github.com/pydata/xarray/issues/3786#issuecomment-589708365 https://api.github.com/repos/pydata/xarray/issues/3786 MDEyOklzc3VlQ29tbWVudDU4OTcwODM2NQ== crusaderky 6213168 2020-02-21T15:42:36Z 2020-02-21T15:42:36Z MEMBER

"This also means that either the new array is no longer C-contiguous, or the .unstack() operation has had to copy all the data to rearrange it."

The former. As a core design principle, xarray does not care about dimension order, and any user code that implicitly relies on it should be considered bad design. The .transpose() method mostly exists for when people need to pass the numpy .data object directly to a numpy function.

{
    "total_count": 2,
    "+1": 2,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.unstack() leaving dimensions 'in order' 568968607
589707231 https://github.com/pydata/xarray/issues/3786#issuecomment-589707231 https://api.github.com/repos/pydata/xarray/issues/3786 MDEyOklzc3VlQ29tbWVudDU4OTcwNzIzMQ== takluyver 327925 2020-02-21T15:40:04Z 2020-02-21T15:40:04Z MEMBER

Thanks, it makes sense that dimension order conceptually doesn't matter so much once they're labelled. Though I'd say that as Xarray has public APIs for accessing dimensions by position, changing how something like .unstack() orders them probably is a breaking change.

I'm providing an API for other people to access data as a DataArray, and I'd like to have the dimensions in the order that makes it C contiguous, as a hint about what kinds of operations will be efficient. I know some users also go 'what is this complicated thing?' and extract the numpy array from it, in which case the order is more important.

So another option that would work for me would be a separate method that rearranges the dimensions to the order that makes the array C contiguous.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.unstack() leaving dimensions 'in order' 568968607
589687492 https://github.com/pydata/xarray/issues/3786#issuecomment-589687492 https://api.github.com/repos/pydata/xarray/issues/3786 MDEyOklzc3VlQ29tbWVudDU4OTY4NzQ5Mg== max-sixty 5635139 2020-02-21T14:55:05Z 2020-02-21T14:55:05Z MEMBER

Thanks for the issue @takluyver

I mostly think (others should weigh in) that we don't spend much effort on dimension order. (Would a dimension order change always be a regression?). That's mostly because of one of xarray's great strengths: you don't need to worry about the order, because you can reference dimensions by name.

I imagine that the unstack dimension order wasn't carefully chosen to be at the end vs. in place, and we'd take a PR to make that change.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  DataArray.unstack() leaving dimensions 'in order' 568968607
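The "at the end vs in place" behaviour discussed in this thread can be illustrated with a small sketch. The dimension names and shape are made up for the example, and the ordering shown is the behaviour the issue describes, so it may differ in xarray versions that change it.

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.zeros((2, 3, 4)), dims=("x", "y", "z"))

# Put the stacked dimension first, so "in place" and "at the end" would differ.
stacked = da.stack(xy=("x", "y")).transpose("xy", "z")

# Unstacking appends the new dimensions after the remaining ones.
unstacked = stacked.unstack("xy")

# Referencing dimensions by name restores any desired order explicitly.
reordered = unstacked.transpose("x", "y", "z")

print(stacked.dims, unstacked.dims, reordered.dims)
```

Code that addresses dimensions by name is unaffected by this, while code that indexes axes by position sees "z" move to the front after the round trip.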

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
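
The row selection described at the top of this page (comments where issue = 568968607, sorted by updated_at descending) can be reproduced against this schema with Python's standard sqlite3 module. This is a minimal sketch: the in-memory database, the `comments_for_issue` helper, and the choice to load only two of the nine rows are assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
"""


def comments_for_issue(conn, issue_id):
    """Hypothetical helper: (id, user, updated_at) rows, newest update first."""
    query = (
        "SELECT id, user, updated_at FROM issue_comments "
        "WHERE issue = ? ORDER BY updated_at DESC"
    )
    return conn.execute(query, (issue_id,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Two of the rows shown above, reduced to the columns the query needs.
conn.executemany(
    "INSERT INTO issue_comments (id, user, updated_at, issue) VALUES (?, ?, ?, ?)",
    [
        (589687492, 5635139, "2020-02-21T14:55:05Z", 568968607),
        (590024174, 2448579, "2020-02-23T04:03:06Z", 568968607),
    ],
)
rows = comments_for_issue(conn, 568968607)
print(rows)
```

The timestamps are stored as TEXT, and ISO 8601 strings in the same format and time zone sort correctly as plain text, so ORDER BY updated_at works without any date conversion.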