
issue_comments


8 rows where author_association = "CONTRIBUTOR" and issue = 842940980 sorted by updated_at descending




id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
830508669 https://github.com/pydata/xarray/pull/5089#issuecomment-830508669 https://api.github.com/repos/pydata/xarray/issues/5089 MDEyOklzc3VlQ29tbWVudDgzMDUwODY2OQ== ahuang11 15331990 2021-05-01T03:25:47Z 2021-05-01T03:25:47Z CONTRIBUTOR

I failed to commit properly, so see https://github.com/pydata/xarray/pull/5239, where I only do drop duplicates for dims.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add drop duplicates 842940980
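The dims-only approach mentioned in the comment above (keep the first occurrence of each value along a dimension) can be sketched with plain NumPy; this is a toy illustration of the technique, not the PR's actual code:

```python
import numpy as np

# Keep-first de-duplication along one dimension: np.unique with
# return_index gives the index of the first occurrence of each value.
values = np.array([0, 0, 1, 2, 1])
_, index = np.unique(values, return_index=True)
index = np.sort(index)  # restore original ordering of the survivors
print(index)            # [0 2 3]
print(values[index])    # [0 1 2]
```

Sorting the first-occurrence indices preserves the original order of the kept elements rather than the sorted order `np.unique` would otherwise impose.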
830274959 https://github.com/pydata/xarray/pull/5089#issuecomment-830274959 https://api.github.com/repos/pydata/xarray/issues/5089 MDEyOklzc3VlQ29tbWVudDgzMDI3NDk1OQ== ahuang11 15331990 2021-04-30T18:16:59Z 2021-04-30T18:17:21Z CONTRIBUTOR

I can take a look this weekend. If we go with the narrow version, we could simply roll back to this commit, make minor adjustments, and merge: https://github.com/pydata/xarray/pull/5089/commits/28aa96ab13db72bfa6ad8b156c2c720b49ec9a04

But I personally prefer the full version, so it'd be nice if we could come to a consensus on how to handle it~

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add drop duplicates 842940980
822094033 https://github.com/pydata/xarray/pull/5089#issuecomment-822094033 https://api.github.com/repos/pydata/xarray/issues/5089 MDEyOklzc3VlQ29tbWVudDgyMjA5NDAzMw== ahuang11 15331990 2021-04-19T00:19:17Z 2021-04-19T00:19:17Z CONTRIBUTOR

> @ahuang11 IIUC, this is only using .stack where it needs to actually stack the array, is that correct? So if a list of dims is passed (rather than non-dim coords), then it's not stacking.
>
> I agree with @shoyer that we could do it in a single isel in the basic case. One option is to have a fast path for non-dim coords only, and call isel once with those.

Yes, correct. I am not feeling well at the moment, so I probably won't get to this today, but feel free to make commits!

{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add drop duplicates 842940980
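The single-isel fast path discussed in the exchange above could look roughly like this (a hypothetical sketch with toy data, assuming de-duplication on a non-dimension coordinate without calling .stack):

```python
import numpy as np
import xarray as xr

# A DataArray with a non-dimension coordinate "station" along "time".
da = xr.DataArray(
    [1.0, 2.0, 3.0, 4.0],
    dims="time",
    coords={"station": ("time", ["a", "a", "b", "c"])},
)

# Compute first-occurrence indices once, then select with one isel call.
_, index = np.unique(da["station"].values, return_index=True)
deduped = da.isel(time=np.sort(index))
print(deduped["station"].values)  # ['a' 'b' 'c']
print(deduped.values)             # [1. 3. 4.]
```

Because only one fancy-index selection happens, no intermediate stacked array is materialized.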
813782823 https://github.com/pydata/xarray/pull/5089#issuecomment-813782823 https://api.github.com/repos/pydata/xarray/issues/5089 MDEyOklzc3VlQ29tbWVudDgxMzc4MjgyMw== ahuang11 15331990 2021-04-06T02:48:00Z 2021-04-06T02:48:00Z CONTRIBUTOR

Not sure if there's a more elegant way of implementing this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add drop duplicates 842940980
813179511 https://github.com/pydata/xarray/pull/5089#issuecomment-813179511 https://api.github.com/repos/pydata/xarray/issues/5089 MDEyOklzc3VlQ29tbWVudDgxMzE3OTUxMQ== ahuang11 15331990 2021-04-05T04:48:09Z 2021-04-05T04:48:09Z CONTRIBUTOR

Oh I just saw the edits with keeping the dims. I guess that would work.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add drop duplicates 842940980
813169922 https://github.com/pydata/xarray/pull/5089#issuecomment-813169922 https://api.github.com/repos/pydata/xarray/issues/5089 MDEyOklzc3VlQ29tbWVudDgxMzE2OTkyMg== ahuang11 15331990 2021-04-05T04:09:26Z 2021-04-05T04:09:26Z CONTRIBUTOR

I prefer dropping duplicate values to be under the unique() PR; maybe it could be renamed drop_duplicate_values().

Also, I think preserving existing dimensions is more powerful than flattening the dimensions.

On Sun, Apr 4, 2021, 11:01 PM Stephan Hoyer @.***> wrote:

> From an API perspective, I think the name drop_duplicates() would be fine. I would guess that handling arbitrary variables in a Dataset would not be any harder than handling only coordinates?
>
> One thing that is a little puzzling to me is how deduplicating across multiple dimensions is handled. It looks like this function preserves existing dimensions, but inserts NA if the arrays would be ragged? This seems a little strange to me. I think it could make more sense to "flatten" all dimensions in the contained variables into a new dimension when dropping duplicates.
>
> This would require specifying the name for the new dimension(s), but perhaps that could work by switching to the de-duplicated variable name? For example, ds.drop_duplicates('valid') on the example in the PR description would result in a "valid" coordinate/dimension of length 3.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add drop duplicates 842940980
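The "flatten" alternative raised in the quoted email above can be sketched with NumPy alone (toy data; the coordinate name "valid" follows the example in the email, everything else is assumed for illustration):

```python
import numpy as np

# A 2-D "valid" coordinate containing a repeated value, alongside data
# of the same shape.  Flatten both into one dimension, then keep the
# first occurrence of each coordinate value.
valid = np.array([[0, 1], [1, 2]])
data = np.array([[1, 2], [3, 4]])

flat_valid = valid.ravel()  # [0 1 1 2]
_, index = np.unique(flat_valid, return_index=True)
index = np.sort(index)

print(flat_valid[index])    # [0 1 2]  -> "valid" dimension of length 3
print(data.ravel()[index])  # [1 2 4]
```

Unlike the dims-preserving approach, no NA padding is needed, at the cost of losing the original 2-D structure.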
809814710 https://github.com/pydata/xarray/pull/5089#issuecomment-809814710 https://api.github.com/repos/pydata/xarray/issues/5089 MDEyOklzc3VlQ29tbWVudDgwOTgxNDcxMA== ahuang11 15331990 2021-03-30T00:27:20Z 2021-04-04T22:26:02Z CONTRIBUTOR

> Thanks for the PR @ahuang11!
>
> I think the method could be really useful. Does anyone else have thoughts?
>
> One important decision is whether this should operate on dimensioned coords or all coords (or even any array?). My guess would be that we could start with dimensioned coords, given those are the most likely use case, and we could extend to non-dimensioned coords later.
>
> (Here's a glossary, as the terms can get confusing: http://xarray.pydata.org/en/stable/terminology.html)

~~Let's start with just dims for now.~~

Okay, since I had some time, I decided to do coords too.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add drop duplicates 842940980
809822634 https://github.com/pydata/xarray/pull/5089#issuecomment-809822634 https://api.github.com/repos/pydata/xarray/issues/5089 MDEyOklzc3VlQ29tbWVudDgwOTgyMjYzNA== ahuang11 15331990 2021-03-30T00:52:18Z 2021-03-30T00:52:18Z CONTRIBUTOR

Not sure how to fix this:

```
xarray/core/dataset.py:7111: error: Keywords must be strings
Found 1 error in 1 file (checked 138 source files)
```

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Add drop duplicates 842940980
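The mypy error quoted above typically arises when a dict whose keys are typed as Hashable (not str) is unpacked into **kwargs; keyword argument names must be strings. A hypothetical sketch of the pattern and the usual workaround (isel_like and its names are made up for illustration; xarray's isel also accepts an indexers dict positionally):

```python
from typing import Hashable

def isel_like(indexers=None, **indexers_kwargs):
    # Mimics xarray's pattern of merging positional and keyword indexers.
    merged = dict(indexers or {})
    merged.update(indexers_kwargs)
    return merged

dim: Hashable = "time"
# isel_like(**{dim: [0, 2]})    # mypy: error: Keywords must be strings
print(isel_like({dim: [0, 2]})) # OK: pass the dict positionally
```

Passing the dict positionally keeps Hashable dimension names (e.g. tuples) working and satisfies the type checker.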


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);