
issue_comments


3 rows where user = 7360639, sorted by updated_at descending




Facets

  • issue: Feature request: time-based rolling window functionality (2); to_netcdf very slow for some single character data types (1)
  • user: snbentley (3)
  • author_association: NONE (3)
id: 650091343
html_url: https://github.com/pydata/xarray/issues/4180#issuecomment-650091343
issue_url: https://api.github.com/repos/pydata/xarray/issues/4180
node_id: MDEyOklzc3VlQ29tbWVudDY1MDA5MTM0Mw==
user: snbentley (7360639)
created_at: 2020-06-26T09:45:39Z
updated_at: 2020-06-26T09:45:39Z
author_association: NONE
issue: to_netcdf very slow for some single character data types (645443880)
reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
body:

Ah, that is a much better compromise - it's still slower for my own much larger dataset but is definitely manageable now. I think this is what I was trying to find originally when I ended up using |S1.

As the problem was my usage of encoding / netCDF4's slow variable-length strings, and you've given me a good workaround, I'll close this. Thanks for your help!

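Note: the slow path discussed in this comment is netCDF4 variable-length string storage. The sketch below is a minimal, hypothetical illustration of the kind of encoding control involved - forcing a fixed-width character dtype on disk - and is not the exact workaround from the thread; the variable, data, and file names are invented.

# Minimal sketch (hypothetical variable, data, and file names): write a
# single-character variable with an explicit fixed-width encoding rather
# than as netCDF4 variable-length strings.
import numpy as np
import xarray as xr

flags = np.array(["a", "b", "c"] * 1000)  # dtype '<U1', one char per value
ds = xr.Dataset({"flag": ("time", flags)},
                coords={"time": np.arange(flags.size)})

# Default write: unicode data can be stored as variable-length strings,
# the slow case described above.
ds.to_netcdf("flags_default.nc")

# Workaround: request a fixed-width 1-byte character dtype on disk.
ds.to_netcdf("flags_char.nc", encoding={"flag": {"dtype": "S1"}})
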
id: 618332209
html_url: https://github.com/pydata/xarray/issues/3216#issuecomment-618332209
issue_url: https://api.github.com/repos/pydata/xarray/issues/3216
node_id: MDEyOklzc3VlQ29tbWVudDYxODMzMjIwOQ==
user: snbentley (7360639)
created_at: 2020-04-23T10:52:40Z
updated_at: 2020-04-23T10:52:40Z
author_association: NONE
issue: Feature request: time-based rolling window functionality (480753417)
reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
body:

This would still be very useful to me in future - for the piece of work I was referring to here, I came up with a workaround. I filled in the gaps roughly with NaNs, so that I could identify and remove outliers and other bad data. Only then could I use the resample functionality without smearing these artefacts across good data.

However, my solution was quite clunky and slow, and it relied on the still-mostly-regular resolution of my dataset rather than any neater general solution in pandas. As I was (and am) also relatively new to Python, I did not think it appropriate to add this to xarray myself, but I would definitely use this functionality in future - as would the other colleagues in space physics/meteorology I mentioned it to.

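Note: a minimal sketch of the gap-filling workaround described in this comment, assuming a pandas Series on a mostly-regular DatetimeIndex; the cadence, window length, and outlier threshold are invented, not values from the thread.

# Sketch of the workaround: make gaps explicit as NaNs, flag outliers
# with a time-based rolling statistic, and only then resample.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
times = pd.date_range("2020-01-01", periods=500, freq="10s")
series = pd.Series(rng.normal(size=500), index=times)
series = series.drop(series.index[100:120])  # knock a gap into the data

# Step 1: reindex onto the nominal cadence so gaps become explicit NaNs
# instead of being silently bridged.
grid = pd.date_range(series.index[0], series.index[-1], freq="10s")
regular = series.reindex(grid)

# Step 2: flag outliers against a time-based rolling median and blank
# them out before any interpolation or averaging.
resid = (regular - regular.rolling("5min").median()).abs()
cleaned = regular.mask(resid > 3 * regular.std())

# Step 3: only now resample, so bad points are not smeared into
# neighbouring good data.
result = cleaned.resample("1min").mean()
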
id: 521322678
html_url: https://github.com/pydata/xarray/issues/3216#issuecomment-521322678
issue_url: https://api.github.com/repos/pydata/xarray/issues/3216
node_id: MDEyOklzc3VlQ29tbWVudDUyMTMyMjY3OA==
user: snbentley (7360639)
created_at: 2019-08-14T16:38:07Z
updated_at: 2019-08-14T16:38:07Z
author_association: NONE
issue: Feature request: time-based rolling window functionality (480753417)
reactions: {"total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0}
body:

Hi, I did actually just see this - it would solve the unevenly sampled data part, but really I need to identify the unphysical values that are not tagged by the quality flags first. Once that has been done, resampling and interpolation would be great - but otherwise I will be spreading the effect of bad data.

For this particular set of data, I often get individual points which are close to, but clearly outliers from, the time series, so examining a rolling mean would help find these. That is the example I was hoping to solve with this query, but I have already realised that this extends to other problems I will encounter: for example, sudden jumps in the time series (for which I have been recommended to calculate rolling correlation coefficients between two time series) and multiple points jumping all over the place (for which I will probably compare the variance of groups of points and a rolling gradient).

(I really don't know why these aren't cleaned better first, but unfortunately that is the way things are.)

Because I need to clean the data before any analysis, the resampling method would probably let me get rid of most, but not all, of the bad data. Then I would have to be extra cautious and throw out lots of possibly good observations just in case. I will definitely use resampling for the analysis, but there are so many ways this would be helpful at the processing stage.

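Note: the rolling diagnostics mentioned in this comment map directly onto pandas' time-based windows. The sketch below is illustrative only; the series, window lengths, and thresholds are invented.

# Sketch of the diagnostics mentioned above: rolling correlation for
# sudden jumps, rolling variance for scattered bursts of points.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
idx = pd.date_range("2020-01-01", periods=1000, freq="5s")
a = pd.Series(np.cumsum(rng.normal(size=1000)), index=idx)
b = a + rng.normal(scale=0.1, size=1000)  # a second, related series

# Rolling correlation between two series: a sudden drop can flag a jump
# present in one series but not the other.
corr = a.rolling("2min").corr(b)

# Rolling variance: bursts of points jumping all over the place show up
# as spikes in the windowed variance.
var = a.rolling("2min").var()

# Candidate bad intervals under these (made-up) thresholds.
mask = (corr < 0.5) | (var > var.quantile(0.99))
suspect = a.index[mask.to_numpy()]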

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
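
Note: given this schema, the page's query (3 rows where user = 7360639, sorted by updated_at descending) can be reproduced against a local SQLite copy of the database; the filename below is an assumption.

# Sketch: rerun this page's query against a local copy of the database
# (the filename "github.db" is an assumption).
import sqlite3

conn = sqlite3.connect("github.db")
rows = conn.execute(
    "SELECT id, html_url, created_at, updated_at "
    "FROM issue_comments "
    "WHERE user = ? "
    "ORDER BY updated_at DESC",
    (7360639,),
).fetchall()
for comment_id, html_url, created, updated in rows:
    print(comment_id, updated, html_url)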