issue_comments: 521322678
| field | value |
|---|---|
| html_url | https://github.com/pydata/xarray/issues/3216#issuecomment-521322678 |
| issue_url | https://api.github.com/repos/pydata/xarray/issues/3216 |
| id | 521322678 |
| node_id | MDEyOklzc3VlQ29tbWVudDUyMTMyMjY3OA== |
| user | 7360639 |
| created_at | 2019-08-14T16:38:07Z |
| updated_at | 2019-08-14T16:38:07Z |
| author_association | NONE |
| reactions | total_count: 0 |
| issue | 480753417 |
body:

Hi, I did actually just see this. It would solve the unevenly sampled data part, but I really need to identify the unphysical values that are not tagged by the quality flags first. Once that has been done, resampling and interpolation would be great; otherwise I would be spreading the effect of bad data.

For the particular set of data I am looking at, I often get individual points that sit close to the time series but are clearly outliers from it, so examining a rolling mean would help find these. That is the example I was hoping to solve with this query, but I have already realised that it extends to other problems I will encounter: sudden jumps in the time series (for which I have been recommended to calculate rolling correlation coefficients between two time series), and multiple points jumping all over the place (for which I will probably compare the variance of groups of points and a rolling gradient). (I really don't know why these aren't cleaned better first, but unfortunately that is the way things are.)

Because I need to clean the data before any analysis, the resampling method alone would probably let me get rid of most, but not all, of the bad data; I would then have to be extra cautious and throw out lots of possibly good observations just in case. I will definitely use resampling for the analysis, but there are many ways this would be helpful at the processing stage too.
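As an illustration of the rolling-mean check described in the comment above, here is a minimal sketch using xarray's rolling API. The synthetic series, the 25-point window, and the 3σ cutoff are all assumptions for the example, not anything from this issue:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic hourly series with a few injected spikes standing in for the
# unphysical values the quality flags miss.
rng = np.random.default_rng(0)
values = np.sin(np.linspace(0, 20, 500)) + 0.1 * rng.standard_normal(500)
values[[50, 200, 333]] += 5.0
da = xr.DataArray(
    values,
    dims="time",
    coords={"time": pd.date_range("2019-01-01", periods=500, freq="h")},
)

# Centred rolling window; the length and min_periods would need tuning.
roll = da.rolling(time=25, center=True, min_periods=5)
resid = da - roll.mean()

# Flag points more than 3 local standard deviations from the rolling mean.
outliers = abs(resid) > 3 * roll.std()

# Mask flagged points so later resampling/interpolation can fill them.
cleaned = da.where(~outliers)
```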
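For the sudden-jump case, pandas already provides a rolling correlation (`Series.rolling(...).corr`). A rough sketch with two made-up series that share a common signal, one of which takes a step the other does not; the 48-sample window and 0.5 threshold are arbitrary and data-dependent:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
shared = np.cumsum(rng.standard_normal(500))         # signal both series see
s1 = pd.Series(shared + 0.2 * rng.standard_normal(500))
s2 = pd.Series(shared + 0.2 * rng.standard_normal(500))
s1.iloc[300:] += 4.0                                 # sudden jump in one series only

# Rolling Pearson correlation between the two series.
rolling_corr = s1.rolling(window=48, min_periods=24).corr(s2)

# Windows spanning the jump lose correlation with the unaffected series.
suspect = rolling_corr < 0.5
```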
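And for stretches where multiple points jump all over the place, rolling variance plus a gradient along time, reusing the `da` from the first sketch; both cutoffs are again placeholders:

```python
# Rolling variance highlights locally noisy stretches.
rolling_var = da.rolling(time=25, center=True, min_periods=5).var()
noisy = rolling_var > 4 * rolling_var.median()  # illustrative cutoff

# Pointwise gradient along the time coordinate (per second for datetimes).
grad = da.differentiate("time", datetime_unit="s")
large_step = abs(grad) > 5 * abs(grad).median()
```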
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
480753417 |