issue_comments

3 rows where author_association = "NONE", issue = 548475127 and user = 3922329 sorted by updated_at descending

id: 573455625
html_url: https://github.com/pydata/xarray/issues/3686#issuecomment-573455625
issue_url: https://api.github.com/repos/pydata/xarray/issues/3686
node_id: MDEyOklzc3VlQ29tbWVudDU3MzQ1NTYyNQ==
user: dmedv (3922329)
created_at: 2020-01-12T20:48:20Z
updated_at: 2020-01-12T20:51:01Z
author_association: NONE
body:

Actually, there is no need to separate them. One can simply do something like this to apply the mask:

ds.analysed_sst.where(ds.analysed_sst != fill_value).mean() * scale_factor + offset

It's not a bug, but if we set mask_and_scale=False, it's left up to us to apply the mask manually.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: Different data values from xarray open_mfdataset when using chunks (548475127)
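
The comment above compresses the manual mask-and-scale approach into a single expression. Below is a minimal, self-contained sketch of the same idea; the file name is a placeholder, and it assumes that with mask_and_scale=False the packing attributes (_FillValue, scale_factor, add_offset) are still available in the variable's .attrs.

import xarray as xr

# Placeholder file name; any GHRSST-style file with a packed analysed_sst variable.
ds = xr.open_dataset("analysed_sst.nc", mask_and_scale=False)
var = ds.analysed_sst

# With automatic decoding disabled, the packing metadata is assumed to stay in .attrs.
fill_value = var.attrs["_FillValue"]
scale_factor = var.attrs["scale_factor"]
add_offset = var.attrs["add_offset"]

# Mask the fill value first, reduce, then scale the scalar result.
mean_sst = var.where(var != fill_value).mean() * scale_factor + add_offset
print(float(mean_sst))

The same mask_and_scale=False keyword can also be passed to open_mfdataset, which forwards extra keyword arguments to open_dataset.
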
id: 573451230
html_url: https://github.com/pydata/xarray/issues/3686#issuecomment-573451230
issue_url: https://api.github.com/repos/pydata/xarray/issues/3686
node_id: MDEyOklzc3VlQ29tbWVudDU3MzQ1MTIzMA==
user: dmedv (3922329)
created_at: 2020-01-12T19:59:31Z
updated_at: 2020-01-12T20:25:16Z
author_association: NONE
body:

@abarciauskas-bgse Yes, indeed, I forgot about _FillValue. That would mess up the mean calculation with mask_and_scale=False. I think it would be nice if it were possible to control the mask application in open_dataset separately from scale/offset.

reactions:
{
    "total_count": 1,
    "+1": 1,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: Different data values from xarray open_mfdataset when using chunks (548475127)
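
A toy illustration of the _FillValue point above: if the fill value is left in the packed integers, it is averaged together with the real data and skews the result badly. The int16 samples and the fill value of -32768 here are hypothetical.

import numpy as np

# Hypothetical packed int16 samples; -32768 stands in for the variable's _FillValue.
packed = np.array([12000, 13500, -32768, 12800], dtype=np.int16)
fill_value = np.int16(-32768)

biased = packed.mean()                        # fill value included: far too low
masked = packed[packed != fill_value].mean()  # mask first, then average
print(biased, masked)
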
id: 573380688
html_url: https://github.com/pydata/xarray/issues/3686#issuecomment-573380688
issue_url: https://api.github.com/repos/pydata/xarray/issues/3686
node_id: MDEyOklzc3VlQ29tbWVudDU3MzM4MDY4OA==
user: dmedv (3922329)
created_at: 2020-01-12T04:18:43Z
updated_at: 2020-01-12T04:27:23Z
author_association: NONE
body:

Actually, that's true not just for open_mfdataset, but even for open_dataset with a single file. I've tried it with one of those files from PO.DAAC and got similar results: slightly different values depending on the chunking strategy.

Just a guess, but I think the problem here is that the calculations are done in floating-point arithmetic (probably float32...), and you get accumulated precision errors depending on the number of chunks.

Internally, the NetCDF file stores the analysed_sst values as int16, with real (floating-point) scale and offset values, so the correct way to calculate the mean would be to do it on the original int16 data and then apply the scale and offset to the result. Automatic scaling is on by default (i.e. it replaces the original array values with new scaled values), but you can turn it off in open_dataset with the mask_and_scale=False option: http://xarray.pydata.org/en/stable/generated/xarray.open_dataset.html. I tried doing this, and then I got identical results from the chunked and unchunked versions. You can pass this option to open_mfdataset as well via **kwargs.

I'm basically just starting to use xarray myself, so please someone correct me if any of the above is wrong.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: Different data values from xarray open_mfdataset when using chunks (548475127)
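
The float32 explanation in the comment above can be illustrated with plain NumPy: summing the same values in differently sized pieces rounds the partial sums differently, so the final mean shifts slightly with the chunking. This is only a sketch with synthetic SST-like values, not the actual PO.DAAC data.

import numpy as np

# Synthetic sea-surface-temperature-like values in kelvin, stored as float32.
rng = np.random.default_rng(0)
x = rng.uniform(271.0, 305.0, size=1_000_000).astype(np.float32)

# Mean of the whole array in float32, versus a mean assembled from per-chunk float32 sums.
whole = x.mean(dtype=np.float32)
chunk_sums = [chunk.sum(dtype=np.float32) for chunk in np.array_split(x, 100)]
chunked = np.float32(sum(chunk_sums)) / np.float32(x.size)

# The two results usually agree to only about 6-7 significant digits, the same kind of
# chunk-dependent discrepancy described in the comment above.
print(whole, chunked)
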

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);