
issue_comments


5 rows where author_association = "MEMBER" and issue = 258500654 sorted by updated_at descending


id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
330277364 https://github.com/pydata/xarray/issues/1576#issuecomment-330277364 https://api.github.com/repos/pydata/xarray/issues/1576 MDEyOklzc3VlQ29tbWVudDMzMDI3NzM2NA== jhamman 2443309 2017-09-18T16:26:36Z 2017-09-18T16:26:36Z MEMBER

Why can't xarray use masked arrays, which would retain the original dtype?

We have an open issue for this topic (#1194). A lot of it comes down to performance. Dask is part of that, but the other issue is that masked arrays in numpy are quite slow.
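As an illustration of the question above, here is a minimal sketch of how a NumPy masked array keeps the integer dtype while still skipping the fill value in aggregations. The sample values and the fill value of -127 are made up for the example; this is not how xarray handles the data.

```python
import numpy as np

# Hypothetical int8 data in which -127 marks missing values.
raw = np.array([1, 2, -127, 4], dtype=np.int8)
fill_value = np.int8(-127)

# Mask every element equal to the fill value; the dtype stays int8.
masked = np.ma.masked_equal(raw, fill_value)

print(masked.dtype)    # dtype of the masked array
print(masked.count())  # number of unmasked elements
print(masked.mean())   # mean over the unmasked elements only
```

The trade-off mentioned in the comment still applies: the mask is carried as a separate boolean array, and operations on masked arrays tend to be slower than on plain arrays.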

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Variable of dtype int8 casted to float64 258500654
330271312 https://github.com/pydata/xarray/issues/1576#issuecomment-330271312 https://api.github.com/repos/pydata/xarray/issues/1576 MDEyOklzc3VlQ29tbWVudDMzMDI3MTMxMg== shoyer 1217238 2017-09-18T16:04:47Z 2017-09-18T16:04:47Z MEMBER

We currently decode anything with a _FillValue attribute to float, so that we can convert any values equal to the fill value to NaN. This ensures that xarray's NaN-skipping aggregations (e.g., mean()) work properly.
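The effect described here can be reproduced with plain NumPy. The following is only a sketch of the idea, with made-up values and a made-up fill value of -127; it is not xarray's actual decoding code.

```python
import numpy as np

# Hypothetical int8 data in which -127 plays the role of _FillValue.
raw = np.array([1, 2, -127, 4], dtype=np.int8)
fill_value = -127

# NaN cannot be stored in an integer array, so the data is cast to float first.
decoded = raw.astype(np.float64)
decoded[raw == fill_value] = np.nan

print(decoded.dtype)        # the int8 data is now floating point
print(np.nanmean(decoded))  # NaN-skipping mean ignores the fill value
```

The cast is what turns an int8 variable into float64: the missing-value marker itself forces the wider dtype.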

However, this isn't really a useful thing to do for a dataset like this, where the values really represent enums/categories. It seems like the CF-compliant way to indicate this is with the various flag_* attributes. So we could look for those as a signal that fill values should not be replaced with NaN.
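A rough sketch of what such a check might look like follows. The helper names are hypothetical and do not exist in xarray; the attribute names flag_values, flag_meanings and flag_masks are the ones defined by the CF conventions.

```python
# CF attributes that mark a variable as holding flags / categories.
CF_FLAG_ATTRS = ("flag_values", "flag_meanings", "flag_masks")


def looks_like_flag_variable(attrs):
    """Return True if any CF flag_* attribute is present (hypothetical helper)."""
    return any(name in attrs for name in CF_FLAG_ATTRS)


def should_mask_fill_value(attrs):
    """Replace fill values with NaN only for non-flag variables (hypothetical helper)."""
    return "_FillValue" in attrs and not looks_like_flag_variable(attrs)


# Made-up attribute dictionaries for illustration.
categorical_attrs = {
    "_FillValue": -127,
    "flag_values": [0, 1, 2],
    "flag_meanings": "water land ice",
}
continuous_attrs = {"_FillValue": -9999.0, "units": "K"}

print(should_mask_fill_value(categorical_attrs))
print(should_mask_fill_value(continuous_attrs))
```

With a check along these lines, a flag variable would keep its integer dtype, at the cost of fill values no longer being skipped automatically in aggregations.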

Eventually, we could possibly also use this for decoding into a true "categorical" dtype, but numpy doesn't have anything like that yet.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Variable of dtype int8 casted to float64 258500654
330271058 https://github.com/pydata/xarray/issues/1576#issuecomment-330271058 https://api.github.com/repos/pydata/xarray/issues/1576 MDEyOklzc3VlQ29tbWVudDMzMDI3MTA1OA== jhamman 2443309 2017-09-18T16:03:49Z 2017-09-18T16:03:49Z MEMBER

Right, since xarray uses np.nan as its fill value, any array with a _FillValue will be promoted to a float dtype.

Out of curiosity, what is the meaning of _NoFill = "true"?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Variable of dtype int8 casted to float64 258500654
330263190 https://github.com/pydata/xarray/issues/1576#issuecomment-330263190 https://api.github.com/repos/pydata/xarray/issues/1576 MDEyOklzc3VlQ29tbWVudDMzMDI2MzE5MA== fmaussion 10050469 2017-09-18T15:38:49Z 2017-09-18T15:38:49Z MEMBER

OK. I'll let @shoyer comment on the substance but indeed it seems that decode_cf could be cleverer here. It should be an easy fix.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Variable of dtype int8 casted to float64 258500654
330243618 https://github.com/pydata/xarray/issues/1576#issuecomment-330243618 https://api.github.com/repos/pydata/xarray/issues/1576 MDEyOklzc3VlQ29tbWVudDMzMDI0MzYxOA== fmaussion 10050469 2017-09-18T14:37:06Z 2017-09-18T14:37:20Z MEMBER

Can you run ncdump -h -s on the file and report back?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Variable of dtype int8 casted to float64 258500654


CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
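
For reference, the filter shown at the top of this page (author_association = "MEMBER", issue = 258500654, sorted by updated_at descending) can be run against a table of this shape with Python's sqlite3 module. The sketch below uses an in-memory database with a reduced set of columns; two rows are taken from the listing above and the third is a made-up non-member row added to show the filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced version of the issue_comments schema (only the columns used here).
conn.execute(
    """
    CREATE TABLE issue_comments (
        id INTEGER PRIMARY KEY,
        user INTEGER,
        created_at TEXT,
        updated_at TEXT,
        author_association TEXT,
        issue INTEGER
    )
    """
)
conn.execute("CREATE INDEX idx_issue_comments_issue ON issue_comments (issue)")

rows = [
    (330243618, 10050469, "2017-09-18T14:37:06Z", "2017-09-18T14:37:20Z", "MEMBER", 258500654),
    (330277364, 2443309, "2017-09-18T16:26:36Z", "2017-09-18T16:26:36Z", "MEMBER", 258500654),
    # Made-up row that the MEMBER filter should exclude.
    (1, 42, "2017-09-18T15:00:00Z", "2017-09-18T15:00:00Z", "NONE", 258500654),
]
conn.executemany("INSERT INTO issue_comments VALUES (?, ?, ?, ?, ?, ?)", rows)

# Same filter and ordering as the page; parameters are bound, not interpolated.
result = conn.execute(
    """
    SELECT id, updated_at
    FROM issue_comments
    WHERE author_association = ? AND issue = ?
    ORDER BY updated_at DESC
    """,
    ("MEMBER", 258500654),
).fetchall()

print(result)
```

Because the timestamps are stored as ISO 8601 text in UTC, sorting them as strings gives the same order as sorting them chronologically.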