issue_comments

6 rows where author_association = "MEMBER" and issue = 261403591 sorted by updated_at descending

user 2

  • shoyer 4
  • jhamman 2

issue 1

  • Need better user control of _FillValue attribute in NetCDF files · 6 ✖

author_association 1

  • MEMBER · 6 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
333224978 https://github.com/pydata/xarray/issues/1598#issuecomment-333224978 https://api.github.com/repos/pydata/xarray/issues/1598 MDEyOklzc3VlQ29tbWVudDMzMzIyNDk3OA== shoyer 1217238 2017-09-29T20:01:43Z 2017-09-29T20:01:43Z MEMBER

It sounds like we should control this in xarray to ensure consistent behavior.

{
    "total_count": 3,
    "+1": 3,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Need better user control of _FillValue attribute in NetCDF files 261403591
333171129 https://github.com/pydata/xarray/issues/1598#issuecomment-333171129 https://api.github.com/repos/pydata/xarray/issues/1598 MDEyOklzc3VlQ29tbWVudDMzMzE3MTEyOQ== jhamman 2443309 2017-09-29T16:17:32Z 2017-09-29T16:17:32Z MEMBER

@dnowacki-usgs - you've made a good point. At least for the netCDF4 backend, this seems to work out of the box with None/False. Can someone check that this works for the scipy/h5netcdf backends?

https://github.com/Unidata/netcdf4-python/blob/366debfff8b0bc53999c9e1ce9f4818bf7cf079a/netCDF4/_netCDF4.pyx#L3455-L3457

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Need better user control of _FillValue attribute in NetCDF files 261403591
332950475 https://github.com/pydata/xarray/issues/1598#issuecomment-332950475 https://api.github.com/repos/pydata/xarray/issues/1598 MDEyOklzc3VlQ29tbWVudDMzMjk1MDQ3NQ== shoyer 1217238 2017-09-28T20:12:05Z 2017-09-28T20:12:05Z MEMBER

Agreed, None is probably better. There is no such thing as a "null" dtype.

On Thu, Sep 28, 2017 at 1:10 PM Joe Hamman notifications@github.com wrote:

I actually think we should use None as the _FillValue sentinel value. We do (sort of) support boolean arrays (#849 https://github.com/pydata/xarray/pull/849).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Need better user control of _FillValue attribute in NetCDF files 261403591
332950001 https://github.com/pydata/xarray/issues/1598#issuecomment-332950001 https://api.github.com/repos/pydata/xarray/issues/1598 MDEyOklzc3VlQ29tbWVudDMzMjk1MDAwMQ== jhamman 2443309 2017-09-28T20:10:13Z 2017-09-28T20:10:13Z MEMBER

I actually think we should use None as the _FillValue sentinel value. We do (sort of) support boolean arrays (https://github.com/pydata/xarray/pull/849).

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Need better user control of _FillValue attribute in NetCDF files 261403591
332949221 https://github.com/pydata/xarray/issues/1598#issuecomment-332949221 https://api.github.com/repos/pydata/xarray/issues/1598 MDEyOklzc3VlQ29tbWVudDMzMjk0OTIyMQ== shoyer 1217238 2017-09-28T20:07:15Z 2017-09-28T20:07:15Z MEMBER

There is also the philosophical problem of fill values for coordinate variables.

Indeed, this is prohibited by CF conventions -- but xarray (like pandas) takes a more flexible approach here, allowing for missing values for all variables.

You can already specify an explicit choice for _FillValue, e.g., ds.to_netcdf(..., encoding={'my_variable': {'_FillValue': 1e35}}). Allowing {'_FillValue': False} to indicate that _FillValue should not be included would be a simple, easy fix, so we should probably do that regardless.

(There is no need to worry about False conflicting with a legitimate fill value, since netCDF does not have a boolean dtype.)
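
The sentinel idea in this comment can be sketched in plain Python. This is only an illustration under stated assumptions, not xarray's implementation: `build_attrs`, `DEFAULT_FILL`, and `_MISSING` are hypothetical names, and the sketch uses None as the "omit the attribute" sentinel, which the follow-up comments on this issue prefer over False.

```python
import math

# Hypothetical default used when the caller expresses no preference.
DEFAULT_FILL = math.nan

# Private marker that distinguishes "key absent" from an explicit None.
_MISSING = object()


def build_attrs(encoding):
    """Return the attributes to write for one variable (hypothetical helper).

    '_FillValue' absent  -> fall back to DEFAULT_FILL
    '_FillValue' is None -> omit the attribute entirely
    anything else        -> use the value as given
    """
    fill = encoding.get("_FillValue", _MISSING)
    attrs = {}
    if fill is _MISSING:
        attrs["_FillValue"] = DEFAULT_FILL
    elif fill is not None:
        attrs["_FillValue"] = fill
    return attrs
```

Using a separate `_MISSING` marker keeps "the user said nothing" distinct from "the user asked for no fill value", which is the distinction the issue asks for.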

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Need better user control of _FillValue attribute in NetCDF files 261403591
332934061 https://github.com/pydata/xarray/issues/1598#issuecomment-332934061 https://api.github.com/repos/pydata/xarray/issues/1598 MDEyOklzc3VlQ29tbWVudDMzMjkzNDA2MQ== shoyer 1217238 2017-09-28T19:05:46Z 2017-09-28T19:05:46Z MEMBER

cc @thenaomig @laliberte

There are at least two ways to fix this:

1. Support a flag of some sort in encoding (e.g., _FillValue = False) to indicate that fill value shouldn't be added. This would be easy to add, but is somewhat inelegant.
2. Check for the presence of NaNs before setting _FillValue = NaN. This would be easy to add for dimension coordinates because they are already guaranteed to be in memory, but could cause performance trouble if any inputs are loaded as dask arrays.

I don't know a satisfactory way to handle dask arrays with our current design, since we don't want to add another pass over the data to check for NaNs. I suppose one option would be to refactor our backend classes to write data before writing attributes and then make some sort of dask array operation that checks for NaNs as the data is written. But I'm not even sure this would work with the standard dask task schedulers.
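
The second option, restricted to data that is already in memory, might look roughly like the sketch below. It is an assumption-laden illustration: `needs_fill_value` and `coordinate_attrs` are hypothetical helpers, plain Python lists stand in for coordinate arrays, and the dask case raised in the comment is deliberately not addressed.

```python
import math


def needs_fill_value(values):
    """Return True if any in-memory value is NaN (hypothetical helper)."""
    return any(isinstance(v, float) and math.isnan(v) for v in values)


def coordinate_attrs(values):
    """Attributes for an in-memory coordinate (hypothetical helper).

    _FillValue = NaN is only advertised when a NaN is actually present.
    """
    return {"_FillValue": math.nan} if needs_fill_value(values) else {}
```

The check is a full pass over the values, which is why the comment limits it to variables that are guaranteed to be loaded.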

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Need better user control of _FillValue attribute in NetCDF files 261403591

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
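
The filter described at the top of this page (author_association = "MEMBER", issue = 261403591, sorted by updated_at descending) can be reproduced against a table of this shape with Python's built-in sqlite3 module. The sketch below is a simplification: it keeps only a few of the columns, drops the REFERENCES clauses because the users and issues tables are not shown here, and adds one hypothetical non-member row to show the filter at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE [issue_comments] (
       [id] INTEGER PRIMARY KEY,
       [user] INTEGER,
       [updated_at] TEXT,
       [author_association] TEXT,
       [issue] INTEGER
    )
    """
)
rows = [
    (333224978, 1217238, "2017-09-29T20:01:43Z", "MEMBER", 261403591),
    (333171129, 2443309, "2017-09-29T16:17:32Z", "MEMBER", 261403591),
    (332934061, 1217238, "2017-09-28T19:05:46Z", "MEMBER", 261403591),
    # Hypothetical row, included only to show that the filter excludes it.
    (1, 0, "2017-09-30T00:00:00Z", "NONE", 261403591),
]
conn.executemany("INSERT INTO [issue_comments] VALUES (?, ?, ?, ?, ?)", rows)

query = """
    SELECT [id] FROM [issue_comments]
    WHERE [author_association] = ? AND [issue] = ?
    ORDER BY [updated_at] DESC
"""
member_ids = [row[0] for row in conn.execute(query, ("MEMBER", 261403591))]
print(member_ids)
```

Because updated_at is stored as ISO 8601 text in a single time zone, a plain text sort gives chronological order.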