home / github

Menu
  • Search all tables
  • GraphQL API

issue_comments

Table actions
  • GraphQL API for issue_comments

16 rows where issue = 195832230 sorted by updated_at descending

✎ View and edit SQL

This data as json, CSV (advanced)

Suggested facets: created_at (date), updated_at (date)

user 7

  • shoyer 5
  • laliberte 4
  • fmaussion 2
  • thenaomig 2
  • rabernat 1
  • jhamman 1
  • mmartini-usgs 1

author_association 3

  • MEMBER 9
  • CONTRIBUTOR 4
  • NONE 3

issue 1

  • Set a default _FillValue of NaN for float types · 16 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
332833849 https://github.com/pydata/xarray/pull/1165#issuecomment-332833849 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDMzMjgzMzg0OQ== mmartini-usgs 23199378 2017-09-28T13:20:12Z 2017-09-28T13:20:12Z NONE

It is not desirable for us to have _FillValue = NaN for dimensions and coordinate variables.

In trying to use xarray, _FillValue was carefully kept from these variables and dimensions during the creation of the un-resampled file and then were found to appear during the to_netcdf operation. This happens in spite of mask_and_scale=False is being used with xr.open_dataset

I would hope that downstream code would have trouble with coordinates that don't make logical sense (time or place being NaN, for instance). We would prefer NOT to instantiate coordinate variable data with any fill value. Keeping NaNs out of coordinate variables, dimensions and minima and maxima is part of our QA/QC process to avoid downstream issues.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
321890019 https://github.com/pydata/xarray/pull/1165#issuecomment-321890019 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDMyMTg5MDAxOQ== thenaomig 11436996 2017-08-11T18:45:14Z 2017-08-11T18:45:14Z NONE

At the moment, I am only concerned with coordinate variables. I suppose it would be less likely to come up otherwise, but someone might want to leave it off for any variable.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
321887867 https://github.com/pydata/xarray/pull/1165#issuecomment-321887867 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDMyMTg4Nzg2Nw== jhamman 2443309 2017-08-11T18:35:36Z 2017-08-11T18:35:36Z MEMBER

@thenaomig - are you only concerned about coordinate variables (I think your gitter post mentioned this)? We may also consider special casing these as well given the CF standard on this issue.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
321886294 https://github.com/pydata/xarray/pull/1165#issuecomment-321886294 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDMyMTg4NjI5NA== shoyer 1217238 2017-08-11T18:28:34Z 2017-08-11T18:28:34Z MEMBER

Is there a way to override this default behavior and write a netCDF file without including _FillValue in the metadata? For example, if one were adhering to a requested output format that does not include this?

This is a good question. I don't think so currently.

In principle we could use a entry in the encoding dict to signal this, i.e., {'_FillValue': None}.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
321882272 https://github.com/pydata/xarray/pull/1165#issuecomment-321882272 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDMyMTg4MjI3Mg== thenaomig 11436996 2017-08-11T18:10:45Z 2017-08-11T18:10:45Z NONE

Is there a way to override this default behavior and write a netCDF file without including _FillValue in the metadata? For example, if one were adhering to a requested output format that does not include this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
268942718 https://github.com/pydata/xarray/pull/1165#issuecomment-268942718 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2ODk0MjcxOA== shoyer 1217238 2016-12-23T05:29:16Z 2016-12-23T05:29:16Z MEMBER

@laliberte we missed giving your credit for this fix in "What's new". Any interest in putting together another quick PR adding "By Frederic Laliberte"?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
267507053 https://github.com/pydata/xarray/pull/1165#issuecomment-267507053 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2NzUwNzA1Mw== shoyer 1217238 2016-12-16T03:11:11Z 2016-12-16T03:11:11Z MEMBER

Thanks!

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
267492795 https://github.com/pydata/xarray/pull/1165#issuecomment-267492795 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2NzQ5Mjc5NQ== laliberte 3217406 2016-12-16T01:21:30Z 2016-12-16T01:21:30Z CONTRIBUTOR

@shoyer Let me know if you need any more changes.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
267490261 https://github.com/pydata/xarray/pull/1165#issuecomment-267490261 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2NzQ5MDI2MQ== shoyer 1217238 2016-12-16T01:03:32Z 2016-12-16T01:03:32Z MEMBER

OK, I'm fine with only doing this for real types. Complex types aren't really even valid in standard netCDF files (h5py uses a custom data type), so I'm not worried about compatibility with other tools.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
267392582 https://github.com/pydata/xarray/pull/1165#issuecomment-267392582 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2NzM5MjU4Mg== laliberte 3217406 2016-12-15T17:43:21Z 2016-12-15T17:43:21Z CONTRIBUTOR

So I've tried with conventions.py and it works. However, allowing a NaN _FillValue for complex floats raises an exception in h5py: h5py/h5py#805. This means that until this is fixed in h5py we should restrict this default behaviour to real float types.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
267380946 https://github.com/pydata/xarray/pull/1165#issuecomment-267380946 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2NzM4MDk0Ng== shoyer 1217238 2016-12-15T16:58:08Z 2016-12-15T16:58:08Z MEMBER

I agree that we just want to set _FillValue=NaN unilaterally for float arrays. This will make for slightly less tidy netCDF files, but we really don't want to be checking arrays for NaNs when we write them to disk, which would be untenable with dask.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
267380533 https://github.com/pydata/xarray/pull/1165#issuecomment-267380533 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2NzM4MDUzMw== fmaussion 10050469 2016-12-15T16:56:40Z 2016-12-15T16:56:40Z MEMBER

Yes, agreed, LGTM.

I'll let @shoyer have the final word though

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
267371351 https://github.com/pydata/xarray/pull/1165#issuecomment-267371351 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2NzM3MTM1MQ== laliberte 3217406 2016-12-15T16:23:41Z 2016-12-15T16:23:41Z CONTRIBUTOR

I know but the alternative would be to scan the data before deciding but that would kind of break the whole dask integration.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
267370940 https://github.com/pydata/xarray/pull/1165#issuecomment-267370940 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2NzM3MDk0MA== fmaussion 10050469 2016-12-15T16:22:24Z 2016-12-15T16:22:24Z MEMBER

The only problem I see is that in the current implementation this fix puts a _FillValue attribute even if there are no NaN in the variable.

Yes, I'm not very enthusiastic about that either...

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
267366770 https://github.com/pydata/xarray/pull/1165#issuecomment-267366770 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2NzM2Njc3MA== laliberte 3217406 2016-12-15T16:07:45Z 2016-12-15T16:07:45Z CONTRIBUTOR

The only problem I see is that in the current implementation this fix puts a _FillValue attribute even if there are no NaN in the variable. Usually, cdo would not do that. I guess there are little risks but I think it's important that this be said. Alternative implementations would be way too cumbersome so I think the approach used in this PR is the lesser of two evils.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230
267365730 https://github.com/pydata/xarray/pull/1165#issuecomment-267365730 https://api.github.com/repos/pydata/xarray/issues/1165 MDEyOklzc3VlQ29tbWVudDI2NzM2NTczMA== rabernat 1197350 2016-12-15T16:04:10Z 2016-12-15T16:04:10Z MEMBER

I can't see any problems with this.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Set a default _FillValue of NaN for float types 195832230

Advanced export

JSON shape: default, array, newline-delimited, object

CSV options:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
Powered by Datasette · Queries took 16.589ms · About: xarray-datasette