issue_comments


4 rows where author_association = "CONTRIBUTOR", issue = 290572700 ("passing unlimited_dims to to_netcdf triggers RuntimeError: NetCDF: Invalid argument") and user = 12465248 (jmccreight), sorted by updated_at descending

489236748 · jmccreight (12465248) · CONTRIBUTOR · created 2019-05-03T20:54:50Z · updated 2019-05-03T20:54:50Z
https://github.com/pydata/xarray/issues/1849#issuecomment-489236748

@dcherian Thanks,

First, I think you're right that `encoding['contiguous'] = True` is coming from the input file. That was not clear to me (and I did not read the xarray code to verify), but it makes sense.

Second, my example shows something slightly more complicated than the original example, which was also not clear to me. In my case the unlimited dimension (time) is chunked and is being successfully written in both cases (before and after the workaround). The error/failure happens on a variable that contains the unlimited dimension but has `encoding['contiguous'] = True`.

This makes sense upon a slightly more nuanced reading of the netCDF4 manual (as quoted by @markelg):

"contiguous: if True (default False), the variable data is stored contiguously on disk. Default False. Setting to True for a variable with an unlimited dimension will trigger an error."

The last sentence apparently means that using contiguous=True triggers an error for *any* variable with an unlimited dimension, not just for the unlimited dimension's own variable. That was not clear to me until I looked a bit harder at this. I think it slightly refines the strategy for dealing with the problem.

I propose that the solution should do both of the following: a) delete encoding['contiguous'] if it is True when asked to write out a variable containing an unlimited dimension, and b) raise an informative warning that the variable was chunked because it contained an unlimited dimension. (If a user hates warnings, they can handle the deletion themselves. On the other hand, there's really nothing else to do, so I'm not sure the warning is necessary... I don't have a strong opinion on this, but the code is fiddling with the encodings under the hood, so a warning seems polite.)
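A sketch of (a) and (b) combined; the helper name `sanitize_encoding` and its signature are hypothetical, not xarray's actual internals:

```python
import warnings


def sanitize_encoding(name, encoding, var_dims, unlimited_dims):
    """Drop contiguous=True for a variable spanning an unlimited dimension,
    warning that it will be written chunked instead (hypothetical helper)."""
    enc = dict(encoding)  # don't mutate the caller's dict
    if enc.get("contiguous", False) and set(var_dims) & set(unlimited_dims):
        del enc["contiguous"]
        warnings.warn(
            f"variable {name!r} will be stored chunked, not contiguous, "
            f"because it contains an unlimited dimension"
        )
    return enc
```

For example, `sanitize_encoding('crs', {'contiguous': True}, ('time',), {'time'})` drops the key and warns, while a variable with no unlimited dimension passes through untouched.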

A final question: should encoding['contiguous'] be removed from the xarray variable itself, or only for the purposes of writing it to netCDF4 on disk? I suppose a user could be writing the xarray dataset to another format that allows what netCDF does not. This should be an easy detail.

I'll make a PR with the above and we can evaluate the concrete changes.

489156658 · jmccreight (12465248) · CONTRIBUTOR · created 2019-05-03T16:25:19Z · updated 2019-05-03T16:40:52Z
https://github.com/pydata/xarray/issues/1849#issuecomment-489156658

Here's what I understand so far. For my file, I write it with ("ensured") and without ("unensured") the workaround (thanks actually to @markelg for discovering this).

```
(base) jamesmcc@cheyenne3[1021]:/glade/scratch/jamesmcc/florence_cutout_routelink_ensemble_run/ensemble> grep '_Storage' ensured_ncdsh.txt
    feature_id:_Storage = "contiguous" ;
    latitude:_Storage = "contiguous" ;
    longitude:_Storage = "contiguous" ;
    time:_Storage = "chunked" ;
    member:_Storage = "contiguous" ;
    crs:_Storage = "chunked" ;
    order:_Storage = "chunked" ;
    elevation:_Storage = "chunked" ;
    streamflow:_Storage = "chunked" ;
    q_lateral:_Storage = "chunked" ;
    velocity:_Storage = "chunked" ;
    Head:_Storage = "chunked" ;
(base) jamesmcc@cheyenne3[1022]:/glade/scratch/jamesmcc/florence_cutout_routelink_ensemble_run/ensemble> grep '_Storage' unensured_ncdsh.txt
    feature_id:_Storage = "contiguous" ;
    latitude:_Storage = "contiguous" ;
    longitude:_Storage = "contiguous" ;
    time:_Storage = "chunked" ;
    member:_Storage = "contiguous" ;
    crs:_Storage = "chunked" ;
```

The error that is thrown (just the tail end of it):

```
/glade/p/cisl/nwc/jamesmcc/anaconda3/lib/python3.7/site-packages/xarray/backends/netCDF4_.py in prepare_variable(self, name, variable, check_encoding, unlimited_dims)
    466                 least_significant_digit=encoding.get(
    467                     'least_significant_digit'),
--> 468                 fill_value=fill_value)
    469             _disable_auto_decode_variable(nc4_var)
    470

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.createVariable()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__init__()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

RuntimeError: NetCDF: Invalid argument
```

If I go to line 464 in xarray/backends/netCDF4_.py, I see that the variable it is failing on is crs:

```
print(name)
crs
encoding.get('contiguous', False)
True
```

but the `ncdump -sh` output above shows it's actually chunked. I'm not sure this is exactly what's raising the error down the line, but these two things seem to be at odds.

My current question is "why does encoding.get('contiguous', False) return True?"

If you have any insights let me know. I probably won't have time to mess with this until next week.

488865903 · jmccreight (12465248) · CONTRIBUTOR · created 2019-05-02T23:19:14Z · updated 2019-05-02T23:19:14Z
https://github.com/pydata/xarray/issues/1849#issuecomment-488865903

I could be persuaded.

I just don't understand how 'contiguous' gets set on the encoding of these variables, and whether that is appropriate. Does that seem obvious/clear to anyone?

I still don't understand why this is happening for me. I made some fairly small modifications to some code that never threw this error in the past. The small mods could have done it, but the identical code on my laptop did not throw this error on a small sample dataset. Then I went to Cheyenne, where all bets are off!

488841260 · jmccreight (12465248) · CONTRIBUTOR · created 2019-05-02T21:36:41Z · updated 2019-05-02T21:36:41Z
https://github.com/pydata/xarray/issues/1849#issuecomment-488841260

I apparently have this problem too. Thanks @gerritholl for the workaround.
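For reference, the workaround discussed in this thread amounts to dropping the 'contiguous' key from each variable's encoding before calling to_netcdf. A sketch using plain dicts to stand in for the per-variable encoding dicts (in real code these would be each `ds[var].encoding`, and the variable names below are illustrative):

```python
# Stand-ins for ds[var].encoding; in real code these would come from the
# dataset opened from the input file.
encodings = {
    "time": {"contiguous": False},
    "crs": {"contiguous": True},         # inherited from the input file
    "streamflow": {"contiguous": True},  # inherited from the input file
}

# The workaround: drop the key and let the netCDF4 backend choose chunked
# storage for variables with an unlimited dimension.
for enc in encodings.values():
    enc.pop("contiguous", None)
```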


Table schema:

```sql
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
```
Powered by Datasette · Queries took 12.03ms · About: xarray-datasette