issue_comments


11 rows where issue = 202964277 (“ValueError: chunksize cannot exceed dimension size” when trying to write xarray to netcdf), sorted by updated_at descending


shoyer (MEMBER) · 2017-11-10T00:23:32Z · https://github.com/pydata/xarray/issues/1225#issuecomment-343335659

Doing some digging, it turns out this came up quite a while ago back in #156, where we added some code to fix it.

Looking at @tbohn's dataset, the problem variable is actually the coordinate variable 'time' corresponding to the unlimited dimension:

```
In [7]: ds.variables['time']
Out[7]:
<class 'netCDF4._netCDF4.Variable'>
int32 time(time)
    units: days since 2000-01-01 00:00:00.0
unlimited dimensions: time
current shape = (5,)
filling on, default _FillValue of -2147483647 used

In [8]: ds.variables['time'].chunking()
Out[8]: [1048576]

In [9]: 2 ** 20
Out[9]: 1048576

In [10]: ds.dimensions
Out[10]:
OrderedDict([('veg_class', <class 'netCDF4._netCDF4.Dimension'>: name = 'veg_class', size = 19),
             ('lat', <class 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 160),
             ('lon', <class 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 160),
             ('time', <class 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'time', size = 5)])
```

For some reason netCDF4 assigns it a chunk size of 2 ** 20, even though the dimension only has length 5. This leads to an error when we write the file back with the original chunking.
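A minimal sketch of how this shows up on the xarray side, assuming @tbohn's attached file from later in the thread (xarray keeps the on-disk chunking in each variable's `encoding['chunksizes']`):

```python
import xarray as xr

# File attached by @tbohn later in this thread.
ds = xr.open_dataset('veg_hist.0_10n.90_80w.2000_2016.mode_PFT.5dates.nc')

# The on-disk chunking is copied into the variable's encoding, so the
# bogus 2 ** 20 chunk size survives even though len(time) == 5.
print(ds['time'].encoding.get('chunksizes'))  # expected: (1048576,)
print(ds.dims['time'])                        # expected: 5

# Writing back with this stale encoding is what raises
# "ValueError: chunksize cannot exceed dimension size".
```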

cwerner (NONE) · 2017-11-10T00:07:24Z · https://github.com/pydata/xarray/issues/1225#issuecomment-343332976

Thanks for that, Stephan.

The workaround looks good for the moment ;-)... Detecting a mismatch (and maybe even correcting it) automatically would be very useful.

cheers, C

shoyer (MEMBER) · 2017-11-10T00:02:07Z · https://github.com/pydata/xarray/issues/1225#issuecomment-343332081

@cwerner Sorry to hear about your trouble; I will take another look at this.

Right now, your best bet is probably something like:

```python
def clean_dataset(ds):
    for var in ds.variables.values():
        if 'chunksizes' in var.encoding:
            del var.encoding['chunksizes']
```
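A hedged usage sketch (`ds_out` and the output file name are borrowed from @tbohn's example below):

```python
clean_dataset(ds_out)            # drop the stale per-variable chunk sizes
ds_out.to_netcdf('test.out.nc')  # should no longer raise the ValueError
```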

cwerner (NONE) · 2017-11-09T23:28:28Z · https://github.com/pydata/xarray/issues/1225#issuecomment-343325842

Is there any news on this? I have the same problem. A reset_chunksizes() method would be very helpful. Also, what is the cleanest way to remove all chunk size info? I have a very long computation, and it fails at the very end with the mentioned error message. My file is patched together from many sources...

cheers

tbohn (NONE) · 2017-08-30T23:23:16Z · https://github.com/pydata/xarray/issues/1225#issuecomment-326146218

OK, thanks Joe and Stephan.


jhamman (MEMBER) · 2017-08-30T22:36:14Z · https://github.com/pydata/xarray/issues/1225#issuecomment-326138431

@tbohn - What is happening here is that xarray is storing the netCDF4 chunk size from the input file. For the LAI variable in your example, that is `LAI:_ChunkSizes = 19, 1, 160, 160 ;` (you can see this with `ncdump -h -s filename.nc`).

```shell
$ ncdump -s -h veg_hist.0_10n.90_80w.2000_2016.mode_PFT.5dates.nc
netcdf veg_hist.0_10n.90_80w.2000_2016.mode_PFT.5dates {
dimensions:
    veg_class = 19 ;
    lat = 160 ;
    lon = 160 ;
    time = UNLIMITED ; // (5 currently)
variables:
    float Cv(veg_class, lat, lon) ;
        Cv:_FillValue = -1.f ;
        Cv:units = "-" ;
        Cv:longname = "Area Fraction" ;
        Cv:missing_value = -1.f ;
        Cv:_Storage = "contiguous" ;
        Cv:_Endianness = "little" ;
    float LAI(veg_class, time, lat, lon) ;
        LAI:_FillValue = -1.f ;
        LAI:units = "m2/m2" ;
        LAI:longname = "Leaf Area Index" ;
        LAI:missing_value = -1.f ;
        LAI:_Storage = "chunked" ;
        LAI:_ChunkSizes = 19, 1, 160, 160 ;
        LAI:_Endianness = "little" ;
...
```

Those integers correspond to the dimensions from LAI. When you slice your dataset, you end up with lat/lon dimensions that are now smaller than the _ChunkSizes. When writing this back to netCDF, xarray is still trying to use the original encoding attribute.

The logical fix is to validate this encoding attribute and either (1) throw an informative error if something isn't going to work, or (2) change the chunk sizes.
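A rough, hypothetical sketch of option (2); `clamp_chunksizes` is not an existing xarray function, and it assumes each variable's chunking lives in `encoding['chunksizes']` as a tuple aligned with the variable's dimensions:

```python
def clamp_chunksizes(ds):
    # Clamp every stored chunk size to the (possibly sliced) dimension size,
    # so that writing the dataset back out cannot exceed any dimension.
    for var in ds.variables.values():
        chunksizes = var.encoding.get('chunksizes')
        if chunksizes is not None:
            var.encoding['chunksizes'] = tuple(
                min(chunk, size) for chunk, size in zip(chunksizes, var.shape)
            )
```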

tbohn (NONE) · created 2017-06-09T23:32:38Z, updated 2017-08-30T22:26:44Z · https://github.com/pydata/xarray/issues/1225#issuecomment-307524160

OK, here's my code and the file that it works (fails) on.

Code:

```python
import os.path
import numpy as np
import xarray as xr

ds = xr.open_dataset('veg_hist.0_10n.90_80w.2000_2016.mode_PFT.5dates.nc')
ds_out = ds.isel(lat=slice(0, 16), lon=slice(0, 16))

# ds_out.encoding['unlimited_dims'] = 'time'

ds_out.to_netcdf('test.out.nc')
```

Note that I commented out the attempt to make 'time' unlimited - if I attempt it, I get a slightly different chunk size error ('NetCDF: Bad chunk sizes').

I realize that for now I can use 'ncks' as a workaround, but it seems to me that xarray should be able to do this too.

File (attached) veg_hist.0_10n.90_80w.2000_2016.mode_PFT.5dates.nc.zip

tbohn (NONE) · 2017-06-09T23:34:44Z · https://github.com/pydata/xarray/issues/1225#issuecomment-307524406

(Note also that for the example .nc file I provided, the slice that my example code takes contains nothing but null values. That's irrelevant, though: the error also happens for slices that do contain non-null values.)

shoyer (MEMBER) · 2017-06-09T23:02:20Z · https://github.com/pydata/xarray/issues/1225#issuecomment-307519054

@tbohn "self-contained" just means something that I can run on my machine. For example, the code above plus the "somefile.nc" netCDF file that I can load to reproduce this example.

Thinking about this a little more, I think the issue is somehow related to the `encoding['chunksizes']` property on the Dataset variables loaded from the original netCDF file. Something like this should work as a work-around: `del myds.var.encoding['chunksizes']`

The bug is somewhere in our handling of chunksize encoding for netCDF4, but it is difficult to fix it without being able to run code that reproduces it.
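Spelled out over a whole dataset, a hedged sketch generalizing that one-liner ('somefile.nc' is the placeholder name from the original report, and the slice mirrors @tbohn's example):

```python
import xarray as xr

ds = xr.open_dataset('somefile.nc')
subset = ds.isel(lat=slice(0, 16), lon=slice(0, 16))

# Drop the stale chunk-size encoding that the slicing invalidated.
for var in subset.variables.values():
    var.encoding.pop('chunksizes', None)

subset.to_netcdf('out.nc')
```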

tbohn (NONE) · 2017-06-09T22:55:20Z · https://github.com/pydata/xarray/issues/1225#issuecomment-307518173

I've been encountering this as well, and I don't want to use the scipy engine workaround. If you can tell me what a "self-contained" example means, I can also try to provide one.

jgerardsimcock (NONE) · 2017-06-06T21:19:21Z · https://github.com/pydata/xarray/issues/1225#issuecomment-306620537

I've also just encountered this. Will try to reproduce it in a self-contained example.

