html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/5106#issuecomment-822968365,https://api.github.com/repos/pydata/xarray/issues/5106,822968365,MDEyOklzc3VlQ29tbWVudDgyMjk2ODM2NQ==,40218891,2021-04-20T04:41:07Z,2021-04-20T04:41:07Z,NONE,"I am closing this issue. It is impossible to guess the proper time unit when dealing with missing data. Setting the attribute explicitly is a better solution. A minor quibble: the statement ``ds1.reftime.encoding['units'] = 'hours since Big Bang'`` raises an exception ``AttributeError: 'NoneType' object has no attribute 'groups'``. It should be ``ValueError: invalid time units: hours since Big Bang``, the same as in the case of ``'hours after 1970-01-01'``.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,849751721
https://github.com/pydata/xarray/issues/5106#issuecomment-822108756,https://api.github.com/repos/pydata/xarray/issues/5106,822108756,MDEyOklzc3VlQ29tbWVudDgyMjEwODc1Ng==,40218891,2021-04-19T01:27:02Z,2021-04-19T01:28:38Z,NONE,"When the time dimension of the dataset being appended to is 1, the inferred unit is ""days"". This happens on line 318 in file coding/times.py. In this case the variable ``timedeltas`` is an empty array and ``np.all`` evaluates to True:
```
np.all(np.array([]) % 86400000000000 == 0)
True
```
(which surprised me, by the way). When I forced ``_infer_time_units_from_diff`` to return ""hours"", the time coordinate in my example was evaluated correctly, so I think this particular code is the cause of the error. Since the fallback return value is set to ""seconds"", I would argue that the case of an empty ``timedeltas`` should return ""seconds"" as well. Are there alternatives, or should I go ahead and create a pull request?
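A minimal, self-contained sketch of the guard I have in mind (the function name is the real helper's, but the unit table here is a simplified stand-in for the actual code in coding/times.py):

```python
import numpy as np

def infer_time_units_from_diff(timedeltas):
    # Proposed fix: bail out to 'seconds' (the existing fallback)
    # before any divisibility test runs, because ``np.all`` over an
    # empty array is vacuously True and would otherwise pick 'days'.
    if timedeltas.size == 0:
        return 'seconds'
    for unit, ns in [('days', 86400000000000),
                     ('hours', 3600000000000),
                     ('minutes', 60000000000)]:
        if np.all(timedeltas % ns == 0):
            return unit
    return 'seconds'
```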
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,849751721
https://github.com/pydata/xarray/issues/4830#issuecomment-767170277,https://api.github.com/repos/pydata/xarray/issues/4830,767170277,MDEyOklzc3VlQ29tbWVudDc2NzE3MDI3Nw==,40218891,2021-01-25T23:06:00Z,2021-01-25T23:06:00Z,NONE,"One could always set *source* to ``str(filename_or_object)``. In this case:
```
import s3fs
s3 = s3fs.S3FileSystem(anon=True)
s3path = 's3://wrf-se-ak-ar5/gfdl/hist/daily/1980/WRFDS_1980-01-02.nc'
fileset = s3.open(s3path)
fileset
fileset.path
```
prints
```
'wrf-se-ak-ar5/gfdl/hist/daily/1980/WRFDS_1980-01-02.nc'
```
It is easy to parse the above ``fileset`` representation, but there is no guarantee that some other external file representation will be amenable to parsing. If the fix is only for *s3fs*, getting the ``path`` attribute is more elegant; however, this would require xarray to be aware of the module.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,789653499
https://github.com/pydata/xarray/issues/4822#issuecomment-762438483,https://api.github.com/repos/pydata/xarray/issues/4822,762438483,MDEyOklzc3VlQ29tbWVudDc2MjQzODQ4Mw==,40218891,2021-01-18T19:39:34Z,2021-01-18T19:39:34Z,NONE,"You might be right. Adding ``-k nc4`` works when *string* is removed from the attribute specification. If it is present, the error is as before: ``AttributeError: 'numpy.ndarray' object has no attribute 'split'``. However, after changing my AWS script to
```
import s3fs
import xarray as xr
s3 = s3fs.S3FileSystem(anon=True)
s3path = 's3://wrf-se-ak-ar5/gfdl/hist/daily/1988/WRFDS_1988-04-23.nc'
ds = xr.open_dataset(s3.open(s3path), engine='scipy')
print(ds)
```
the error is ``TypeError: Error: None is not a valid NetCDF 3 file``.
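For anyone hitting the same wall: the *scipy* engine reads only classic NetCDF-3, while these AWS files are NetCDF-4/HDF5. A quick, hypothetical sniffer (not part of xarray) shows which backend a file object needs by peeking at the magic bytes:

```python
def sniff_netcdf_format(fileobj):
    # Classic NetCDF-3 files start with b'CDF'; NetCDF-4 files are
    # HDF5 containers and start with b'\x89HDF'.
    magic = fileobj.read(8)
    fileobj.seek(0)   # rewind so the caller can still open the file
    if magic.startswith(b'CDF'):
        return 'netcdf3'   # readable by the scipy engine
    if magic.startswith(b'\x89HDF'):
        return 'netcdf4'   # needs engine='netcdf4' or 'h5netcdf'
    return 'unknown'
```

e.g. calling ``sniff_netcdf_format(s3.open(s3path))`` before choosing an ``engine=`` argument.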
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,787947436
https://github.com/pydata/xarray/issues/4822#issuecomment-762423707,https://api.github.com/repos/pydata/xarray/issues/4822,762423707,MDEyOklzc3VlQ29tbWVudDc2MjQyMzcwNw==,40218891,2021-01-18T19:03:19Z,2021-01-18T19:03:19Z,NONE,"This is how I did it:
```
$ ncdump /tmp/x.nc
netcdf x {
dimensions:
        x = 1 ;
        y = 1 ;
variables:
        int foo(y, x) ;
                foo:coordinates = ""x y"" ;
data:
 foo = 0 ;
}
$ rm x.nc
$ ncgen -o x.nc < x.cdl
$ python -c ""import xarray as xr; ds = xr.open_dataset('/tmp/x.nc', engine='h5netcdf'); print(ds)""
```
Engine *netcdf4* works fine, with *string* or without. My original code retrieving data from AWS:
```
import s3fs
import xarray as xr
s3 = s3fs.S3FileSystem(anon=True)
s3path = 's3://wrf-se-ak-ar5/gfdl/hist/daily/1988/WRFDS_1988-04-23.nc'
ds = xr.open_dataset(s3.open(s3path))
print(ds)
```
Adding ``decode_cf=False`` is a workaround. All attributes are arrays:
```
Attributes:
    contact:  ['rtladerjr@alaska.edu']
    info:     ['Alaska CASC']
    data:     ['Downscaled GFDL-CM3']
    format:   ['version 2']
    date:     ['Mon Jul 1 15:17:16 AKDT 2019']
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,787947436
https://github.com/pydata/xarray/issues/4822#issuecomment-762376418,https://api.github.com/repos/pydata/xarray/issues/4822,762376418,MDEyOklzc3VlQ29tbWVudDc2MjM3NjQxOA==,40218891,2021-01-18T17:12:53Z,2021-01-18T17:19:53Z,NONE,Dropping *string* changes the error to ``Unable to open file (file signature not found)``.
This issue popped up while reading data from https://registry.opendata.aws/wrf-se-alaska-snap/ ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,787947436
https://github.com/pydata/xarray/issues/2871#issuecomment-481033093,https://api.github.com/repos/pydata/xarray/issues/2871,481033093,MDEyOklzc3VlQ29tbWVudDQ4MTAzMzA5Mw==,40218891,2019-04-08T22:35:10Z,2019-04-08T22:35:10Z,NONE,"After rethinking the issue, I would drop it: one can simply pass `ds.fromkeys(ds.data_vars.keys(), {})` as the `encoding` attribute. Going back to the original problem. The fix above is not enough; the `SerializationWarning` is still present. An alternative, provided that the `missing_value` attribute is still considered deprecated (http://cfconventions.org/Data/cf-conventions/cf-conventions-1.1/build/cf-conventions.html#missing-data), would be to replace it with `_FillValue` on decoding:
```
$ diff variables.py variables.py.orig
179,180d178
<             if 'FillValue' not in encoding:
<                 encoding['_FillValue'] = encoding.pop('missing_value')
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,429914958
https://github.com/pydata/xarray/issues/2871#issuecomment-480475645,https://api.github.com/repos/pydata/xarray/issues/2871,480475645,MDEyOklzc3VlQ29tbWVudDQ4MDQ3NTY0NQ==,40218891,2019-04-06T05:24:52Z,2019-04-06T05:24:52Z,NONE,"Indeed it works. Thanks. My quick fix:
```
$ diff variables.py variables.py.orig
152,155d151
<     elif encoding.get('missing_value') is not None:
<         fill_value = pop_to(encoding, attrs, 'missing_value', name=name)
<         if not pd.isnull(fill_value):
<             data = duck_array_ops.fillna(data, fill_value)
```
I also figured out how to write back floating point values: `encoding=None` means use existing values, so specifying `encoding={'tmpk': {}}` in `to_netcdf()` did the trick. Should there be an option for this?
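A side note on building that `encoding` mapping: `dict.fromkeys(names, {})` hands the *same* dict object to every key, which bites as soon as one entry is mutated; a dict comprehension gives each variable its own empty dict (the variable names below are made-up stand-ins):

```python
names = ['tmpk', 'uwnd', 'vwnd']            # hypothetical variable names

shared = dict.fromkeys(names, {})           # one dict object, shared by all keys
independent = {name: {} for name in names}  # a fresh dict per variable

shared['tmpk']['dtype'] = 'int16'
print(shared['uwnd'])        # {'dtype': 'int16'} -- the mutation shows up everywhere
print(independent['uwnd'])   # {} -- untouched
```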
What you see on the screen is not what you get in the file.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,429914958
https://github.com/pydata/xarray/issues/2554#issuecomment-455351725,https://api.github.com/repos/pydata/xarray/issues/2554,455351725,MDEyOklzc3VlQ29tbWVudDQ1NTM1MTcyNQ==,40218891,2019-01-17T22:13:52Z,2019-01-17T22:13:52Z,NONE,After upgrading to anaconda python 3.7 the code works without crashes. I think this issue can be closed.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-439281383,https://api.github.com/repos/pydata/xarray/issues/2554,439281383,MDEyOklzc3VlQ29tbWVudDQzOTI4MTM4Mw==,40218891,2018-11-16T04:50:43Z,2018-11-16T04:50:43Z,NONE,"The error `RuntimeError: NetCDF: Bad chunk sizes.` is unrelated to the original problem with segv crashes. It is caused by a bug in the netcdf4 C library. It is fixed in the latest version, 4.6.1. As of yesterday, the newest netcdf4-python manylinux wheel contains an older version. The solution is to build netcdf4-python from source. The segv crashes occur with other datasets as well. Example test set I used:
```
for year in range(2000, 2005):
    file = '/tmp/dx{:d}.nc'.format(year)
    #times = pd.date_range('{:d}-01-01'.format(year), '{:d}-12-31'.format(year), name='time')
    times = pd.RangeIndex(year, year+300, name='time')
    v = np.array([np.random.random((32, 32)) for i in range(times.size)])
    dx = xr.Dataset({'v': (('time', 'y', 'x'), v)}, {'time': times})
    dx.to_netcdf(file, format='NETCDF4', encoding={'time': {'chunksizes': (1024,)}}, unlimited_dims='time')
```
A simple fix is to change the scheduler as I did in my original post.
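(The scheduler change was, roughly, forcing dask to run single-threaded so the non-thread-safe netCDF4 C library is never entered from two threads at once; see my original post for the exact form. This is just the standard dask configuration knob:)

```python
import dask

# Workaround sketch: run all dask tasks in the calling thread instead of
# a thread pool, avoiding concurrent calls into the netCDF4 C library.
dask.config.set(scheduler='single-threaded')
```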
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-437647881,https://api.github.com/repos/pydata/xarray/issues/2554,437647881,MDEyOklzc3VlQ29tbWVudDQzNzY0Nzg4MQ==,40218891,2018-11-11T06:50:22Z,2018-11-11T06:50:22Z,NONE,I meant at random points during execution. The script crashed every time.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-437647777,https://api.github.com/repos/pydata/xarray/issues/2554,437647777,MDEyOklzc3VlQ29tbWVudDQzNzY0Nzc3Nw==,40218891,2018-11-11T06:47:47Z,2018-11-11T06:47:47Z,NONE,"[soundings.zip](https://github.com/pydata/xarray/files/2569126/soundings.zip) I did some further tests; the crash occurs somewhat randomly.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-437646885,https://api.github.com/repos/pydata/xarray/issues/2554,437646885,MDEyOklzc3VlQ29tbWVudDQzNzY0Njg4NQ==,40218891,2018-11-11T06:22:27Z,2018-11-11T06:22:27Z,NONE,"About 600k for 2 files. I could spend some time trying to trim that down, but if there is a way to upload the whole set it would be easier for me.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-437633544,https://api.github.com/repos/pydata/xarray/issues/2554,437633544,MDEyOklzc3VlQ29tbWVudDQzNzYzMzU0NA==,40218891,2018-11-11T00:38:03Z,2018-11-11T00:38:03Z,NONE,"Another puzzle; I don't know whether it is related to the crashes.
Trying to localize the issue, I added a line after `else` on line 453 in netCDF4_.py:
`print('=======', name, encoding.get('chunksizes'))`
`ds0 = xr.open_dataset('/tmp/nam/bufr.701940/bufr.701940.2010123112.nc')`
`ds0.to_netcdf('/tmp/d0.nc')`
This prints:
```
======= hlcy (1, 85)
======= cdbp (1, 85)
======= hovi (1, 85)
======= itim (1024,)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
 in ()
      1 ds0 = xr.open_dataset('/tmp/nam/bufr.701940/bufr.701940.2010123112.nc')
----> 2 ds0.to_netcdf('/tmp/d0.nc')

/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute)
   1220                            engine=engine, encoding=encoding,
   1221                            unlimited_dims=unlimited_dims,
-> 1222                            compute=compute)
   1223
   1224     def to_zarr(self, store=None, mode='w-', synchronizer=None, group=None,

/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile)
    718         # to be parallelized with dask
    719         dump_to_store(dataset, store, writer, encoding=encoding,
--> 720                       unlimited_dims=unlimited_dims)
    721         if autoclose:
    722             store.close()

/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
    761
    762     store.store(variables, attrs, check_encoding, writer,
--> 763                 unlimited_dims=unlimited_dims)
    764
    765

/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    264         self.set_dimensions(variables, unlimited_dims=unlimited_dims)
    265         self.set_variables(variables, check_encoding_set, writer,
--> 266                            unlimited_dims=unlimited_dims)
    267
    268     def set_attributes(self, attributes):

/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/backends/common.py in
set_variables(self, variables, check_encoding_set, writer, unlimited_dims)
    302             check = vn in check_encoding_set
    303             target, source = self.prepare_variable(
--> 304                 name, v, check, unlimited_dims=unlimited_dims)
    305
    306             writer.add(source, target)

/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in prepare_variable(self, name, variable, check_encoding, unlimited_dims)
    466                 least_significant_digit=encoding.get(
    467                     'least_significant_digit'),
--> 468                 fill_value=fill_value)
    469             _disable_auto_decode_variable(nc4_var)
    470

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.createVariable()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__init__()

netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()

RuntimeError: NetCDF: Bad chunk sizes.
```
The dataset is:
```
Dimensions:  (dim_1: 1, dim_prof: 60, dim_slyr: 4, ftim: 85, itim: 1)
Coordinates:
  * ftim     (ftim) timedelta64[ns] 00:00:00 01:00:00 ... 3 days 12:00:00
  * itim     (itim) datetime64[ns] 2010-12-31T12:00:00
Dimensions without coordinates: dim_1, dim_prof, dim_slyr
Data variables:
    stnm     (dim_1) float64 ...
    rpid     (dim_1) object ...
    clat     (dim_1) float32 ...
    clon     (dim_1) float32 ...
    gelv     (dim_1) float32 ...
    clss     (itim, ftim) float32 ...
    pres     (itim, ftim, dim_prof) float32 ...
    tmdb     (itim, ftim, dim_prof) float32 ...
    uwnd     (itim, ftim, dim_prof) float32 ...
    vwnd     (itim, ftim, dim_prof) float32 ...
    spfh     (itim, ftim, dim_prof) float32 ...
    omeg     (itim, ftim, dim_prof) float32 ...
    cwtr     (itim, ftim, dim_prof) float32 ...
    dtcp     (itim, ftim, dim_prof) float32 ...
    dtgp     (itim, ftim, dim_prof) float32 ...
    dtsw     (itim, ftim, dim_prof) float32 ...
    dtlw     (itim, ftim, dim_prof) float32 ...
    cfrl     (itim, ftim, dim_prof) float32 ...
    tkel     (itim, ftim, dim_prof) float32 ...
    imxr     (itim, ftim, dim_prof) float32 ...
    pmsl     (itim, ftim) float32 ...
    prss     (itim, ftim) float32 ...
    tmsk     (itim, ftim) float32 ...
    tmin     (itim, ftim) float32 ...
    tmax     (itim, ftim) float32 ...
    wtns     (itim, ftim) float32 ...
    tp01     (itim, ftim) float32 ...
    c01m     (itim, ftim) float32 ...
    srlm     (itim, ftim) float32 ...
    u10m     (itim, ftim) float32 ...
    v10m     (itim, ftim) float32 ...
    th10     (itim, ftim) float32 ...
    q10m     (itim, ftim) float32 ...
    t2ms     (itim, ftim) float32 ...
    q2ms     (itim, ftim) float32 ...
    sfex     (itim, ftim) float32 ...
    vegf     (itim, ftim) float32 ...
    cnpw     (itim, ftim) float32 ...
    fxlh     (itim, ftim) float32 ...
    fxlp     (itim, ftim) float32 ...
    fxsh     (itim, ftim) float32 ...
    fxss     (itim, ftim) float32 ...
    fxsn     (itim, ftim) float32 ...
    swrd     (itim, ftim) float32 ...
    swru     (itim, ftim) float32 ...
    lwrd     (itim, ftim) float32 ...
    lwru     (itim, ftim) float32 ...
    lwrt     (itim, ftim) float32 ...
    swrt     (itim, ftim) float32 ...
    snfl     (itim, ftim) float32 ...
    smoi     (itim, ftim) float32 ...
    swem     (itim, ftim) float32 ...
    n01m     (itim, ftim) float32 ...
    r01m     (itim, ftim) float32 ...
    bfgr     (itim, ftim) float32 ...
    sltb     (itim, ftim) float32 ...
    smc1     (itim, ftim, dim_slyr) float32 ...
    stc1     (itim, ftim, dim_slyr) float32 ...
    lsql     (itim, ftim) float32 ...
    lcld     (itim, ftim) float32 ...
    mcld     (itim, ftim) float32 ...
    hcld     (itim, ftim) float32 ...
    snra     (itim, ftim) float32 ...
    wxts     (itim, ftim) float32 ...
    wxtp     (itim, ftim) float32 ...
    wxtz     (itim, ftim) float32 ...
    wxtr     (itim, ftim) float32 ...
    ustm     (itim, ftim) float32 ...
    vstm     (itim, ftim) float32 ...
    hlcy     (itim, ftim) float32 ...
    cdbp     (itim, ftim) float32 ...
    hovi     (itim, ftim) float32 ...
Attributes:
    model:   Unknown
```
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-437631073,https://api.github.com/repos/pydata/xarray/issues/2554,437631073,MDEyOklzc3VlQ29tbWVudDQzNzYzMTA3Mw==,40218891,2018-11-10T23:49:22Z,2018-11-10T23:49:22Z,NONE,"No, it works fine.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634