html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/5106#issuecomment-822968365,https://api.github.com/repos/pydata/xarray/issues/5106,822968365,MDEyOklzc3VlQ29tbWVudDgyMjk2ODM2NQ==,40218891,2021-04-20T04:41:07Z,2021-04-20T04:41:07Z,NONE,"I am closing this issue. It is impossible to guess the proper time unit when dealing with missing data. Setting the attribute explicitly is a better solution.
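For the record, a minimal sketch of the explicit approach (file names are hypothetical; ``reftime`` is the coordinate from my report):
```
import xarray as xr

ds1 = xr.open_dataset('part1.nc')
ds1.reftime.encoding['units'] = 'hours since 1970-01-01'
ds1.to_netcdf('combined.nc', mode='w')  # unit is now fixed, not inferred
```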
A minor quibble: the statement
``ds1.reftime.encoding['units'] = 'hours since Big Bang'``
raises an exception
``AttributeError: 'NoneType' object has no attribute 'groups'``
It should be
``ValueError: invalid time units: hours since Big Bang``
the same as is raised for ``'hours after 1970-01-01'``.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,849751721
https://github.com/pydata/xarray/issues/5106#issuecomment-822108756,https://api.github.com/repos/pydata/xarray/issues/5106,822108756,MDEyOklzc3VlQ29tbWVudDgyMjEwODc1Ng==,40218891,2021-04-19T01:27:02Z,2021-04-19T01:28:38Z,NONE,"When the time dimension of the dataset being appended to has length 1, the inferred unit is ""days"". This happens on line 318 of coding/times.py: in that case the variable ``timedeltas`` is an empty array, and ``np.all`` over an empty array evaluates to True:
```
>>> import numpy as np
>>> np.all(np.array([]) % 86400000000000 == 0)
True
```
(which surprised me, by the way). When I forced ``_infer_time_units_from_diff`` to return ""hours"", the time coordinate in my example was evaluated correctly, so I think this particular code is the cause of the error.
Since the fallback return value is ""seconds"", I would argue that the empty-``timedeltas`` case should return ""seconds"" as well. Are there alternatives, or should I go ahead and create a pull request? A minimal sketch of the guard I have in mind:
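```
def _infer_time_units_from_diff(unique_timedeltas):
    # Proposed guard: an empty diff array gives no evidence for any unit,
    # so fall back to 'seconds' instead of vacuously matching 'days'.
    if unique_timedeltas.size == 0:
        return 'seconds'
    ...  # existing unit-inference logic unchanged
```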
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,849751721
https://github.com/pydata/xarray/issues/4830#issuecomment-767170277,https://api.github.com/repos/pydata/xarray/issues/4830,767170277,MDEyOklzc3VlQ29tbWVudDc2NzE3MDI3Nw==,40218891,2021-01-25T23:06:00Z,2021-01-25T23:06:00Z,NONE,"One could always set *source* to ``str(filename_or_object)``. In this case:
```
>>> import s3fs
>>> s3 = s3fs.S3FileSystem(anon=True)
>>> s3path = 's3://wrf-se-ak-ar5/gfdl/hist/daily/1980/WRFDS_1980-01-02.nc'
>>> fileset = s3.open(s3path)
>>> fileset.path
'wrf-se-ak-ar5/gfdl/hist/daily/1980/WRFDS_1980-01-02.nc'
```
The path is easy enough to recover here, but there is no guarantee that some other external file object's representation will be amenable to parsing; one possible fallback is sketched below.
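A sketch only, not xarray code (``filename_or_obj`` stands for whatever was passed to ``open_dataset``; the function name is made up):
```
def _infer_source(filename_or_obj):
    # Prefer an explicit .path attribute (e.g. s3fs file objects);
    # otherwise fall back to the generic string representation.
    path = getattr(filename_or_obj, 'path', None)
    return path if path is not None else str(filename_or_obj)
```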
If the fix is only for *s3fs*, reading the ``path`` attribute is more elegant; however, this would require xarray to be aware of the module.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,789653499
https://github.com/pydata/xarray/issues/4822#issuecomment-762438483,https://api.github.com/repos/pydata/xarray/issues/4822,762438483,MDEyOklzc3VlQ29tbWVudDc2MjQzODQ4Mw==,40218891,2021-01-18T19:39:34Z,2021-01-18T19:39:34Z,NONE,"You might be right. Adding ``-k nc4`` works when *string* is removed from the attribute specification. If it is present, the error is the same as before: ``AttributeError: 'numpy.ndarray' object has no attribute 'split'``.
However, after changing my AWS script to
```
import s3fs
import xarray as xr
s3 = s3fs.S3FileSystem(anon=True)
s3path = 's3://wrf-se-ak-ar5/gfdl/hist/daily/1988/WRFDS_1988-04-23.nc'
ds = xr.open_dataset(s3.open(s3path), engine='scipy')
print(ds)
```
the error becomes ``TypeError: Error: None is not a valid NetCDF 3 file``, presumably because the *scipy* engine reads only netCDF-3 files while this one is netCDF-4.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,787947436
https://github.com/pydata/xarray/issues/4822#issuecomment-762423707,https://api.github.com/repos/pydata/xarray/issues/4822,762423707,MDEyOklzc3VlQ29tbWVudDc2MjQyMzcwNw==,40218891,2021-01-18T19:03:19Z,2021-01-18T19:03:19Z,NONE,"This is how I did it:
```
$ ncdump /tmp/x.nc
netcdf x {
dimensions:
x = 1 ;
y = 1 ;
variables:
int foo(y, x) ;
foo:coordinates = ""x y"" ;
data:
foo =
0 ;
}
$ rm x.nc
$ ncgen -o x.nc < x.cdl
$ python -c ""import xarray as xr; ds = xr.open_dataset('/tmp/x.nc', engine='h5netcdf'); print(ds)""
```
Engine *netcdf4* works fine, with *string* or without.
My original code retrieving data from AWS:
```
import s3fs
import xarray as xr
s3 = s3fs.S3FileSystem(anon=True)
s3path = 's3://wrf-se-ak-ar5/gfdl/hist/daily/1988/WRFDS_1988-04-23.nc'
ds = xr.open_dataset(s3.open(s3path))
print(ds)
```
Adding ``decode_cf=False`` is a workaround (snippet below); all attributes then come back as arrays:
```
Attributes:
contact: ['rtladerjr@alaska.edu']
info: ['Alaska CASC']
data: ['Downscaled GFDL-CM3']
format: ['version 2']
date: ['Mon Jul 1 15:17:16 AKDT 2019']
```
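The workaround applied to the script above (same ``s3path``):
```
ds = xr.open_dataset(s3.open(s3path), decode_cf=False)
```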
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,787947436
https://github.com/pydata/xarray/issues/4822#issuecomment-762376418,https://api.github.com/repos/pydata/xarray/issues/4822,762376418,MDEyOklzc3VlQ29tbWVudDc2MjM3NjQxOA==,40218891,2021-01-18T17:12:53Z,2021-01-18T17:19:53Z,NONE,Dropping *string* changes the error to ``Unable to open file (file signature not found)``. This issue popped up while reading data from https://registry.opendata.aws/wrf-se-alaska-snap/ ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,787947436
https://github.com/pydata/xarray/issues/2871#issuecomment-481033093,https://api.github.com/repos/pydata/xarray/issues/2871,481033093,MDEyOklzc3VlQ29tbWVudDQ4MTAzMzA5Mw==,40218891,2019-04-08T22:35:10Z,2019-04-08T22:35:10Z,NONE,"After rethinking the issue, I would drop it: one can simply pass `dict.fromkeys(ds.data_vars.keys(), {})` as the `encoding` argument.
Going back to the original problem: the fix above is not enough; the `SerializationWarning` is still present. An alternative, provided that the `missing_value` attribute is still considered deprecated (see http://cfconventions.org/Data/cf-conventions/cf-conventions-1.1/build/cf-conventions.html#missing-data), would be to replace it with `_FillValue` on decoding:
```
$ diff variables.py variables.py.orig
179,180d178
< if '_FillValue' not in encoding:
<     encoding['_FillValue'] = encoding.pop('missing_value')
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,429914958
https://github.com/pydata/xarray/issues/2871#issuecomment-480475645,https://api.github.com/repos/pydata/xarray/issues/2871,480475645,MDEyOklzc3VlQ29tbWVudDQ4MDQ3NTY0NQ==,40218891,2019-04-06T05:24:52Z,2019-04-06T05:24:52Z,NONE,"Indeed it works. Thanks. My quick fix:
```
$ diff variables.py variables.py.orig
152,155d151
< elif encoding.get('missing_value') is not None:
<     fill_value = pop_to(encoding, attrs, 'missing_value', name=name)
<     if not pd.isnull(fill_value):
<         data = duck_array_ops.fillna(data, fill_value)
```
I also figured out how to write back floating-point values: `encoding=None` means reuse the existing encoding,
so specifying `encoding={'tmpk': {}}` in `to_netcdf()` did the trick. Should there be an option for this? As it stands, what you see on the screen is not what you get in the file.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,429914958
https://github.com/pydata/xarray/issues/2554#issuecomment-455351725,https://api.github.com/repos/pydata/xarray/issues/2554,455351725,MDEyOklzc3VlQ29tbWVudDQ1NTM1MTcyNQ==,40218891,2019-01-17T22:13:52Z,2019-01-17T22:13:52Z,NONE,After upgrading to Anaconda Python 3.7 the code works without crashes. I think this issue can be closed.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-439281383,https://api.github.com/repos/pydata/xarray/issues/2554,439281383,MDEyOklzc3VlQ29tbWVudDQzOTI4MTM4Mw==,40218891,2018-11-16T04:50:43Z,2018-11-16T04:50:43Z,NONE,"The error
`RuntimeError: NetCDF: Bad chunk sizes.`
is unrelated to the original problem with segv crashes. It is caused by a bug in the netCDF C library that was fixed in the latest version, 4.6.1. As of yesterday, the newest netcdf4-python manylinux wheel still bundles an older version, so the solution is to build netcdf4-python from source.
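One way to check which netCDF-C version a netcdf4-python build bundles (``__netcdf4libversion__`` is an existing attribute of the netCDF4 module):
```
import netCDF4
print(netCDF4.__netcdf4libversion__)  # needs to report 4.6.1 or newer
```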
The segv crashes occur with other datasets as well. Example test set I used:
```
import numpy as np
import pandas as pd
import xarray as xr

for year in range(2000, 2005):
    file = '/tmp/dx{:d}.nc'.format(year)
    # times = pd.date_range('{:d}-01-01'.format(year), '{:d}-12-31'.format(year), name='time')
    times = pd.RangeIndex(year, year + 300, name='time')
    v = np.array([np.random.random((32, 32)) for i in range(times.size)])
    dx = xr.Dataset({'v': (('time', 'y', 'x'), v)}, {'time': times})
    dx.to_netcdf(file, format='NETCDF4',
                 encoding={'time': {'chunksizes': (1024,)}},
                 unlimited_dims='time')
```
A simple fix is to change the scheduler as I did in my original post.
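For example, pinning dask to a single-threaded scheduler (a sketch of one way to do it; my original post may have used a different mechanism):
```
import dask
dask.config.set(scheduler='single-threaded')
```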
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-437647881,https://api.github.com/repos/pydata/xarray/issues/2554,437647881,MDEyOklzc3VlQ29tbWVudDQzNzY0Nzg4MQ==,40218891,2018-11-11T06:50:22Z,2018-11-11T06:50:22Z,NONE,I meant at random points during execution. The script crashed every time.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-437647777,https://api.github.com/repos/pydata/xarray/issues/2554,437647777,MDEyOklzc3VlQ29tbWVudDQzNzY0Nzc3Nw==,40218891,2018-11-11T06:47:47Z,2018-11-11T06:47:47Z,NONE,"[soundings.zip](https://github.com/pydata/xarray/files/2569126/soundings.zip)
I did some further tests; the crash occurs somewhat randomly.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-437646885,https://api.github.com/repos/pydata/xarray/issues/2554,437646885,MDEyOklzc3VlQ29tbWVudDQzNzY0Njg4NQ==,40218891,2018-11-11T06:22:27Z,2018-11-11T06:22:27Z,NONE,"About 600k for 2 files. I could spend some time trying to trim that down, but if there is a way to upload the whole set, it would be easier for me.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-437633544,https://api.github.com/repos/pydata/xarray/issues/2554,437633544,MDEyOklzc3VlQ29tbWVudDQzNzYzMzU0NA==,40218891,2018-11-11T00:38:03Z,2018-11-11T00:38:03Z,NONE,"Another puzzle; I don't know whether it is related to the crashes.
Trying to localize the issue, I added a line after `else` on line 453 in netCDF4_.py:
```
print('=======', name, encoding.get('chunksizes'))
```
and then ran:
```
ds0 = xr.open_dataset('/tmp/nam/bufr.701940/bufr.701940.2010123112.nc')
ds0.to_netcdf('/tmp/d0.nc')
```
This prints:
```
======= hlcy (1, 85)
======= cdbp (1, 85)
======= hovi (1, 85)
======= itim (1024,)
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input> in <module>()
1 ds0 = xr.open_dataset('/tmp/nam/bufr.701940/bufr.701940.2010123112.nc')
----> 2 ds0.to_netcdf('/tmp/d0.nc')
/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute)
1220 engine=engine, encoding=encoding,
1221 unlimited_dims=unlimited_dims,
-> 1222 compute=compute)
1223
1224 def to_zarr(self, store=None, mode='w-', synchronizer=None, group=None,
/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile)
718 # to be parallelized with dask
719 dump_to_store(dataset, store, writer, encoding=encoding,
--> 720 unlimited_dims=unlimited_dims)
721 if autoclose:
722 store.close()
/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
761
762 store.store(variables, attrs, check_encoding, writer,
--> 763 unlimited_dims=unlimited_dims)
764
765
/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
264 self.set_dimensions(variables, unlimited_dims=unlimited_dims)
265 self.set_variables(variables, check_encoding_set, writer,
--> 266 unlimited_dims=unlimited_dims)
267
268 def set_attributes(self, attributes):
/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set, writer, unlimited_dims)
302 check = vn in check_encoding_set
303 target, source = self.prepare_variable(
--> 304 name, v, check, unlimited_dims=unlimited_dims)
305
306 writer.add(source, target)
/usr/local/Python-3.6.5/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in prepare_variable(self, name, variable, check_encoding, unlimited_dims)
466 least_significant_digit=encoding.get(
467 'least_significant_digit'),
--> 468 fill_value=fill_value)
469 _disable_auto_decode_variable(nc4_var)
470
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Dataset.createVariable()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4.Variable.__init__()
netCDF4/_netCDF4.pyx in netCDF4._netCDF4._ensure_nc_success()
RuntimeError: NetCDF: Bad chunk sizes.
```
Note in the print output above that ``itim`` carries ``chunksizes`` ``(1024,)`` even though the ``itim`` dimension has length 1; presumably this is the chunk size netCDF rejects. The dataset is:
```
Dimensions: (dim_1: 1, dim_prof: 60, dim_slyr: 4, ftim: 85, itim: 1)
Coordinates:
* ftim (ftim) timedelta64[ns] 00:00:00 01:00:00 ... 3 days 12:00:00
* itim (itim) datetime64[ns] 2010-12-31T12:00:00
Dimensions without coordinates: dim_1, dim_prof, dim_slyr
Data variables:
stnm (dim_1) float64 ...
rpid (dim_1) object ...
clat (dim_1) float32 ...
clon (dim_1) float32 ...
gelv (dim_1) float32 ...
clss (itim, ftim) float32 ...
pres (itim, ftim, dim_prof) float32 ...
tmdb (itim, ftim, dim_prof) float32 ...
uwnd (itim, ftim, dim_prof) float32 ...
vwnd (itim, ftim, dim_prof) float32 ...
spfh (itim, ftim, dim_prof) float32 ...
omeg (itim, ftim, dim_prof) float32 ...
cwtr (itim, ftim, dim_prof) float32 ...
dtcp (itim, ftim, dim_prof) float32 ...
dtgp (itim, ftim, dim_prof) float32 ...
dtsw (itim, ftim, dim_prof) float32 ...
dtlw (itim, ftim, dim_prof) float32 ...
cfrl (itim, ftim, dim_prof) float32 ...
tkel (itim, ftim, dim_prof) float32 ...
imxr (itim, ftim, dim_prof) float32 ...
pmsl (itim, ftim) float32 ...
prss (itim, ftim) float32 ...
tmsk (itim, ftim) float32 ...
tmin (itim, ftim) float32 ...
tmax (itim, ftim) float32 ...
wtns (itim, ftim) float32 ...
tp01 (itim, ftim) float32 ...
c01m (itim, ftim) float32 ...
srlm (itim, ftim) float32 ...
u10m (itim, ftim) float32 ...
v10m (itim, ftim) float32 ...
th10 (itim, ftim) float32 ...
q10m (itim, ftim) float32 ...
t2ms (itim, ftim) float32 ...
q2ms (itim, ftim) float32 ...
sfex (itim, ftim) float32 ...
vegf (itim, ftim) float32 ...
cnpw (itim, ftim) float32 ...
fxlh (itim, ftim) float32 ...
fxlp (itim, ftim) float32 ...
fxsh (itim, ftim) float32 ...
fxss (itim, ftim) float32 ...
fxsn (itim, ftim) float32 ...
swrd (itim, ftim) float32 ...
swru (itim, ftim) float32 ...
lwrd (itim, ftim) float32 ...
lwru (itim, ftim) float32 ...
lwrt (itim, ftim) float32 ...
swrt (itim, ftim) float32 ...
snfl (itim, ftim) float32 ...
smoi (itim, ftim) float32 ...
swem (itim, ftim) float32 ...
n01m (itim, ftim) float32 ...
r01m (itim, ftim) float32 ...
bfgr (itim, ftim) float32 ...
sltb (itim, ftim) float32 ...
smc1 (itim, ftim, dim_slyr) float32 ...
stc1 (itim, ftim, dim_slyr) float32 ...
lsql (itim, ftim) float32 ...
lcld (itim, ftim) float32 ...
mcld (itim, ftim) float32 ...
hcld (itim, ftim) float32 ...
snra (itim, ftim) float32 ...
wxts (itim, ftim) float32 ...
wxtp (itim, ftim) float32 ...
wxtz (itim, ftim) float32 ...
wxtr (itim, ftim) float32 ...
ustm (itim, ftim) float32 ...
vstm (itim, ftim) float32 ...
hlcy (itim, ftim) float32 ...
cdbp (itim, ftim) float32 ...
hovi (itim, ftim) float32 ...
Attributes:
model: Unknown
```
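A hypothetical way to sidestep the error when copying (not what this thread settled on; it simply drops the chunk settings inherited from the source file so the backend can pick its own):
```
for var in ds0.variables.values():
    var.encoding.pop('chunksizes', None)
ds0.to_netcdf('/tmp/d0.nc')
```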
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634
https://github.com/pydata/xarray/issues/2554#issuecomment-437631073,https://api.github.com/repos/pydata/xarray/issues/2554,437631073,MDEyOklzc3VlQ29tbWVudDQzNzYzMTA3Mw==,40218891,2018-11-10T23:49:22Z,2018-11-10T23:49:22Z,NONE,"No, it works fine.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,379472634