issues


2 rows where type = "issue" and user = 33062222 sorted by updated_at descending

#3739: ValueError when trying to encode time variable in a NetCDF file with CF conventions

  • id: 558293655 · node_id: MDU6SXNzdWU1NTgyOTM2NTU=
  • user: avatar101 (33062222) · state: closed (completed) · comments: 7
  • created: 2020-01-31T18:22:36Z · updated: 2023-09-13T13:45:47Z · closed: 2023-09-13T13:45:46Z
  • author_association: NONE · repo: xarray · type: issue

```python
# Imports
import numpy as np
import xarray as xr
import pandas as pd
from glob import glob

# files to be concatenated
files = sorted(glob(path + str(1988) + '/V250*'))

# corrected dates
dates = pd.date_range(start=str(yr), end=str(yr+1), freq='6H', closed='left')

ds_test = xr.open_mfdataset(files[:10], combine='nested', concat_dim='time', decode_cf=False)

# correcting time
ds_test.time.values = dates[:10]

# fixing encoding
ds_test.time.attrs['units'] = "Seconds since 1970-01-01 00:00:00"

# preview of the time variable
print(ds_test.time)
```

```
<xarray.DataArray 'time' (time: 10)>
array(['1988-01-01T00:00:00.000000000', '1988-01-01T06:00:00.000000000',
       '1988-01-01T12:00:00.000000000', '1988-01-01T18:00:00.000000000',
       '1988-01-02T00:00:00.000000000', '1988-01-02T06:00:00.000000000',
       '1988-01-02T12:00:00.000000000', '1988-01-02T18:00:00.000000000',
       '1988-01-03T00:00:00.000000000', '1988-01-03T06:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 1988-01-01 ... 1988-01-03T06:00:00
Attributes:
    calendar:       proleptic_gregorian
    standard_name:  time
    units:          Seconds since 1970-01-01 00:00:00
```

```python
ds_test.to_netcdf(path + 'test.nc')
```

```
ValueError: failed to prevent overwriting existing key units in attrs on variable 'time'.
This is probably an encoding field used by xarray to describe how a variable is serialized.
To proceed, remove this key from the variable's attributes manually.
```
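The clash can be reproduced without the original V250 files. The following is a minimal sketch on a small in-memory dataset (my own stand-in, not from the issue; it assumes a netCDF backend such as netCDF4 or scipy is installed): putting CF time units into `.attrs` of a datetime64 coordinate collides with the `units` attribute xarray writes itself when serializing times.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import xarray as xr

# small stand-in dataset with a datetime64 time coordinate
times = pd.date_range("1988-01-01", periods=10, freq="6h")
ds = xr.Dataset({"V": ("time", np.zeros(10))}, coords={"time": times})

# putting CF time units into .attrs clashes with xarray's own time
# serialization, which wants to write that attribute itself
ds.time.attrs["units"] = "Seconds since 1970-01-01 00:00:00"

path = os.path.join(tempfile.mkdtemp(), "repro.nc")
msg = ""
try:
    ds.to_netcdf(path)
except ValueError as e:
    msg = str(e)
print(msg)
```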

Expected Output

Correctly encode time so that the file is saved with the time values properly converted according to the reference units. I have the flexibility to drop the CF conventions as long as the time values are correct, but it would also be nice to have a solution that keeps the CF conventions intact.

Problem Description

I'm trying to concatenate netCDF files which declare CF conventions in their global attributes. These files have an incorrect time dimension, which I try to fix with the code above. It seems that some existing encoding is preventing the files from being written back, but when I print the encoding, it doesn't show any such clashing units. I'm not sure whether this is a bug or a usage error, so any help on how to correctly encode time so that the file is saved with the time values converted according to the reference units is much appreciated.

```python
# More diagnostics on the encoding
print(ds_test.encoding)
# {'unlimited_dims': {'time'}, 'source': '/file/to/path/V250_19880101_00'}

# checking for any existing time encoding
print(ds_test.time.encoding)
# {}

# another try at setting the time encoding
ds_test.time.encoding['units'] = "Seconds since 1970-01-01 00:00:00"

# writing the file gives the same ValueError as above
ds_test.to_netcdf(path + 'test.nc')
```

ncdump output of one of the files:

```
netcdf V250_19880101_06 {
dimensions:
	lon = 720 ;
	lat = 361 ;
	lev = 1 ;
	time = UNLIMITED ; // (1 currently)
variables:
	float lon(lon) ;
		lon:long_name = "longitude" ;
		lon:units = "degrees_east" ;
		lon:standard_name = "longitude" ;
		lon:axis = "X" ;
	float lat(lat) ;
		lat:long_name = "latitude" ;
		lat:units = "degrees_north" ;
		lat:standard_name = "latitude" ;
		lat:axis = "Y" ;
	float lev(lev) ;
		lev:long_name = "hybrid level at layer midpoints" ;
		lev:units = "level" ;
		lev:standard_name = "hybrid_sigma_pressure" ;
		lev:positive = "down" ;
		lev:formula = "hyam hybm (mlev=hyam+hybm*aps)" ;
		lev:formula_terms = "ap: hyam b: hybm ps: aps" ;
	float time(time) ;
		time:units = "hours since 1988-01-01 06:00:00" ;
		time:calendar = "proleptic_gregorian" ;
		time:standard_name = "time" ;
	float V(time, lev, lat, lon) ;
		V:long_name = "unknown (please add with NCO)" ;
		V:units = "unknown (please add with NCO)" ;
		V:_FillValue = -999.99f ;

// global attributes:
		:Conventions = "CF" ;
		:constants_file_name = "P19880101_06" ;
		:institution = "IACETH" ;
		:lonmin = -180.f ;
		:lonmax = 179.5f ;
		:latmin = -90.f ;
		:latmax = 90.f ;
		:levmin = 250.f ;
		:levmax = 250.f ;
		:history = "Fri Sep 6 15:59:17 2019: ncatted -a units,time,o,c,hours since 1988-01-01 06:00:00 -a standard_name,time,o,c,time V250_19880101_06" ;
		:NCO = "4.7.2" ;
data:

 time = 6 ;
}
```

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.0.0-23-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.1
xarray: 0.13.0
pandas: 0.25.3
numpy: 1.18.1
scipy: 1.3.2
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2.9.2
distributed: 2.9.3
matplotlib: 3.1.0
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 44.0.0.post20200106
pip: 19.3.1
conda: None
pytest: None
IPython: 7.11.1
sphinx: None
```
#2758: Dataset.to_netcdf() results in unexpected encoding parameters for 'netCDF4' backend

  • id: 408426920 · node_id: MDU6SXNzdWU0MDg0MjY5MjA=
  • user: avatar101 (33062222) · state: closed (completed) · comments: 2
  • created: 2019-02-09T12:25:54Z · updated: 2019-02-11T09:59:06Z · closed: 2019-02-11T09:59:06Z
  • author_association: NONE · repo: xarray · type: issue

```python
import pandas as pd
import xarray as xr
from datetime import datetime

ds_test2 = xr.open_dataset('test_file.nc')
```

ncdump to show what the file looks like:

```
netcdf test_file {
dimensions:
	lon = 720 ;
	lev = 1 ;
	time = 27147 ;
variables:
	float lon(lon) ;
		lon:_FillValue = NaNf ;
	float lev(lev) ;
		lev:_FillValue = NaNf ;
		lev:long_name = "hybrid level at layer midpoints" ;
		lev:units = "level" ;
		lev:standard_name = "hybrid_sigma_pressure" ;
		lev:positive = "down" ;
		lev:formula = "hyam hybm (mlev=hyam+hybm*aps)" ;
		lev:formula_terms = "ap: hyam b: hybm ps: aps" ;
	int64 time(time) ;
		time:units = "hours since 2000-01-01 00:00:00" ;
		time:calendar = "proleptic_gregorian" ;
	float V(time, lev, lon) ;
		V:_FillValue = NaNf ;
		V:units = "m/s" ;
}
```

```python
std_time = datetime(1970, 1, 1)
timedata = pd.to_datetime(ds_test2.time.values).to_pydatetime()
timedata_updated = [(t - std_time).total_seconds() for t in timedata]
ds_test2.time.values = timedata_updated
ds_test2.time.attrs['units'] = 'Seconds since 01-01-1970 00:00:00 UTC'

# saving file
ds_test2.to_netcdf('/scratch3/mali/data/test/test_V250hov_encoding4_v2.nc',
                   encoding={'V': {'_FillValue': -999.0},
                             'time': {'units': "seconds since 1970-01-01 00:00:00"}})
```


```
ValueError                                Traceback (most recent call last)
<ipython-input-26-04662c00dfc2> in <module>
      6 # saving file to netcdf for one combined hov dataset
      7 ds_test2.to_netcdf('/scratch3/mali/data/test/test_V250hov_encoding4_v2.nc',
----> 8     encoding={'V':{'_FillValue': -999.0},'time':{'units': "seconds since 1970-01-01 00:00:00"}})

/usr/local/anaconda3/envs/work_env/lib/python3.6/site-packages/xarray/core/dataset.py in to_netcdf(self, path, mode, format, group, engine, encoding, unlimited_dims, compute)
   1220             engine=engine, encoding=encoding,
   1221             unlimited_dims=unlimited_dims,
-> 1222             compute=compute)
   1223
   1224     def to_zarr(self, store=None, mode='w-', synchronizer=None, group=None,

/usr/local/anaconda3/envs/work_env/lib/python3.6/site-packages/xarray/backends/api.py in to_netcdf(dataset, path_or_file, mode, format, group, engine, encoding, unlimited_dims, compute, multifile)
    716         # to be parallelized with dask
    717         dump_to_store(dataset, store, writer, encoding=encoding,
--> 718                       unlimited_dims=unlimited_dims)
    719         if autoclose:
    720             store.close()

/usr/local/anaconda3/envs/work_env/lib/python3.6/site-packages/xarray/backends/api.py in dump_to_store(dataset, store, writer, encoder, encoding, unlimited_dims)
    759
    760     store.store(variables, attrs, check_encoding, writer,
--> 761                 unlimited_dims=unlimited_dims)
    762
    763

/usr/local/anaconda3/envs/work_env/lib/python3.6/site-packages/xarray/backends/common.py in store(self, variables, attributes, check_encoding_set, writer, unlimited_dims)
    264         self.set_dimensions(variables, unlimited_dims=unlimited_dims)
    265         self.set_variables(variables, check_encoding_set, writer,
--> 266                            unlimited_dims=unlimited_dims)
    267
    268     def set_attributes(self, attributes):

/usr/local/anaconda3/envs/work_env/lib/python3.6/site-packages/xarray/backends/common.py in set_variables(self, variables, check_encoding_set, writer, unlimited_dims)
    302             check = vn in check_encoding_set
    303             target, source = self.prepare_variable(
--> 304                 name, v, check, unlimited_dims=unlimited_dims)
    305
    306             writer.add(source, target)

/usr/local/anaconda3/envs/work_env/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in prepare_variable(self, name, variable, check_encoding, unlimited_dims)
    448         encoding = _extract_nc4_variable_encoding(
    449             variable, raise_on_invalid=check_encoding,
--> 450             unlimited_dims=unlimited_dims)
    451         if name in self.ds.variables:
    452             nc4_var = self.ds.variables[name]

/usr/local/anaconda3/envs/work_env/lib/python3.6/site-packages/xarray/backends/netCDF4_.py in _extract_nc4_variable_encoding(variable, raise_on_invalid, lsd_okay, h5py_okay, backend, unlimited_dims)
    223     if invalid:
    224         raise ValueError('unexpected encoding parameters for %r backend: '
--> 225                          ' %r' % (backend, invalid))
    226     else:
    227         for k in list(encoding):

ValueError: unexpected encoding parameters for 'netCDF4' backend: ['units']
```

Problem description

I'm trying to change the time attributes because some scripts in the workflow are not in Python and need the time to start from a specific year, so I wrote the code above to calculate seconds from a chosen reference time. Later I realised that I don't need to do that, since xarray takes care of the conversion when saving the data with the corresponding encoding parameter. The strange thing is that writing the file with the approach above gives me an error, yet when I just read in the same file and save it with the same encoding, without changing the time values manually, it works fine. Here's what I mean:

```python
ds_test3 = xr.open_dataset('test_file.nc')  # same file as before

# saving directly, without doing any calculations like before
ds_test3.to_netcdf('/scratch3/mali/data/test/test_V250hov_encoding4_v2.nc',
                   encoding={'V': {'_FillValue': -999.0},
                             'time': {'units': "seconds since 1970-01-01 00:00:00"}})

# the above code works fine
```

Output of xr.show_versions()

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-45-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.11.0
pandas: 0.23.4
numpy: 1.15.2
scipy: 1.1.0
netCDF4: 1.4.2
h5netcdf: None
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.0b1
PseudonetCDF: None
rasterio: None
iris: 2.2.0
bottleneck: 1.2.1
cyordereddict: None
dask: 0.19.2
distributed: 1.23.2
matplotlib: 2.2.2
cartopy: 0.16.0
seaborn: 0.9.0
setuptools: 40.4.3
pip: 10.0.1
conda: None
pytest: 3.8.1
IPython: 7.0.1
sphinx: 1.8.1
```

I know that I can change the time to a standard calendar without performing the manual calculations I did, but I would like to know how my calculations modify the dataset such that the netCDF4 backend no longer recognises the `['units']` encoding for time.
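A sketch of what I believe is going on, on a tiny stand-in dataset (my own illustration, not from the issue thread): while the time coordinate holds datetime64 values, xarray's CF time encoder consumes the `units` entry of the encoding before anything reaches the backend. Once the values are replaced with raw floats, time is ordinary numeric data, so a time-units encoding entry would be handed straight to the netCDF4 backend, which only accepts low-level keys (e.g. `dtype`, `zlib`, `chunksizes`) and rejects `units`.

```python
import numpy as np
import pandas as pd
import xarray as xr

times = pd.date_range("2000-01-01", periods=4, freq="h")
ds = xr.Dataset({"V": ("time", np.zeros(4))}, coords={"time": times})

# datetime64 values: encoding['units'] is consumed by the CF time
# encoder during to_netcdf, so the backend never sees it
print(ds.time.dtype)  # datetime64[ns]

# replacing the values with plain floats (as in the snippet above)
# turns time into ordinary numeric data ...
seconds = [(t - pd.Timestamp("1970-01-01")).total_seconds() for t in times]
ds = ds.assign_coords(time=("time", seconds))

# ... so a 'units' encoding entry would now be passed through to the
# netCDF4 backend unconsumed, triggering the ValueError in the issue
print(ds.time.dtype)  # float64
```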


Powered by Datasette · Queries took 1203.584ms · About: xarray-datasette