
issues: 437418525


id: 437418525
node_id: MDU6SXNzdWU0Mzc0MTg1MjU=
number: 2921
title: to_netcdf with decoded time can create file with inconsistent time:units and time_bounds:units
user: 15570875
state: closed
locked: 0
assignee:
milestone:
comments: 4
created_at: 2019-04-25T22:08:52Z
updated_at: 2019-06-25T00:24:42Z
closed_at: 2019-06-25T00:24:42Z
author_association: NONE
active_lock_reason:
draft:
pull_request:
body:

Code Sample, a copy-pastable example if possible

```python
import numpy as np
import xarray as xr

# create time and time_bounds DataArrays for Jan-1850 and Feb-1850
time_bounds_vals = np.array([[0.0, 31.0], [31.0, 59.0]])
time_vals = time_bounds_vals.mean(axis=1)

time_var = xr.DataArray(time_vals, dims='time', coords={'time':time_vals})
time_bounds_var = xr.DataArray(time_bounds_vals, dims=('time', 'd2'), coords={'time':time_vals})

# create Dataset of time and time_bounds
ds = xr.Dataset(coords={'time':time_var}, data_vars={'time_bounds':time_bounds_var})
ds.time.attrs = {'bounds':'time_bounds', 'calendar':'noleap', 'units':'days since 1850-01-01'}

# write Jan-1850 values to file
ds.isel(time=slice(0,1)).to_netcdf('Jan-1850.nc', unlimited_dims='time')

# write Feb-1850 values to file
ds.isel(time=slice(1,2)).to_netcdf('Feb-1850.nc', unlimited_dims='time')

# use open_mfdataset to read in files, combining into 1 Dataset
decode_times = True
decode_cf = True
ds = xr.open_mfdataset(['Jan-1850.nc', 'Feb-1850.nc'],
                       decode_cf=decode_cf, decode_times=decode_times)

# write combined Dataset out
ds.to_netcdf('JanFeb-1850.nc', unlimited_dims='time')
```

Problem description

The above code initially creates 2 netCDF files, for Jan-1850 and Feb-1850, that have the variables time and time_bounds, and time:bounds='time_bounds'. It then reads the 2 files back in as a single Dataset, using open_mfdataset, and this Dataset is written back out to a netCDF file. ncdump of this final file is

```
netcdf JanFeb-1850 {
dimensions:
	time = UNLIMITED ; // (2 currently)
	d2 = 2 ;
variables:
	int64 time(time) ;
		time:bounds = "time_bounds" ;
		time:units = "hours since 1850-01-16 12:00:00.000000" ;
		time:calendar = "noleap" ;
	double time_bounds(time, d2) ;
		time_bounds:_FillValue = NaN ;
		time_bounds:units = "days since 1850-01-01" ;
		time_bounds:calendar = "noleap" ;
data:

 time = 0, 708 ;

 time_bounds =
  0, 31,
  31, 59 ;
}
```

The problem is that the units attributes for time and time_bounds are different in this file, contrary to what the CF conventions require.
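To make the mismatch easy to spot, here is a minimal sketch of a consistency check. It is not part of xarray; the function name find_bounds_attr_mismatches and the plain-dict input are assumptions made for illustration. It takes the raw attributes of each variable (for example as read with decode_times=False) and reports any bounds variable whose units or calendar disagree with its parent coordinate.

```python
def find_bounds_attr_mismatches(attrs_by_var):
    """Return (var, bounds_var, attr, var_value, bounds_value) for each mismatch.

    attrs_by_var maps each variable name to its raw (undecoded) attribute dict.
    This is a hypothetical helper, not an xarray API.
    """
    mismatches = []
    for name, attrs in attrs_by_var.items():
        bounds_name = attrs.get("bounds")
        if bounds_name is None or bounds_name not in attrs_by_var:
            continue
        bounds_attrs = attrs_by_var[bounds_name]
        for key in ("units", "calendar"):
            # A bounds variable may omit these attributes; compare only when present.
            if key in bounds_attrs and bounds_attrs[key] != attrs.get(key):
                mismatches.append(
                    (name, bounds_name, key, attrs.get(key), bounds_attrs[key])
                )
    return mismatches


# Attributes copied from the ncdump output of JanFeb-1850.nc above.
jan_feb_attrs = {
    "time": {
        "bounds": "time_bounds",
        "units": "hours since 1850-01-16 12:00:00.000000",
        "calendar": "noleap",
    },
    "time_bounds": {"units": "days since 1850-01-01", "calendar": "noleap"},
}

for mismatch in find_bounds_attr_mismatches(jan_feb_attrs):
    print(mismatch)
```

For the attributes shown, the check reports one mismatch, on units; calendar agrees between the two variables.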

The final call to to_netcdf is creating a file where time's units (and type) differ from what they are in the intermediate files. These transformations are not being applied to time_bounds.
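The new time values describe the same instants as the original ones, only relative to a different reference date and in a different unit. The arithmetic below is a sketch of that conversion using only the numbers from the code sample and the ncdump output. Whole months are not involved, so the noleap calendar does not change the result here.

```python
# Bounds from the code sample, in "days since 1850-01-01".
time_bounds_vals = [[0.0, 31.0], [31.0, 59.0]]
mid_days = [(lo + hi) / 2 for lo, hi in time_bounds_vals]

# The new reference, 1850-01-16 12:00:00, lies 15 days and 12 hours
# after 1850-01-01 00:00:00.
new_ref_days = 15 + 12 / 24

# Re-express the mid-month values as hours since the new reference.
hours_since_new_ref = [(d - new_ref_days) * 24 for d in mid_days]

print(mid_days)             # [15.5, 45.0]
print(hours_since_new_ref)  # [0.0, 708.0]
```

The results match the time = 0, 708 values in the final file. Both are whole numbers of hours, which is consistent with time being written as int64, although the exact rule xarray uses to choose the units is not stated in this report.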

While the change to time's type is not necessarily an issue, I do find it surprising.

This inconsistency goes away if either of decode_times or decode_cf is set to False in the python code above. In particular, the transformations to time's units and type do not happen.

The inconsistency also goes away if open_mfdataset opens a single file. In this scenario also, the transformations to time's units and type do not happen.

I think that the desired behavior is to either not apply the units and type transformations to time, or to also apply them to time_bounds. The first option would be consistent with the current single-file behavior.
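Until the behavior changes, one possible workaround is to state the time encoding explicitly when writing, so that time keeps the units already attached to time_bounds. The sketch below has not been tested against xarray 0.12.1. The helper names pinned_time_encoding and write_with_pinned_time are hypothetical; the only xarray feature relied on is the encoding argument of Dataset.to_netcdf, which accepts units, calendar and dtype for time variables.

```python
def pinned_time_encoding(time_attrs, time_name="time"):
    """Build a to_netcdf `encoding` mapping that reuses the original time units.

    Hypothetical helper: time_attrs is the attribute dict the time variable
    had before decoding (as set in the code sample).
    """
    # float64 because the mid-month values (15.5, 45.0) are not whole days.
    enc = {"units": time_attrs["units"], "dtype": "float64"}
    if "calendar" in time_attrs:
        enc["calendar"] = time_attrs["calendar"]
    return {time_name: enc}


def write_with_pinned_time(ds, path, time_attrs):
    """Write ds (an xarray.Dataset) with the time units fixed explicitly."""
    ds.to_netcdf(
        path,
        unlimited_dims="time",
        encoding=pinned_time_encoding(time_attrs),
    )


original_attrs = {
    "bounds": "time_bounds",
    "calendar": "noleap",
    "units": "days since 1850-01-01",
}
print(pinned_time_encoding(original_attrs))
```

With the combined Dataset from the code sample, the call would be write_with_pinned_time(ds, 'JanFeb-1850.nc', original_attrs). This corresponds to the first option above: time keeps its original units and a floating-point type instead of being transformed.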

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 3.12.62-60.64.8-default
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.12.1
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudonetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 1.1.5
distributed: 1.26.1
matplotlib: 3.0.3
cartopy: None
seaborn: None
setuptools: 40.8.0
pip: 19.0.3
conda: None
pytest: 4.3.1
IPython: 7.4.0
sphinx: None
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/2921/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
state_reason: completed
repo: 13221727
type: issue
