issues: 1878016712

id: 1878016712
node_id: I_kwDOAMm_X85v8ELI
number: 8137
title: `time` variable encoding changes upon using `to_netcdf` method on a `DataSet`
user: 50383939
state: open
locked: 0
comments: 2
created_at: 2023-09-01T20:34:58Z
updated_at: 2023-09-15T05:32:15Z
author_association: NONE

What is your issue?

When I call the `to_netcdf` method of a Dataset, the encoding (local attributes) of the time variable changes. More specifically, the `units` attribute is written in a different format from the one I requested. Here is a reproducible example:

```python-console
$ ipython
Python 3.10.2 (main, Feb 4 2022, 19:10:35) [GCC 9.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.10.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import xarray as xr

In [2]: import numpy as np

In [3]: import pandas as pd

In [4]: np.random.seed(0)
   ...: temperature = 15 + 8 * np.random.randn(2, 2, 25)
   ...: precipitation = 10 * np.random.rand(2, 2, 25)
   ...: lon = [[-99.83, -99.32], [-99.79, -99.23]]
   ...: lat = [[42.25, 42.21], [42.63, 42.59]]
   ...: time = pd.date_range("2014-09-06", "2014-09-07",freq='H')
   ...: reference_time = pd.Timestamp("2014-09-05")

In [5]: ds = xr.Dataset(
   ...:     data_vars=dict(
   ...:         temperature=(["x", "y", "time"], temperature),
   ...:         precipitation=(["x", "y", "time"], precipitation),
   ...:     ),
   ...:     coords=dict(
   ...:         lon=(["x", "y"], lon),
   ...:         lat=(["x", "y"], lat),
   ...:         time=time,
   ...:         reference_time=reference_time,
   ...:     ),
   ...:     attrs=dict(description="Weather related data."),
   ...: )
   ...: ds
Out[5]:
<xarray.Dataset>
Dimensions:         (x: 2, y: 2, time: 25)
Coordinates:
    lon             (x, y) float64 -99.83 -99.32 -99.79 -99.23
    lat             (x, y) float64 42.25 42.21 42.63 42.59
  * time            (time) datetime64[ns] 2014-09-06 ... 2014-09-07
    reference_time  datetime64[ns] 2014-09-05
Dimensions without coordinates: x, y
Data variables:
    temperature     (x, y, time) float64 29.11 18.2 22.83 ... 29.29 16.02 18.22
    precipitation   (x, y, time) float64 4.239 6.064 0.1919 ... 8.727 2.735 7.98
Attributes:
    description:  Weather related data.

In [6]: ds.time
Out[6]:
<xarray.DataArray 'time' (time: 25)>
array(['2014-09-06T00:00:00.000000000', '2014-09-06T01:00:00.000000000',
       '2014-09-06T02:00:00.000000000', '2014-09-06T03:00:00.000000000',
       '2014-09-06T04:00:00.000000000', '2014-09-06T05:00:00.000000000',
       '2014-09-06T06:00:00.000000000', '2014-09-06T07:00:00.000000000',
       '2014-09-06T08:00:00.000000000', '2014-09-06T09:00:00.000000000',
       '2014-09-06T10:00:00.000000000', '2014-09-06T11:00:00.000000000',
       '2014-09-06T12:00:00.000000000', '2014-09-06T13:00:00.000000000',
       '2014-09-06T14:00:00.000000000', '2014-09-06T15:00:00.000000000',
       '2014-09-06T16:00:00.000000000', '2014-09-06T17:00:00.000000000',
       '2014-09-06T18:00:00.000000000', '2014-09-06T19:00:00.000000000',
       '2014-09-06T20:00:00.000000000', '2014-09-06T21:00:00.000000000',
       '2014-09-06T22:00:00.000000000', '2014-09-06T23:00:00.000000000',
       '2014-09-07T00:00:00.000000000'], dtype='datetime64[ns]')
Coordinates:
  * time            (time) datetime64[ns] 2014-09-06 ... 2014-09-07
    reference_time  datetime64[ns] 2014-09-05

In [7]: ds.time.encoding
Out[7]: {}

In [9]: ds.to_netcdf("./test.nc", encoding={'time': {'units': 'hours since 2014-09-01 12:00:00'}})

In [10]: !ncdump -h ./test.nc
netcdf test {
dimensions:
    x = 2 ;
    y = 2 ;
    time = 25 ;
variables:
    double temperature(x, y, time) ;
        temperature:_FillValue = NaN ;
        temperature:coordinates = "lat lon reference_time" ;
    double precipitation(x, y, time) ;
        precipitation:_FillValue = NaN ;
        precipitation:coordinates = "lat lon reference_time" ;
    double lon(x, y) ;
        lon:_FillValue = NaN ;
    double lat(x, y) ;
        lat:_FillValue = NaN ;
    int64 time(time) ;
        time:units = "hours since 2014-09-01T12:00:00" ;  <------- this is the problem
        time:calendar = "proleptic_gregorian" ;
    int64 reference_time ;
        reference_time:units = "days since 2014-09-05 00:00:00" ;
        reference_time:calendar = "proleptic_gregorian" ;

// global attributes:
        :description = "Weather related data." ;
}

In [11]: ds.info()
xarray.Dataset {
dimensions:
    x = 2 ;
    y = 2 ;
    time = 25 ;

variables:
    float64 temperature(x, y, time) ;
    float64 precipitation(x, y, time) ;
    float64 lon(x, y) ;
    float64 lat(x, y) ;
    datetime64[ns] time(time) ;
    datetime64[ns] reference_time() ;

// global attributes:
    :description = Weather related data. ;
}
```

My only concern is the `T` in the "hours since 2014-09-01T12:00:00" string in the final netCDF file. I would like to control this, but even when I pass an encoding dictionary that sets the `units` attribute, the `T` still appears in the attribute string.

The sample dataset is taken from here: https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html

How can I avoid this? Any suggestions are welcome. I did my best to search for an answer. Thanks.
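To make the difference concrete, here is a minimal sketch (standard library only, not xarray's own code) that compares the requested units string with the one found in the file. The helper names `split_units` and `with_space_separator` are hypothetical and exist only for this illustration. It assumes the units follow the "delta since reference" pattern shown above, and it says nothing about why xarray writes the `T`.

```python
from __future__ import annotations

from datetime import datetime


def split_units(units: str) -> tuple[str, datetime]:
    """Split a 'delta since reference' string into its delta and reference time."""
    delta, _, reference = units.partition(" since ")
    # datetime.fromisoformat accepts either "T" or a space between date and time.
    return delta.strip(), datetime.fromisoformat(reference.strip())


def with_space_separator(units: str) -> str:
    """Hypothetical helper: rebuild the units string with a space instead of 'T'."""
    delta, reference = split_units(units)
    return f"{delta} since {reference.strftime('%Y-%m-%d %H:%M:%S')}"


requested = "hours since 2014-09-01 12:00:00"  # passed via the encoding dict
written = "hours since 2014-09-01T12:00:00"    # reported by ncdump

# Both strings name the same delta and the same reference instant.
print(split_units(requested) == split_units(written))
# The normalized form matches what was originally requested.
print(with_space_separator(written))
```

Both strings parse to the same reference instant, so the change looks cosmetic rather than a shift in the encoded values. If a downstream tool insists on the space-separated form, a string normalized this way could be written back to the `units` attribute with any tool that edits netCDF attributes; that step is not shown here.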

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/8137/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727
type: issue
