issues: 490618213

id: 490618213
node_id: MDU6SXNzdWU0OTA2MTgyMTM=
number: 3291
title: xr.DataSet.from_dataframe / xr.DataArray.from_series does not preserve DateTimeIndex with timezone
user: 923438
state: open
locked: 0
assignee:
milestone:
comments: 4
created_at: 2019-09-07T10:10:40Z
updated_at: 2021-04-21T21:00:41Z
closed_at:
author_association: NONE
active_lock_reason:
draft:
pull_request:

Problem Description

When Dataset.from_dataframe (or DataArray.from_series) converts a pandas DataFrame whose DatetimeIndex has a timezone, xarray turns the index into integer nanoseconds since the epoch with dtype object, rather than keeping it as a datetime index.

MCVE Code Sample

```python
# df is the reporter's DataFrame (not included in the report)
print(df.index)
# DatetimeIndex(['2000-01-03 16:00:00-05:00', '2000-01-03 16:00:00-05:00',
#                '2000-01-03 16:00:00-05:00', '2000-01-03 16:00:00-05:00',
#                ...
#                '2019-08-20 16:00:00-05:00', '2019-08-20 16:00:00-05:00'],
#               dtype='datetime64[ns, EST]', name='time', length=12713014, freq=None)
```

```python
import xarray as xr

ds = xr.Dataset.from_dataframe(df.head(1000))
print(ds['time'])
# <xarray.DataArray 'time' (time: 7)>
# array([946933200000000000, 947019600000000000, 947106000000000000,
#        947192400000000000, 947278800000000000, 947538000000000000,
#        947624400000000000, ...], dtype=object)
# Coordinates:
#   * time     (time) object 946933200000000000 ... 947624400000000000
```
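The integers in the output above look like nanoseconds since the Unix epoch, counted in UTC. The following sketch checks that reading using only the standard library; `ns_to_datetime` is a hypothetical helper written for this illustration and is not part of xarray or pandas.

```python
from datetime import datetime, timedelta, timezone


def ns_to_datetime(ns, utc_offset_hours=0):
    """Interpret `ns` as nanoseconds since the Unix epoch (UTC).

    Hypothetical helper for illustration only. The result is expressed in a
    fixed UTC offset; sub-microsecond digits are dropped because `datetime`
    has microsecond resolution.
    """
    seconds, remainder_ns = divmod(ns, 1_000_000_000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    moment += timedelta(microseconds=remainder_ns // 1000)
    return moment.astimezone(timezone(timedelta(hours=utc_offset_hours)))


# First value from the reported output, shown at the EST offset (-05:00).
first = ns_to_datetime(946933200000000000, utc_offset_hours=-5)
print(first.isoformat())  # 2000-01-03T16:00:00-05:00
```

Under this reading, the first integer corresponds to the first timestamp of the original index (2000-01-03 16:00 at UTC-05:00), which suggests that the instant is preserved while the timezone and the datetime dtype are lost.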

Expected Output

After removing the timezone from the DataFrame's DatetimeIndex, the conversion to a Dataset preserves the time index (without converting it to nanoseconds).

```python
# Convert to UTC, then drop the timezone so the index is tz-naive
df.index = df.index.tz_convert('UTC').tz_localize(None)
ds = xr.Dataset.from_dataframe(df.head(1000))
print(ds['time'])
# <xarray.DataArray 'time' (time: 7)>
# array(['2000-01-03T21:00:00.000000000', '2000-01-04T21:00:00.000000000',
#        '2000-01-05T21:00:00.000000000', '2000-01-06T21:00:00.000000000',
#        '2000-01-07T21:00:00.000000000', '2000-01-10T21:00:00.000000000',
#        '2000-01-11T21:00:00.000000000'], dtype='datetime64[ns]')
# Coordinates:
#   * time     (time) datetime64[ns] 2000-01-03T21:00:00 ... 2000-01-11T21:00:00
```
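The workaround above can be wrapped in a small helper that also returns the original timezone, so it can be reapplied later. This is a minimal sketch that assumes only pandas; `to_naive_utc` and the two-row example frame are hypothetical and do not come from the report.

```python
import pandas as pd


def to_naive_utc(df):
    """Return (frame, tz_name) where the frame's DatetimeIndex is tz-naive UTC.

    Hypothetical helper for illustration only. A frame whose index is
    already tz-naive is returned unchanged, with tz_name set to None.
    """
    tz = df.index.tz
    if tz is None:
        return df, None
    out = df.copy()
    # Same two calls as the workaround: shift to UTC, then drop the tz info.
    out.index = df.index.tz_convert("UTC").tz_localize(None)
    return out, str(tz)


# Small stand-in for the reporter's data (values are made up).
index = pd.DatetimeIndex(
    ["2000-01-03 16:00:00", "2000-01-04 16:00:00"], name="time"
).tz_localize("EST")
df = pd.DataFrame({"price": [1.0, 2.0]}, index=index)

naive_df, original_tz = to_naive_utc(df)
print(naive_df.index)
```

The tz-naive frame can then be passed to `xr.Dataset.from_dataframe` as in the workaround. Keeping the timezone name alongside it is one way to restore local times afterwards, for example with `tz_localize("UTC").tz_convert(original_tz)` on the pandas side.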

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 4.9.0-9-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.4
libnetcdf: None

xarray: 0.12.3+81.g41fecd86
pandas: 0.24.2
numpy: 1.16.2
scipy: 1.2.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 1.1.4
distributed: 1.26.0
matplotlib: 3.0.3
cartopy: None
seaborn: 0.9.0
numbagg: None
setuptools: 40.8.0
pip: 19.0.3
conda: 4.7.11
pytest: 4.3.1
IPython: 7.4.0
sphinx: 1.8.5

{
    "url": "https://api.github.com/repos/pydata/xarray/issues/3291/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
repo: 13221727
type: issue
