issues: 1764262668

id: 1764262668
node_id: I_kwDOAMm_X85pKIMM
number: 7928
title: .dt accessor returns int instead of float, resulting in misrepresentation of NaT values
user: 12184618
state: closed
locked: 0
assignee:
milestone:
comments: 12
created_at: 2023-06-19T21:46:54Z
updated_at: 2023-09-14T13:57:38Z
closed_at: 2023-09-14T13:57:38Z
author_association: NONE
active_lock_reason:
draft:
pull_request:
body:

What happened?

With the latest xarray (2023.5.0; this does not happen in version 2023.2.0), accessing .dt components such as .dt.year returns a strict int64 DataArray, so missing values (NaT) are misrepresented.

```python
In [2]: s = pd.to_datetime(pd.Series(['2021-12-01', pd.NaT]))

In [3]: s
Out[3]:
0   2021-12-01
1          NaT
dtype: datetime64[ns]

In [4]: s.dt.year
Out[4]:
0    2021.0
1       NaN
dtype: float64

In [5]: s.to_xarray()
Out[5]:
<xarray.DataArray (index: 2)>
array(['2021-12-01T00:00:00.000000000', 'NaT'], dtype='datetime64[ns]')
Coordinates:
  * index    (index) int64 0 1

In [6]: s.to_xarray().dt.year
Out[6]:
<xarray.DataArray 'year' (index: 2)>
array([ 2021, -9223372036854775808])
Coordinates:
  * index    (index) int64 0 1
```

Notice how:

1. The series and the generated DataArray are both correctly datetime64[ns].
2. The series' .dt.year accessor returns float64 to accommodate the missing value (it uses int32 when no value is missing).
3. The DataArray's .dt.year returns a negative integer instead of NaN.
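The negative integer in point 3 equals the smallest signed 64-bit integer, which is the value NumPy uses internally to store NaT. The following library-free sketch illustrates the promotion to float that the pandas accessor performs; `years_with_nan` is a hypothetical helper written for illustration and is not part of xarray or pandas.

```python
import math

# Smallest signed 64-bit integer; NumPy stores NaT as this value internally.
INT64_MIN = -2**63


def years_with_nan(raw_years):
    """Hypothetical helper: map the NaT sentinel to NaN and everything else to float."""
    return [math.nan if y == INT64_MIN else float(y) for y in raw_years]


# The raw values shown in Out[6] of the session above.
raw = [2021, -9223372036854775808]
promoted = years_with_nan(raw)
print(promoted)  # [2021.0, nan]
```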

Additionally, compare with the same snippet's output with xarray 2023.2.0:

```python3
In [3]: s.to_xarray().dt.year
Out[3]:
<xarray.DataArray 'year' (index: 2)>
array([2021., nan])
Coordinates:
  * index    (index) int64 0 1
```

What did you expect to happen?

The .dt accessor should return a float with missing values when needed.
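Until the accessor behaves this way, one possible workaround is to mask the result with the positions where the original datetimes are present. This is only a sketch, assuming pandas, NumPy, and xarray are installed; it is not a fix from the xarray maintainers. It relies on `DataArray.where` promoting integer data to float when it has to fill in NaN.

```python
import numpy as np
import pandas as pd

s = pd.to_datetime(pd.Series(["2021-12-01", pd.NaT]))
da = s.to_xarray()  # requires xarray to be installed

# Keep the year only where the source datetime is not NaT.
# where() fills the masked positions with NaN, which forces a float dtype.
year = da.dt.year.where(da.notnull())

print(year.dtype)
print(np.isnan(year.values).tolist())
```

The mask is harmless on xarray versions whose accessor already returns float, so the same line can be kept across versions.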

Minimal Complete Verifiable Example

```Python
import pandas as pd

s = pd.to_datetime(pd.Series(['2021-12-01', pd.NaT]))
s.to_xarray().dt.year
```

MCVE confirmation

  • [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [X] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [X] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0]
python-bits: 64
OS: Linux
OS-release: 5.19.0-1025-aws
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: None
libnetcdf: None
xarray: 2023.5.0
pandas: 2.0.2
numpy: 1.23.4
scipy: 1.10.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: None
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: None
dask: 2023.6.0
distributed: 2023.6.0
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2023.6.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 59.6.0
pip: 23.1.2
conda: None
pytest: None
mypy: None
IPython: 8.14.0
sphinx: None
reactions:
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/7928/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app:
state_reason: completed
repo: 13221727
type: issue
