issues: 1044693438


id: 1044693438
node_id: I_kwDOAMm_X84-RMG-
number: 5937
title: DataArray.dt.seconds returns incorrect value for negative `timedelta64[ns]`
user: 2405019
state: closed
locked: 0
assignee:
milestone:
comments: 4
created_at: 2021-11-04T12:05:24Z
updated_at: 2023-11-10T00:39:17Z
closed_at: 2023-11-10T00:39:17Z
author_association: CONTRIBUTOR
active_lock_reason:
draft:
pull_request:
body:

What happened:

For a negative timedelta64[ns] of 42 nanoseconds, DataArray.dt.seconds returned a non-zero value (86399). When I pass in a positive 42 nanosecond timedelta64[ns], the TimeDeltaAccessor correctly returns zero. I would have expected both assertions in the example below to pass, but the second fails. This seems to be a general issue with negative timedelta64[ns] values.

    <xarray.DataArray 'seconds' (dim_0: 1)>
    array([0])
    Dimensions without coordinates: dim_0
    <xarray.DataArray 'seconds' (dim_0: 1)>
    array([86399])
    Dimensions without coordinates: dim_0
    Traceback (most recent call last):
      File "bug_dt_seconds.py", line 15, in <module>
        assert da.dt.seconds == 0
    AssertionError

What you expected to happen:

    <xarray.DataArray 'seconds' (dim_0: 1)>
    array([0])
    Dimensions without coordinates: dim_0
    <xarray.DataArray 'seconds' (dim_0: 1)>
    array([0])
    Dimensions without coordinates: dim_0

Minimal Complete Verifiable Example:

```python
# coding: utf-8

import xarray as xr
import numpy as np

# number of nanoseconds
value = 42

# Positive timedelta: the seconds component is 0, as expected.
da = xr.DataArray([np.timedelta64(value, "ns")])
print(da.dt.seconds)
assert da.dt.seconds == 0

# Negative timedelta: the seconds component comes back as 86399.
da = xr.DataArray([np.timedelta64(-value, "ns")])
print(da.dt.seconds)
assert da.dt.seconds == 0
```

Anything else we need to know?:

I've narrowed this down to the call to pd.Series(values.ravel()) in xarray.core.accessor_dt._access_through_series:

    ipdb> pd.Series(values.ravel())
    0   -1 days +23:59:59.999999958
    dtype: timedelta64[ns]

I think the issue arises because pandas turns the numpy timedelta64 into a "minus one day plus a time". This actually does have a number of "seconds" in it, but the "total_seconds" has the expected value:

    ipdb> pd.Series(values.ravel()).dt.total_seconds()
    0   -4.200000e-08
    dtype: float64

Which would correctly round to zero.
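The "minus one day plus a time" representation does not seem to be specific to pandas; the standard library's datetime.timedelta normalizes negative durations the same way. The snippet below is only an illustration under that assumption. Because datetime.timedelta resolves microseconds rather than nanoseconds, it uses -42 microseconds as a stand-in for the -42 ns value above.

```python
from datetime import timedelta

# Stand-in for the -42 ns value in this issue: datetime.timedelta has no
# nanosecond resolution, so -42 microseconds is used instead (an assumption
# made purely for illustration).
delta = timedelta(microseconds=-42)

# Negative durations are stored as "negative days plus a non-negative time".
print(delta.days)          # -1
print(delta.seconds)       # 86399
print(delta.microseconds)  # 999958

# total_seconds() recombines the components into the signed duration,
# which rounds to zero as expected.
print(delta.total_seconds())
print(round(delta.total_seconds()))  # 0
```

So the `seconds` attribute is a component of the normalized representation, not the signed duration truncated to seconds.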

I don't think the issue is in pandas, although the output from pandas is counter-intuitive:

    ipdb> pd.Series(values.ravel()).dt.seconds
    0    86399
    dtype: int64

Maybe we should handle this as a special case by taking the absolute value before passing the values to pandas (and then applying the original sign again afterwards)?
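A rough sketch of that idea, written against plain integer nanosecond counts rather than xarray or pandas objects. The function name `signed_seconds_component` and the constants are hypothetical and not part of any library. For a NumPy timedelta64[ns] array, the same arithmetic could presumably be applied to the integers from `.astype("int64")`, with NaT needing separate handling.

```python
NS_PER_SECOND = 1_000_000_000
SECONDS_PER_DAY = 86_400


def signed_seconds_component(nanoseconds: int) -> int:
    """Return the seconds component of a duration, carrying the duration's sign.

    Hypothetical helper: takes the absolute value, extracts the seconds
    component (whole days stripped off), then reapplies the original sign.
    """
    sign = -1 if nanoseconds < 0 else 1
    magnitude = abs(nanoseconds)
    # Whole seconds in the magnitude, modulo one day.
    seconds = (magnitude // NS_PER_SECOND) % SECONDS_PER_DAY
    return sign * seconds


print(signed_seconds_component(42))                  # 0
print(signed_seconds_component(-42))                 # 0
print(signed_seconds_component(-5 * NS_PER_SECOND))  # -5
```

With this convention a negative duration yields a non-positive seconds component, which differs from what pandas returns today, so adopting it would be a behaviour change rather than a pure bug fix.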

Environment:

Output of `xr.show_versions()`

```
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.7 (default, May 6 2020, 04:59:01) [Clang 4.0.1 (tags/RELEASE_401/final)]
python-bits: 64
OS: Darwin
OS-release: 19.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_GB.UTF-8
LANG: None
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.10.4
libnetcdf: 4.6.2
xarray: 0.18.2
pandas: 1.3.4
numpy: 1.19.1
scipy: 1.5.0
netCDF4: 1.4.2
pydap: installed
h5netcdf: None
h5py: 2.9.0
Nio: None
zarr: 2.10.1
cftime: 1.5.1.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2021.09.1
distributed: 2021.09.1
matplotlib: 3.2.2
cartopy: 0.18.0
seaborn: 0.10.1
numbagg: None
fsspec: 2021.06.1
cupy: None
pint: 0.18
sparse: None
setuptools: 46.4.0.post20200518
pip: 21.1.2
conda: None
pytest: 6.0.1
IPython: 7.16.1
sphinx: None
```
{
    "url": "https://api.github.com/repos/pydata/xarray/issues/5937/reactions",
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app:
state_reason: completed
repo: 13221727
type: issue
