html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1662#issuecomment-340008407,https://api.github.com/repos/pydata/xarray/issues/1662,340008407,MDEyOklzc3VlQ29tbWVudDM0MDAwODQwNw==,1956032,2017-10-27T15:44:11Z,2017-10-27T15:44:11Z,CONTRIBUTOR,"Note that if xarray's `decode_cf` is given a `NaT` in a datetime64 array, it works:
```python
import numpy as np
import pandas as pd

attrs = {'units': 'days since 1950-01-01 00:00:00 UTC'}  # Classic Argo data Julian Day reference
jd = [24658.46875, 24658.46366898, 24658.47256944, np.nan]  # Sample

def dirtyfixNaNjd(ref, day):
    # Replace NaN Julian days with NaT, otherwise convert to a Timestamp
    td = pd.NaT
    if not np.isnan(day):
        td = pd.Timedelta(days=day)
    return pd.Timestamp(ref) + td

jd = [dirtyfixNaNjd('1950-01-01', day) for day in jd]
print(jd)
```
```python
[Timestamp('2017-07-06 11:15:00'), Timestamp('2017-07-06 11:07:40.999872'), Timestamp('2017-07-06 11:20:29.999616'), NaT]
```
then:
```python
import xarray as xr

ds = xr.Dataset({'time': ('time', jd, {'units': 'ns'})})  # Update the units attribute appropriately
ds = xr.decode_cf(ds)
print(ds['time'].values)
```
```python
['2017-07-06T11:15:00.000000000' '2017-07-06T11:07:40.999872000'
'2017-07-06T11:20:29.999616000' 'NaT']
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,268725471
https://github.com/pydata/xarray/issues/1662#issuecomment-339991529,https://api.github.com/repos/pydata/xarray/issues/1662,339991529,MDEyOklzc3VlQ29tbWVudDMzOTk5MTUyOQ==,1956032,2017-10-27T14:42:56Z,2017-10-27T14:42:56Z,CONTRIBUTOR,"Hi Ryan, I've never been far away, still following/promoting xarray around here, and congrats on Pangeo!
OK, I get that the datatype is wrong, but about the issue with pandas `TimedeltaIndex`:
Does this mean that a quick/dirty fix would be to decode value by value rather than on a vector?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,268725471
https://github.com/pydata/xarray/issues/1662#issuecomment-339730597,https://api.github.com/repos/pydata/xarray/issues/1662,339730597,MDEyOklzc3VlQ29tbWVudDMzOTczMDU5Nw==,1217238,2017-10-26T16:56:03Z,2017-10-26T16:56:03Z,MEMBER,"I'm pretty sure this used to work in some form. I definitely worked with a dataset in the infancy of xarray that had coordinates with missing times.
The current issue appears to be that pandas represents the `NaT` values as integers (the minimum int64), and then (predictably) suffers from numeric overflow:
```
In [8]: import pandas as pd
In [9]: pd.to_timedelta(['24658 days 11:15:00', 'NaT']) + pd.Timestamp('1950-01-01')
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
<ipython-input-9-...> in <module>()
----> 1 pd.to_timedelta(['24658 days 11:15:00', 'NaT']) + pd.Timestamp('1950-01-01')
~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/core/indexes/datetimelike.py in __add__(self, other)
658 return self.shift(other)
659 elif isinstance(other, (Timestamp, datetime)):
--> 660 return self._add_datelike(other)
661 else: # pragma: no cover
662 return NotImplemented
~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/core/indexes/timedeltas.py in _add_datelike(self, other)
354 other = Timestamp(other)
355 i8 = self.asi8
--> 356 result = checked_add_with_arr(i8, other.value)
357 result = self._maybe_mask_results(result, fill_value=iNaT)
358 return DatetimeIndex(result, name=self.name, copy=False)
~/conda/envs/xarray-py36/lib/python3.6/site-packages/pandas/core/algorithms.py in checked_add_with_arr(arr, b, arr_mask, b_mask)
889
890 if to_raise:
--> 891 raise OverflowError(""Overflow in int64 addition"")
892 return arr + b
893
OverflowError: Overflow in int64 addition
```
This appears to be specific to our use of a `TimedeltaIndex`: the overflow doesn't occur if you add the same values as scalars:
```
In [11]: pd.NaT + pd.Timestamp('1950-01-01')
Out[11]: NaT
In [12]: pd.Timedelta('24658 days 11:15:00') + pd.Timestamp('1950-01-01')
Out[12]: Timestamp('2017-07-06 11:15:00')
```
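For comparison, here is a minimal sketch (plain numpy/pandas, reusing the made-up values above, not necessarily what xarray should do internally) showing that the same addition done on the underlying numpy arrays propagates `NaT` instead of overflowing:
```python
import numpy as np
import pandas as pd

deltas = pd.to_timedelta(['24658 days 11:15:00', 'NaT'])
ref = np.datetime64('1950-01-01', 'ns')

# Adding at the numpy level: the NaT slot stays NaT instead of overflowing
result = deltas.values + ref
print(result)
# ['2017-07-06T11:15:00.000000000' 'NaT']
```
If that holds up, one possible fix would be to do the addition on the raw datetime64/timedelta64 values rather than going through a `TimedeltaIndex`.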
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,268725471
https://github.com/pydata/xarray/issues/1662#issuecomment-339646702,https://api.github.com/repos/pydata/xarray/issues/1662,339646702,MDEyOklzc3VlQ29tbWVudDMzOTY0NjcwMg==,1197350,2017-10-26T12:16:03Z,2017-10-26T12:16:03Z,MEMBER,"Hi Guillaume! Nice to see so many old friends showing up on the xarray repo...
The issue you raise is totally reasonable from a user perspective: missing values in datetime data should be permitted. But there are some upstream issues that make it challenging to solve (like most of our headaches related to datetime data).
In numpy (and computer arithmetic in general), NaN only exists in floating point datatypes. It is impossible to have a numpy datetime array with NaN in it:
```python
>>> import numpy as np
>>> a = np.array(['2010-01-01', '2010-01-02'], dtype='datetime64[ns]')
>>> a[0] = np.nan
ValueError: Could not convert object to NumPy datetime
```
The same error would be raised if `a` were an integer array; to get around that, xarray automatically casts integers with missing data to floats. But that approach obviously doesn't work with `datetime` dtypes.
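As a small sketch of that casting behaviour (made-up data, assuming the usual CF `_FillValue` decoding path):
```python
import numpy as np
import xarray as xr

# An integer variable with a fill value marking missing data
ds = xr.Dataset({'x': ('t', np.array([1, 2, -999], dtype='int32'),
                       {'_FillValue': -999})})
decoded = xr.decode_cf(ds)

print(decoded['x'].dtype)   # a float dtype, so the masked value can be NaN
print(decoded['x'].values)  # the -999 entry comes back as nan
```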
Further downstream, xarray relies on [netcdf4-python's](https://github.com/Unidata/netcdf4-python) [num2date function](http://unidata.github.io/netcdf4-python/#netCDF4.num2date) to decode the date. The error is raised by that package.
This is my understanding of the problem. Some other folks here like @jhamman and @spencerkclark might have ideas about how to solve it. They are working on a new package called [netcdftime](https://github.com/Unidata/netcdftime) which will isolate and hopefully enhance such time encoding / decoding functions.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,268725471