html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1132#issuecomment-262557502,https://api.github.com/repos/pydata/xarray/issues/1132,262557502,MDEyOklzc3VlQ29tbWVudDI2MjU1NzUwMg==,3404817,2016-11-23T16:06:46Z,2016-11-23T16:06:46Z,CONTRIBUTOR,"Great, `safe_cast_to_index` works nicely (it passes my test). I added the change to the existing PR.

Do we need to add more groupby tests to make sure the solution is safe for other cases (e.g. other data types)?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,190683531
https://github.com/pydata/xarray/issues/1132#issuecomment-262233644,https://api.github.com/repos/pydata/xarray/issues/1132,262233644,MDEyOklzc3VlQ29tbWVudDI2MjIzMzY0NA==,3404817,2016-11-22T12:56:03Z,2016-11-22T13:21:04Z,CONTRIBUTOR,"OK, here is the minimal example:

```python
import xarray as xr
import pandas as pd

def test_groupby_da_datetime():
    """"""groupby with a DataArray of dtype datetime""""""
    # create test data
    times = pd.date_range('2000-01-01', periods=4)
    foo = xr.DataArray([1,2,3,4], coords=dict(time=times), dims='time')

    # create test index
    dd = times.to_pydatetime()
    reference_dates = [dd[0], dd[2]]
    labels = reference_dates[0:1]*2 + reference_dates[1:2]*2
    ind = xr.DataArray(labels, coords=dict(time=times), dims='time', name='reference_date')

    # group foo by ind
    g = foo.groupby(ind)
    
    # check result
    actual = g.sum(dim='time')
    expected = xr.DataArray([3,7], coords=dict(reference_date=reference_dates), dims='reference_date')
    assert actual.to_dataset(name='foo').equals(expected.to_dataset(name='foo'))
```

While writing it, I found that the problem only occurs when the DataArray passed to `groupby` has **`dtype=datetime64[ns]`**.

The problem is that we effectively feed the DataArray to [`pd.factorize`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.factorize.html), which works for most data types. For `datetime64[ns]`, however, pandas uses the function [`needs_i8_conversion`](https://github.com/pandas-dev/pandas/blob/v0.19.1/pandas/types/common.py#L248-L251) to check whether the data has to be converted to `int64` first, and decides YES. Then [in `pd.factorize`](https://github.com/pandas-dev/pandas/blob/v0.19.1/pandas/core/algorithms.py#L295-L307) it fails, because it tries to access `DataArray.view` to do that conversion and DataArray has no such attribute.
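
As a rough illustration of that conversion step (my own sketch using only public NumPy and pandas calls, not the code from the PR): a plain NumPy `datetime64[ns]` array does have `.view`, so it can be reinterpreted as `int64` and factorized without trouble. The variable names below are made up for the example.

```python
import numpy as np
import pandas as pd

# Same four daily timestamps as in the test above.
times = pd.date_range('2000-01-01', periods=4)

# Plain NumPy labels: the first two points share one reference date,
# the last two share another.
labels = times.values[[0, 0, 2, 2]]

# A NumPy datetime64[ns] array can be viewed as int64 (nanoseconds since epoch).
as_int = labels.view('int64')

# Factorizing the NumPy array gives one integer code per label.
codes, uniques = pd.factorize(labels)

print(labels.dtype, as_int.dtype)
print(codes, len(uniques))
```

This is why passing only the NumPy values to `pd.factorize` sidesteps the missing `.view` on the DataArray.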

So as I see it there are three possible solutions to this:
1. Make Pandas' `pd.factorize` handle our `datetime` DataArrays better,
2. Add an attribute `.view` to DataArrays, or
3. Use the solution in the above PR, which means feeding only the NumPy `.values` to `pd.factorize`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,190683531