html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1132#issuecomment-268021977,https://api.github.com/repos/pydata/xarray/issues/1132,268021977,MDEyOklzc3VlQ29tbWVudDI2ODAyMTk3Nw==,900941,2016-12-19T17:15:17Z,2016-12-19T17:15:17Z,CONTRIBUTOR,"Thanks, I'll try to use the github version.
Cheers","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,190683531
https://github.com/pydata/xarray/issues/1132#issuecomment-268021188,https://api.github.com/repos/pydata/xarray/issues/1132,268021188,MDEyOklzc3VlQ29tbWVudDI2ODAyMTE4OA==,1217238,2016-12-19T17:12:16Z,2016-12-19T17:12:16Z,MEMBER,"@guziy This should be fixed by #1133, which will be part of the next release.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,190683531
https://github.com/pydata/xarray/issues/1132#issuecomment-268020632,https://api.github.com/repos/pydata/xarray/issues/1132,268020632,MDEyOklzc3VlQ29tbWVudDI2ODAyMDYzMg==,900941,2016-12-19T17:10:09Z,2016-12-19T17:10:09Z,CONTRIBUTOR,"Hi:
Here is an example that worked until recently...
https://github.com/guziy/PyNotebooks/blob/master/xarray/test_grouping.ipynb
Of course there is a resample function, but sometimes you want grouping...
Are there any plans to allow passing a function as the key to `groupby`?
Cheers
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,190683531
https://github.com/pydata/xarray/issues/1132#issuecomment-262557502,https://api.github.com/repos/pydata/xarray/issues/1132,262557502,MDEyOklzc3VlQ29tbWVudDI2MjU1NzUwMg==,3404817,2016-11-23T16:06:46Z,2016-11-23T16:06:46Z,CONTRIBUTOR,"Great, `safe_cast_to_index` works nicely (it passes my test). I added the change to the existing PR.
Do we need to add more groupby tests to make sure the solution is safe for other cases (e.g. other data types)?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,190683531
https://github.com/pydata/xarray/issues/1132#issuecomment-262286432,https://api.github.com/repos/pydata/xarray/issues/1132,262286432,MDEyOklzc3VlQ29tbWVudDI2MjI4NjQzMg==,1217238,2016-11-22T16:17:51Z,2016-11-22T16:17:51Z,MEMBER,"Thanks for looking into this!
Based on how `factorize` works (with specialized handling for pandas dtypes), I think the most robust behavior would be to pass in a `pandas.Index`. For example, this will work better if someone uses a `pandas.PeriodIndex`. So I would suggest wrapping arrays with [`safe_cast_to_index`](https://github.com/pydata/xarray/blob/76726e58710122e302790a05ae092c30d41d3553/xarray/core/utils.py#L40) before passing them to `pd.factorize`.
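
A minimal sketch of that idea using plain pandas. Here `pd.Index(...)` only stands in for `safe_cast_to_index` (an assumption for illustration; the xarray helper handles more cases than this):

```python
import numpy as np
import pandas as pd

# datetime64[ns] labels, similar to the values of the DataArray used for grouping
values = np.array(
    ['2000-01-01', '2000-01-01', '2000-01-03', '2000-01-03'],
    dtype='datetime64[ns]',
)

# wrap in a pandas.Index so factorize can apply its dtype-specific handling
index = pd.Index(values)
codes, uniques = pd.factorize(index)
```

`codes` holds one integer label per element and `uniques` holds the distinct dates, which is the information a groupby needs.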
I'm not sure if we want a `.view` attribute on DataArrays, but in any case it's not clear that would even fix the issue here -- pandas probably needs to coerce to numpy arrays internally in factorize eventually anyway.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,190683531
https://github.com/pydata/xarray/issues/1132#issuecomment-262233644,https://api.github.com/repos/pydata/xarray/issues/1132,262233644,MDEyOklzc3VlQ29tbWVudDI2MjIzMzY0NA==,3404817,2016-11-22T12:56:03Z,2016-11-22T13:21:04Z,CONTRIBUTOR,"OK, here is the minimal example:
```python
import xarray as xr
import pandas as pd
def test_groupby_da_datetime():
    """"""groupby with a DataArray of dtype datetime""""""
    # create test data
    times = pd.date_range('2000-01-01', periods=4)
    foo = xr.DataArray([1, 2, 3, 4], coords=dict(time=times), dims='time')
    # create test index
    dd = times.to_pydatetime()
    reference_dates = [dd[0], dd[2]]
    labels = reference_dates[0:1]*2 + reference_dates[1:2]*2
    ind = xr.DataArray(labels, coords=dict(time=times), dims='time', name='reference_date')
    # group foo by ind
    g = foo.groupby(ind)
    # check result
    actual = g.sum(dim='time')
    expected = xr.DataArray([3, 7], coords=dict(reference_date=reference_dates), dims='reference_date')
    assert actual.to_dataset(name='foo').equals(expected.to_dataset(name='foo'))
```
Making that, I found out that the problem only occurs when the DataArray used with `groupby` has **`dtype=datetime64[ns]`**.
The problem is that we effectively feed the DataArray to [`pd.factorize`](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.factorize.html) and that goes well for most data types: Pandas checks with the function [`needs_i8_conversion`](https://github.com/pandas-dev/pandas/blob/v0.19.1/pandas/types/common.py#L248-L251) whether it can factorize the DataArray and decides YES for our `datetime64[ns]`. But then [in `pd.factorize`](https://github.com/pandas-dev/pandas/blob/v0.19.1/pandas/core/algorithms.py#L295-L307) it fails because it tries to access `DataArray.view` to convert to `int64`.
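
To illustrate the mechanism, here is a rough sketch in plain NumPy (not the actual pandas code path). A `datetime64[ns]` array can be reinterpreted as `int64` through `.view`, while an array-like object without that method raises `AttributeError`. The `Wrapper` class is hypothetical and only stands in for an object that lacks `.view`:

```python
import numpy as np

# datetime64[ns] values are stored as int64 nanoseconds since the epoch
times = np.array(['2000-01-01', '2000-01-03'], dtype='datetime64[ns]')
as_int = times.view('int64')


class Wrapper:
    # hypothetical array-like: exposes values and dtype, but no .view method
    def __init__(self, values):
        self.values = values
        self.dtype = values.dtype


wrapped = Wrapper(times)
try:
    wrapped.view('int64')
    has_view = True
except AttributeError:
    has_view = False

# going through the underlying NumPy array first avoids the problem
recovered = wrapped.values.view('int64')
```
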
So as I see it there are three possible solutions to this:
1. Make Pandas' `pd.factorize` handle our `datetime` DataArrays better,
2. Add an attribute `.view` to DataArrays, or
3. Use the solution in the above PR, which means feeding only the NumPy `.values` to `pd.factorize`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,190683531
https://github.com/pydata/xarray/issues/1132#issuecomment-261979024,https://api.github.com/repos/pydata/xarray/issues/1132,261979024,MDEyOklzc3VlQ29tbWVudDI2MTk3OTAyNA==,1217238,2016-11-21T15:58:29Z,2016-11-21T15:58:29Z,MEMBER,"This looks like a plausible fix, but what would be really helpful is a minimal, complete example that triggers the error. That should help clarify the issue and at the least, we will need that for a test case in your pull request.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,190683531