issues: 354298235
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
354298235 | MDU6SXNzdWUzNTQyOTgyMzU= | 2383 | groupby().apply() on variable with NaNs raises IndexError | 13662783 | closed | 0 | 1 | 2018-08-27T12:27:06Z | 2019-10-28T23:46:41Z | 2019-10-28T23:46:41Z | CONTRIBUTOR |
Code Sample

```python
import xarray as xr
import numpy as np

def standardize(x):
    return (x - x.mean()) / x.std()

ds = xr.Dataset()
ds["variable"] = xr.DataArray(
    np.random.rand(4,3,5),
    {"lat":np.arange(4), "lon":np.arange(3), "time":np.arange(5)},
    ("lat", "lon", "time"),
)
ds["id"] = xr.DataArray(
    np.arange(12.0).reshape((4,3)),
    {"lat": np.arange(4), "lon":np.arange(3)},
    ("lat", "lon"),
)
ds["id"].values[0,0] = np.nan

ds.groupby("id").apply(standardize)
```

Problem description

This results in an IndexError. This is mildly confusing, it took me a little while to figure out the NaN's were to blame. I'm guessing the NaN doesn't get filtered out everywhere.

The traceback:

```
IndexError                                Traceback (most recent call last)
<ipython-input-2-267ba57bc264> in <module>()
     15 ds["id"].values[0,0] = np.nan
     16
---> 17 ds.groupby("id").apply(standardize)

C:\Miniconda3\envs\main\lib\site-packages\xarray\core\groupby.py in apply(self, func, kwargs)
    607         kwargs.pop('shortcut', None)  # ignore shortcut if set (for now)
    608         applied = (func(ds, kwargs) for ds in self._iter_grouped())
--> 609         return self._combine(applied)
    610
    611     def _combine(self, applied):

C:\Miniconda3\envs\main\lib\site-packages\xarray\core\groupby.py in _combine(self, applied)
    614         coord, dim, positions = self._infer_concat_args(applied_example)
    615         combined = concat(applied, dim)
--> 616         combined = _maybe_reorder(combined, dim, positions)
    617         if coord is not None:
    618             combined[coord.name] = coord

C:\Miniconda3\envs\main\lib\site-packages\xarray\core\groupby.py in _maybe_reorder(xarray_obj, dim, positions)
    428
    429 def _maybe_reorder(xarray_obj, dim, positions):
--> 430     order = _inverse_permutation_indices(positions)
    431
    432     if order is None:

C:\Miniconda3\envs\main\lib\site-packages\xarray\core\groupby.py in _inverse_permutation_indices(positions)
    109         positions = [np.arange(sl.start, sl.stop, sl.step) for sl in positions]
    110
--> 111     indices = nputils.inverse_permutation(np.concatenate(positions))
    112     return indices
    113

C:\Miniconda3\envs\main\lib\site-packages\xarray\core\nputils.py in inverse_permutation(indices)
     52     # use intp instead of int64 because of windows :(
     53     inverse_permutation = np.empty(len(indices), dtype=np.intp)
---> 54     inverse_permutation[indices] = np.arange(len(indices), dtype=np.intp)
     55     return inverse_permutation
     56

IndexError: index 11 is out of bounds for axis 0 with size 11
```

Expected Output

My assumption was that it would throw out the values that fall within the NaN group, like

```python
import pandas as pd
import numpy as np

df = pd.DataFrame()
df["var"] = np.random.rand(10)
df["id"] = np.arange(10)
df["id"].iloc[0:2] = np.nan
df.groupby("id").mean()
```

Out:

Output of
|
{ "url": "https://api.github.com/repos/pydata/xarray/issues/2383/reactions", "total_count": 1, "+1": 1, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
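
The last traceback frame can be reproduced with NumPy alone. The sketch below is not xarray's code path. It copies the two `inverse_permutation` lines quoted in the traceback into a standalone function and assumes, as the report guesses, that the position belonging to the NaN group is dropped before the remaining positions are concatenated. The variable names `full` and `without_nan` are illustrative only.

```python
import numpy as np


def inverse_permutation(indices):
    # Mirrors the two lines of nputils.inverse_permutation shown in the traceback.
    inverse = np.empty(len(indices), dtype=np.intp)
    inverse[indices] = np.arange(len(indices), dtype=np.intp)
    return inverse


# All 12 stacked positions present: a valid permutation of 0..11.
full = np.arange(12)
print(inverse_permutation(full))

# Assumption: the NaN group's position (0) is missing, so only 11 positions
# remain, but they still refer to indices up to 11.
without_nan = np.arange(1, 12)
try:
    inverse_permutation(without_nan)
except IndexError as err:
    # The output array has 11 slots, so writing to index 11 fails.
    print(type(err).__name__, err)
```

If that assumption holds, the error comes from the size mismatch between the filtered positions and the indices they contain, which would be consistent with the "index 11 ... size 11" message in the report.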