html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue https://github.com/pydata/xarray/issues/1287#issuecomment-612456493,https://api.github.com/repos/pydata/xarray/issues/1287,612456493,MDEyOklzc3VlQ29tbWVudDYxMjQ1NjQ5Mw==,2448579,2020-04-11T16:19:49Z,2020-04-11T16:19:49Z,MEMBER,Please reopen with a reproducible example if this is still an issue.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,210651795 https://github.com/pydata/xarray/issues/1287#issuecomment-283132798,https://api.github.com/repos/pydata/xarray/issues/1287,283132798,MDEyOklzc3VlQ29tbWVudDI4MzEzMjc5OA==,7926249,2017-02-28T19:07:56Z,2017-02-28T19:10:18Z,NONE,"Looking back, I see that I previously would generally call dropna('time') before groupby. Without dropna I get the same error with xarray 0.8.2 and either h5py 2.5.0 or 2.6.0. The following give the same result: ```Python x.EBeam_ebeamPhotonEnergy.dropna('time').groupby('step').mean() x.EBeam_ebeamPhotonEnergy.load().groupby('step').mean() ``` Note that even when I call dropna on the dataset and then save and reload the data, I see the same behavior, so the issue does not seem to be simply a matter of handling NA values. Looking closer at the example I gave, I do see that our build is inconsistent in which h5py is used. It looks like h5netcdf is compiled using h5py-2.7.0rc2, but h5py imports as 2.6.0. So this might just be related to our build. ```Python In [21]: from h5netcdf import core In [22]: core._NC_PROPERTIES Out[22]: u'version=1|h5netcdfversion=0.3.1|hdf5libversion=1.8.18' In [24]: core.h5py.__version__ Out[24]: '2.6.0' ``` I see it noted that h5py 2.6.0 passes the h5netcdf tests, which other versions may not. I do not have a build conveniently available that uses h5py 2.6.0 consistently. 
If loading the data explicitly or implicitly should not be necessary, then I can try to get a consistent build with h5py 2.6.0, since that would be a convenient, although not strictly necessary, feature. Note that in an earlier release, where everything appears to use h5py 2.5.0, I get a similar error without an explicit load() of the data. Below is a slightly different example with h5py 2.5.0 that uses the where method instead of groupby. ```Python-traceback In [1]: from h5netcdf import core In [2]: core.h5py.__version__ Out[2]: '2.5.0' In [3]: core._NC_PROPERTIES Out[3]: u'version=1|h5netcdfversion=0.3.1|hdf5libversion=1.8.17' In [4]: import PyDataSource In [5]: x = PyDataSource.open_h5netcdf(exp='mfx11116',run=25, subfolder='temp') In [6]: x.step.where(abs(x.step-5)<2, drop=True) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () ----> 1 x.step.where(abs(x.step-5)<2, drop=True) /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/common.pyc in where(self, cond, other, drop) 572 for adim in np.nonzero(clipcond.values)])) 573 outcond = cond.isel(**clip) --> 574 outobj = self.sel(**outcond.indexes) 575 else: 576 outobj = self /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/coordinates.pyc in __getitem__(self, key) 245 def __getitem__(self, key): 246 if key in self: --> 247 return self._variables[key].to_index() 248 else: 249 raise KeyError(key) /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/variable.pyc in to_index(self) 1165 # basically free as pandas.Index objects are immutable 1166 assert self.ndim == 1 -> 1167 index = self._data_cached().array 1168 if isinstance(index, pd.MultiIndex): 1169 # set default names for multi-index unnamed levels so that 
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/variable.pyc in _data_cached(self) 1088 def _data_cached(self): 1089 if not isinstance(self._data, PandasIndexAdapter): -> 1090 self._data = PandasIndexAdapter(self._data) 1091 return self._data 1092 /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/indexing.pyc in __init__(self, array, dtype) 437 """""" 438 def __init__(self, array, dtype=None): --> 439 self.array = utils.safe_cast_to_index(array) 440 if dtype is None: 441 if isinstance(array, pd.PeriodIndex): /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/utils.pyc in safe_cast_to_index(array) 56 if hasattr(array, 'dtype') and array.dtype.kind == 'O': 57 kwargs['dtype'] = object ---> 58 index = pd.Index(np.asarray(array), **kwargs) 59 return index 60 /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order) 480 481 """""" --> 482 return array(a, dtype, copy=False, order=order) 483 484 def asanyarray(a, dtype=None, order=None): /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/indexing.pyc in __array__(self, dtype) 353 def __array__(self, dtype=None): 354 array = orthogonally_indexable(self.array) --> 355 return np.asarray(array[self.key], dtype=None) 356 357 def __getitem__(self, key): /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/conventions.pyc in __getitem__(self, key) 394 def __getitem__(self, key): 395 return decode_cf_datetime(self.array[key], units=self.units, --> 396 calendar=self.calendar) 397 398 /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/conventions.pyc in decode_cf_datetime(num_dates, units, calendar) 124 netCDF4.num2date 125 """""" --> 126 num_dates = 
np.asarray(num_dates, dtype=float) 127 flat_num_dates = num_dates.ravel() 128 if calendar is None: /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order) 480 481 """""" --> 482 return array(a, dtype, copy=False, order=order) 483 484 def asanyarray(a, dtype=None, order=None): /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/indexing.pyc in __array__(self, dtype) 353 def __array__(self, dtype=None): 354 array = orthogonally_indexable(self.array) --> 355 return np.asarray(array[self.key], dtype=None) 356 357 def __getitem__(self, key): /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/utils.pyc in __getitem__(self, key) 393 394 def __getitem__(self, key): --> 395 return self.array[key] 396 397 def __repr__(self): /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5netcdf/core.pyc in __getitem__(self, key) 96 97 def __getitem__(self, key): ---> 98 return self._h5ds[key] 99 100 def __setitem__(self, key, value): h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/h5py-2.5.0_1474173302699/work/h5py-2.5.0/h5py/_objects.c:3032)() h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/h5py-2.5.0_1474173302699/work/h5py-2.5.0/h5py/_objects.c:2990)() /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5py/_hl/dataset.pyc in __getitem__(self, args) 429 430 # Perform the dataspace selection. 
--> 431 selection = sel.select(self.shape, args, dsid=self.id) 432 433 if selection.nselect == 0: /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5py/_hl/selections.pyc in select(shape, args, dsid) 77 elif isinstance(arg, np.ndarray): 78 sel = PointSelection(shape) ---> 79 sel[arg] 80 return sel 81 /reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5py/_hl/selections.pyc in __getitem__(self, arg) 215 """""" Perform point-wise selection from a NumPy boolean array """""" 216 if not (isinstance(arg, np.ndarray) and arg.dtype.kind == 'b'): --> 217 raise TypeError(""PointSelection __getitem__ only works with bool arrays"") 218 if not arg.shape == self.shape: 219 raise TypeError(""Boolean indexing array has incompatible shape"") TypeError: PointSelection __getitem__ only works with bool arrays In [7]: x.load().step.where(abs(x.step-5)<2, drop=True) Out[7]: array([ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., ```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,210651795 https://github.com/pydata/xarray/issues/1287#issuecomment-282952515,https://api.github.com/repos/pydata/xarray/issues/1287,282952515,MDEyOklzc3VlQ29tbWVudDI4Mjk1MjUxNQ==,1217238,2017-02-28T06:16:07Z,2017-02-28T06:16:07Z,MEMBER,Hmm. Is this using the exact same version of h5py?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,210651795