html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1287#issuecomment-612456493,https://api.github.com/repos/pydata/xarray/issues/1287,612456493,MDEyOklzc3VlQ29tbWVudDYxMjQ1NjQ5Mw==,2448579,2020-04-11T16:19:49Z,2020-04-11T16:19:49Z,MEMBER,Please reopen with a reproducible example if this is still an issue.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,210651795
https://github.com/pydata/xarray/issues/1287#issuecomment-283132798,https://api.github.com/repos/pydata/xarray/issues/1287,283132798,MDEyOklzc3VlQ29tbWVudDI4MzEzMjc5OA==,7926249,2017-02-28T19:07:56Z,2017-02-28T19:10:18Z,NONE,"Looking back, I see that I would previously call dropna('time') before groupby. Without dropna, I get the same error with xarray 0.8.2 and either h5py 2.5.0 or 2.6.0.
The following give the same result:
```Python
x.EBeam_ebeamPhotonEnergy.dropna('time').groupby('step').mean()
x.EBeam_ebeamPhotonEnergy.load().groupby('step').mean()
```
Note that even when I call dropna on the dataset and then save and reload the data, I see the same behavior, so the issue does not seem to be simply about handling NA values.
Looking closer at the example I gave, I do see that our build is inconsistent about which h5py version is used. It looks like h5netcdf is compiled using h5py-2.7.0rc2, but h5py imports as 2.6.0. So this might just be related to our build.
```Python
In [21]: from h5netcdf import core
In [22]: core._NC_PROPERTIES
Out[22]: u'version=1|h5netcdfversion=0.3.1|hdf5libversion=1.8.18'
In [24]: core.h5py.__version__
Out[24]: '2.6.0'
```
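For anyone checking a similar build, the version details above can be gathered in one place. This is only a minimal sketch: `parse_nc_properties` is a hypothetical helper (not part of h5netcdf), the properties string is copied from the session above, and h5py is treated as optional so the snippet still runs without it.
```Python
# Hypothetical helper for illustration; h5netcdf does not provide this.
def parse_nc_properties(props):
    '''Split a 'key=value|key=value' string into a dict.'''
    return dict(item.split('=', 1) for item in props.split('|') if item)

# String copied from the core._NC_PROPERTIES output shown above.
props = parse_nc_properties('version=1|h5netcdfversion=0.3.1|hdf5libversion=1.8.18')
print('HDF5 reported by h5netcdf: {}'.format(props['hdf5libversion']))

try:
    import h5py
except ImportError:
    h5py = None  # h5py is optional in this sketch

if h5py is not None:
    # Show which h5py is actually imported, the HDF5 it was built against,
    # and where it lives on disk, to help spot a mixed installation.
    print('h5py {} (HDF5 {}) from {}'.format(
        h5py.__version__, h5py.version.hdf5_version, h5py.__file__))
```
The file path is printed because two h5py builds in one environment would most likely show up as an unexpected install location.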
I see it noted that h5netcdf's tests pass with h5py 2.6.0, which may not be true of other versions. I do not have a build conveniently available that uses h5py 2.6.0 consistently. If it is believed that explicitly or implicitly loading the data should not be necessary, then I can try getting a consistent build with h5py 2.6.0, since working without a load would be a convenient, although not strictly necessary, feature.
Note that in an earlier release, where everything appears to use h5py 2.5.0, I get a similar error without an explicit load() of the data. Below is a slightly different example with h5py 2.5.0 that uses the where method instead of groupby.
```Python-traceback
In [1]: from h5netcdf import core
In [2]: core.h5py.__version__
Out[2]: '2.5.0'
In [3]: core._NC_PROPERTIES
Out[3]: u'version=1|h5netcdfversion=0.3.1|hdf5libversion=1.8.17'
In [4]: import PyDataSource
In [5]: x = PyDataSource.open_h5netcdf(exp='mfx11116',run=25, subfolder='temp')
In [6]: x.step.where(abs(x.step-5)<2, drop=True)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in ()
----> 1 x.step.where(abs(x.step-5)<2, drop=True)
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/common.pyc in where(self, cond, other, drop)
572 for adim in np.nonzero(clipcond.values)]))
573 outcond = cond.isel(**clip)
--> 574 outobj = self.sel(**outcond.indexes)
575 else:
576 outobj = self
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/coordinates.pyc in __getitem__(self, key)
245 def __getitem__(self, key):
246 if key in self:
--> 247 return self._variables[key].to_index()
248 else:
249 raise KeyError(key)
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/variable.pyc in to_index(self)
1165 # basically free as pandas.Index objects are immutable
1166 assert self.ndim == 1
-> 1167 index = self._data_cached().array
1168 if isinstance(index, pd.MultiIndex):
1169 # set default names for multi-index unnamed levels so that
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/variable.pyc in _data_cached(self)
1088 def _data_cached(self):
1089 if not isinstance(self._data, PandasIndexAdapter):
-> 1090 self._data = PandasIndexAdapter(self._data)
1091 return self._data
1092
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/indexing.pyc in __init__(self, array, dtype)
437 """"""
438 def __init__(self, array, dtype=None):
--> 439 self.array = utils.safe_cast_to_index(array)
440 if dtype is None:
441 if isinstance(array, pd.PeriodIndex):
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/utils.pyc in safe_cast_to_index(array)
56 if hasattr(array, 'dtype') and array.dtype.kind == 'O':
57 kwargs['dtype'] = object
---> 58 index = pd.Index(np.asarray(array), **kwargs)
59 return index
60
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
480
481 """"""
--> 482 return array(a, dtype, copy=False, order=order)
483
484 def asanyarray(a, dtype=None, order=None):
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/indexing.pyc in __array__(self, dtype)
353 def __array__(self, dtype=None):
354 array = orthogonally_indexable(self.array)
--> 355 return np.asarray(array[self.key], dtype=None)
356
357 def __getitem__(self, key):
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/conventions.pyc in __getitem__(self, key)
394 def __getitem__(self, key):
395 return decode_cf_datetime(self.array[key], units=self.units,
--> 396 calendar=self.calendar)
397
398
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/conventions.pyc in decode_cf_datetime(num_dates, units, calendar)
124 netCDF4.num2date
125 """"""
--> 126 num_dates = np.asarray(num_dates, dtype=float)
127 flat_num_dates = num_dates.ravel()
128 if calendar is None:
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
480
481 """"""
--> 482 return array(a, dtype, copy=False, order=order)
483
484 def asanyarray(a, dtype=None, order=None):
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/indexing.pyc in __array__(self, dtype)
353 def __array__(self, dtype=None):
354 array = orthogonally_indexable(self.array)
--> 355 return np.asarray(array[self.key], dtype=None)
356
357 def __getitem__(self, key):
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/utils.pyc in __getitem__(self, key)
393
394 def __getitem__(self, key):
--> 395 return self.array[key]
396
397 def __repr__(self):
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5netcdf/core.pyc in __getitem__(self, key)
96
97 def __getitem__(self, key):
---> 98 return self._h5ds[key]
99
100 def __setitem__(self, key, value):
h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/h5py-2.5.0_1474173302699/work/h5py-2.5.0/h5py/_objects.c:3032)()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/h5py-2.5.0_1474173302699/work/h5py-2.5.0/h5py/_objects.c:2990)()
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5py/_hl/dataset.pyc in __getitem__(self, args)
429
430 # Perform the dataspace selection.
--> 431 selection = sel.select(self.shape, args, dsid=self.id)
432
433 if selection.nselect == 0:
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5py/_hl/selections.pyc in select(shape, args, dsid)
77 elif isinstance(arg, np.ndarray):
78 sel = PointSelection(shape)
---> 79 sel[arg]
80 return sel
81
/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5py/_hl/selections.pyc in __getitem__(self, arg)
215 """""" Perform point-wise selection from a NumPy boolean array """"""
216 if not (isinstance(arg, np.ndarray) and arg.dtype.kind == 'b'):
--> 217 raise TypeError(""PointSelection __getitem__ only works with bool arrays"")
218 if not arg.shape == self.shape:
219 raise TypeError(""Boolean indexing array has incompatible shape"")
TypeError: PointSelection __getitem__ only works with bool arrays
In [7]: x.load().step.where(abs(x.step-5)<2, drop=True)
Out[7]:
array([ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.,
```","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,210651795
https://github.com/pydata/xarray/issues/1287#issuecomment-282952515,https://api.github.com/repos/pydata/xarray/issues/1287,282952515,MDEyOklzc3VlQ29tbWVudDI4Mjk1MjUxNQ==,1217238,2017-02-28T06:16:07Z,2017-02-28T06:16:07Z,MEMBER,Hmm. Is this using the exact same version of h5py?,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,210651795