issue_comments
1 row where author_association = "NONE" and issue = 210651795 sorted by updated_at descending
issue 1
- Groupby method on larger files fails unless explicitly load data in 0.9.1 · 1 ✖
id | html_url | issue_url | node_id | user | created_at | updated_at ▲ | author_association | body | reactions | performed_via_github_app | issue |
---|---|---|---|---|---|---|---|---|---|---|---|
283132798 | https://github.com/pydata/xarray/issues/1287#issuecomment-283132798 | https://api.github.com/repos/pydata/xarray/issues/1287 | MDEyOklzc3VlQ29tbWVudDI4MzEzMjc5OA== | koglin 7926249 | 2017-02-28T19:07:56Z | 2017-02-28T19:10:18Z | NONE | Looking back I see that before I would generally use dropna('time') before groupby. Without dropna I get the same error with xarray 0.8.2 and either h5py 2.5.0 or 2.6.0. The following give the same result:
Note that even when I call dropna on the dataset and then save and reload the data, I see the same behavior, so the issue does not seem to be simply the handling of NA values.

Looking closer at the example I gave, I do see that our build is inconsistent in which h5py it uses. It looks like h5netcdf is compiled using h5py-2.7.0rc2, but h5py imports as 2.6.0. So this might just be related to our build.

```Python
In [21]: from h5netcdf import core

In [22]: core._NC_PROPERTIES
Out[22]: u'version=1|h5netcdfversion=0.3.1|hdf5libversion=1.8.18'

In [24]: core.h5py.__version__
Out[24]: '2.6.0'
```

I see it noted that h5py 2.6.0 passes tests in h5netcdf, which other versions may not. I do not have a build that consistently uses h5py 2.6.0 conveniently available. If it is believed that explicitly or implicitly loading data should not be necessary, then I can try getting a consistent build with h5py 2.6.0, since this would be a convenient, although not strictly necessary, feature.

Note that in an earlier release where everything looks to be h5py 2.5.0, I get a similar error without an explicit load() of the data. Below is a slightly different example with h5py 2.5.0, using the where method instead of groupby.
```Python-traceback
In [1]: from h5netcdf import core

In [2]: core.h5py.__version__
Out[2]: '2.5.0'

In [3]: core._NC_PROPERTIES
Out[3]: u'version=1|h5netcdfversion=0.3.1|hdf5libversion=1.8.17'

In [4]: import PyDataSource

In [5]: x = PyDataSource.open_h5netcdf(exp='mfx11116',run=25, subfolder='temp')

In [6]: x.step.where(abs(x.step-5)<2, drop=True)
TypeError                                 Traceback (most recent call last)
<ipython-input-6-f13d4f4e92e7> in <module>()
----> 1 x.step.where(abs(x.step-5)<2, drop=True)

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/common.pyc in where(self, cond, other, drop)
    572                 for adim in np.nonzero(clipcond.values)]))
    573             outcond = cond.isel(**clip)
--> 574             outobj = self.sel(**outcond.indexes)
    575         else:
    576             outobj = self

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/coordinates.pyc in __getitem__(self, key)
    245     def __getitem__(self, key):
    246         if key in self:
--> 247             return self._variables[key].to_index()
    248         else:
    249             raise KeyError(key)

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/variable.pyc in to_index(self)
   1165         # basically free as pandas.Index objects are immutable
   1166         assert self.ndim == 1
-> 1167         index = self._data_cached().array
   1168         if isinstance(index, pd.MultiIndex):
   1169             # set default names for multi-index unnamed levels so that

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/variable.pyc in _data_cached(self)
   1088     def _data_cached(self):
   1089         if not isinstance(self._data, PandasIndexAdapter):
-> 1090             self._data = PandasIndexAdapter(self._data)
   1091         return self._data
   1092

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/indexing.pyc in __init__(self, array, dtype)
    437     """
    438     def __init__(self, array, dtype=None):
--> 439         self.array = utils.safe_cast_to_index(array)
    440         if dtype is None:
    441             if isinstance(array, pd.PeriodIndex):

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/utils.pyc in safe_cast_to_index(array)
     56     if hasattr(array, 'dtype') and array.dtype.kind == 'O':
     57         kwargs['dtype'] = object
---> 58     index = pd.Index(np.asarray(array), **kwargs)
     59     return index
     60

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
    480
    481     """
--> 482     return array(a, dtype, copy=False, order=order)
    483
    484 def asanyarray(a, dtype=None, order=None):

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/indexing.pyc in __array__(self, dtype)
    353     def __array__(self, dtype=None):
    354         array = orthogonally_indexable(self.array)
--> 355         return np.asarray(array[self.key], dtype=None)
    356
    357     def __getitem__(self, key):

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/conventions.pyc in __getitem__(self, key)
    394     def __getitem__(self, key):
    395         return decode_cf_datetime(self.array[key], units=self.units,
--> 396                                   calendar=self.calendar)
    397
    398

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/conventions.pyc in decode_cf_datetime(num_dates, units, calendar)
    124     netCDF4.num2date
    125     """
--> 126     num_dates = np.asarray(num_dates, dtype=float)
    127     flat_num_dates = num_dates.ravel()
    128     if calendar is None:

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
    480
    481     """
--> 482     return array(a, dtype, copy=False, order=order)
    483
    484 def asanyarray(a, dtype=None, order=None):

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/indexing.pyc in __array__(self, dtype)
    353     def __array__(self, dtype=None):
    354         array = orthogonally_indexable(self.array)
--> 355         return np.asarray(array[self.key], dtype=None)
    356
    357     def __getitem__(self, key):

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/xarray/core/utils.pyc in __getitem__(self, key)
    393
    394     def __getitem__(self, key):
--> 395         return self.array[key]
    396
    397     def __repr__(self):

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5netcdf/core.pyc in __getitem__(self, key)
     96
     97     def __getitem__(self, key):
---> 98         return self._h5ds[key]
     99
    100     def __setitem__(self, key, value):

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/h5py-2.5.0_1474173302699/work/h5py-2.5.0/h5py/_objects.c:3032)()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper (/reg/g/psdm/sw/conda/inst/miniconda2-dev-rhel7/conda-bld/h5py-2.5.0_1474173302699/work/h5py-2.5.0/h5py/_objects.c:2990)()

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5py/_hl/dataset.pyc in __getitem__(self, args)
    429
    430         # Perform the dataspace selection.
--> 431         selection = sel.select(self.shape, args, dsid=self.id)
    432
    433         if selection.nselect == 0:

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5py/_hl/selections.pyc in select(shape, args, dsid)
     77         elif isinstance(arg, np.ndarray):
     78             sel = PointSelection(shape)
---> 79             sel[arg]
     80             return sel
     81

/reg/g/psdm/sw/conda/inst/miniconda2-prod-rhel7/envs/ana-1.0.8/lib/python2.7/site-packages/h5py/_hl/selections.pyc in __getitem__(self, arg)
    215         """ Perform point-wise selection from a NumPy boolean array """
    216         if not (isinstance(arg, np.ndarray) and arg.dtype.kind == 'b'):
--> 217             raise TypeError("PointSelection __getitem__ only works with bool arrays")
    218         if not arg.shape == self.shape:
    219             raise TypeError("Boolean indexing array has incompatible shape")

TypeError: PointSelection __getitem__ only works with bool arrays

In [7]: x.load().step.where(abs(x.step-5)<2, drop=True)
Out[7]:
<xarray.DataArray 'step' (time: 725)>
array([ 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4., 4.,
``` |
{ "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
Groupby method on larger files fails unless explicitly load data in 0.9.1 210651795 |
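The comment compares two environments by reading the `_NC_PROPERTIES` string that h5netcdf exposes. As a minimal sketch of that comparison, the snippet below splits the two strings quoted in the comment into dictionaries and reports the keys that differ. The helper name `parse_nc_properties` is hypothetical and is not part of h5netcdf; the `key=value|key=value` layout is assumed only from the two strings shown above.

```python
def parse_nc_properties(props):
    """Split a 'key=value|key=value' string into a dict.

    Hypothetical helper for illustration; not an h5netcdf function.
    """
    return dict(item.split("=", 1) for item in props.split("|") if item)


# The two strings reported in the comment (h5py 2.6.0 and 2.5.0 environments).
build_a = parse_nc_properties("version=1|h5netcdfversion=0.3.1|hdf5libversion=1.8.18")
build_b = parse_nc_properties("version=1|h5netcdfversion=0.3.1|hdf5libversion=1.8.17")

# Collect the keys whose values differ between the two environments.
differing = {
    key: (build_a[key], build_b.get(key))
    for key in build_a
    if build_a[key] != build_b.get(key)
}
print(differing)
```

For these two strings, only the HDF5 library version differs; the h5netcdf version is the same in both environments.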
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue] ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user] ON [issue_comments] ([user]);