
Issue #6613: Flox can't handle cftime objects

id: 1237552666 · node_id: I_kwDOAMm_X85Jw44a · user: 20629530 · state: closed (completed) · comments: 2 · author_association: CONTRIBUTOR
created_at: 2022-05-16T18:35:56Z · updated_at: 2022-06-02T23:23:20Z · closed_at: 2022-06-02T23:23:20Z · repo: pydata/xarray (13221727) · type: issue

What happened?

I use resampling to count the number of timesteps within time periods. The simplest way to do this is `da.time.resample(time='YS').count()`. With the current master, a non-standard calendar, and flox installed, this fails: flox can't handle the cftime objects of the time coordinate.

What did you expect to happen?

I expected the count of elements for each period to be returned.

Minimal Complete Verifiable Example

```Python
import xarray as xr

timeNP = xr.DataArray(xr.date_range('2009-01-01', '2012-12-31', use_cftime=False), dims=('time',), name='time')

timeCF = xr.DataArray(xr.date_range('2009-01-01', '2012-12-31', use_cftime=True), dims=('time',), name='time')

timeNP.resample(time='YS').count()  # works

timeCF.resample(time='YS').count()  # fails
```

MVCE confirmation

  • [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • [x] Complete example — the example is self-contained, including all data and the text of any traceback.
  • [x] Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • [x] New issue — a search of GitHub Issues suggests this is not a duplicate.

Relevant log output

```Python

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 a.resample(time='YS').count()

File ~/Python/myxarray/xarray/core/_reductions.py:5456, in DataArrayResampleReductions.count(self, dim, keep_attrs, **kwargs)
   5401 """
   5402 Reduce this DataArray's data by applying count along some dimension(s).
   5403 (...)
   5453   * time     (time) datetime64[ns] 2001-01-31 2001-04-30 2001-07-31
   5454 """
   5455 if flox and OPTIONS["use_flox"] and contains_only_dask_or_numpy(self._obj):
-> 5456     return self._flox_reduce(
   5457         func="count",
   5458         dim=dim,
   5459         # fill_value=fill_value,
   5460         keep_attrs=keep_attrs,
   5461         **kwargs,
   5462     )
   5463 else:
   5464     return self.reduce(
   5465         duck_array_ops.count,
   5466         dim=dim,
   5467         keep_attrs=keep_attrs,
   5468         **kwargs,
   5469     )

File ~/Python/myxarray/xarray/core/resample.py:44, in Resample._flox_reduce(self, dim, **kwargs)
     41 labels = np.repeat(self._unique_coord.data, repeats)
     42 group = DataArray(labels, dims=(self._group_dim,), name=self._unique_coord.name)
---> 44 result = super()._flox_reduce(dim=dim, group=group, **kwargs)
     45 result = self._maybe_restore_empty_groups(result)
     46 result = result.rename({RESAMPLE_DIM: self._group_dim})

File ~/Python/myxarray/xarray/core/groupby.py:661, in GroupBy._flox_reduce(self, dim, **kwargs)
    658     expected_groups = (self._unique_coord.values,)
    659     isbin = False
--> 661 result = xarray_reduce(
    662     self._original_obj.drop_vars(non_numeric),
    663     group,
    664     dim=dim,
    665     expected_groups=expected_groups,
    666     isbin=isbin,
    667     **kwargs,
    668 )
    670 # Ignore error when the groupby reduction is effectively
    671 # a reduction of the underlying dataset
    672 result = result.drop_vars(unindexed_dims, errors="ignore")

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/xarray.py:308, in xarray_reduce(obj, func, expected_groups, isbin, sort, dim, split_out, fill_value, method, engine, keep_attrs, skipna, min_count, reindex, *by, **finalize_kwargs)
    305 input_core_dims = _get_input_core_dims(group_names, dim, ds, grouper_dims)
    306 input_core_dims += [input_core_dims[-1]] * (len(by) - 1)
--> 308 actual = xr.apply_ufunc(
    309     wrapper,
    310     ds.drop_vars(tuple(missing_dim)).transpose(..., *grouper_dims),
    311     *by,
    312     input_core_dims=input_core_dims,
    313     # for xarray's test_groupby_duplicate_coordinate_labels
    314     exclude_dims=set(dim),
    315     output_core_dims=[group_names],
    316     dask="allowed",
    317     dask_gufunc_kwargs=dict(output_sizes=group_sizes),
    318     keep_attrs=keep_attrs,
    319     kwargs={
    320         "func": func,
    321         "axis": axis,
    322         "sort": sort,
    323         "split_out": split_out,
    324         "fill_value": fill_value,
    325         "method": method,
    326         "min_count": min_count,
    327         "skipna": skipna,
    328         "engine": engine,
    329         "reindex": reindex,
    330         "expected_groups": tuple(expected_groups),
    331         "isbin": isbin,
    332         "finalize_kwargs": finalize_kwargs,
    333     },
    334 )
    336 # restore non-dim coord variables without the core dimension
    337 # TODO: shouldn't apply_ufunc handle this?
    338 for var in set(ds.variables) - set(ds.dims):

File ~/Python/myxarray/xarray/core/computation.py:1170, in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, meta, dask_gufunc_kwargs, *args)
   1168 # feed datasets apply_variable_ufunc through apply_dataset_vfunc
   1169 elif any(is_dict_like(a) for a in args):
-> 1170     return apply_dataset_vfunc(
   1171         variables_vfunc,
   1172         *args,
   1173         signature=signature,
   1174         join=join,
   1175         exclude_dims=exclude_dims,
   1176         dataset_join=dataset_join,
   1177         fill_value=dataset_fill_value,
   1178         keep_attrs=keep_attrs,
   1179     )
   1180 # feed DataArray apply_variable_ufunc through apply_dataarray_vfunc
   1181 elif any(isinstance(a, DataArray) for a in args):

File ~/Python/myxarray/xarray/core/computation.py:460, in apply_dataset_vfunc(func, signature, join, dataset_join, fill_value, exclude_dims, keep_attrs, *args)
    455 list_of_coords, list_of_indexes = build_output_coords_and_indexes(
    456     *args, signature, exclude_dims, combine_attrs=keep_attrs
    457 )
    458 args = [getattr(arg, "data_vars", arg) for arg in args]
--> 460 result_vars = apply_dict_of_variables_vfunc(
    461     func, *args, signature=signature, join=dataset_join, fill_value=fill_value
    462 )
    464 if signature.num_outputs > 1:
    465     out = tuple(
    466         _fast_dataset(*args)
    467         for args in zip(result_vars, list_of_coords, list_of_indexes)
    468     )

File ~/Python/myxarray/xarray/core/computation.py:402, in apply_dict_of_variables_vfunc(func, signature, join, fill_value, *args)
    400 result_vars = {}
    401 for name, variable_args in zip(names, grouped_by_name):
--> 402     result_vars[name] = func(*variable_args)
    404 if signature.num_outputs > 1:
    405     return _unpack_dict_tuples(result_vars, signature.num_outputs)

File ~/Python/myxarray/xarray/core/computation.py:750, in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, vectorize, keep_attrs, dask_gufunc_kwargs, *args)
    745 if vectorize:
    746     func = _vectorize(
    747         func, signature, output_dtypes=output_dtypes, exclude_dims=exclude_dims
    748     )
--> 750 result_data = func(*input_data)
    752 if signature.num_outputs == 1:
    753     result_data = (result_data,)

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/xarray.py:291, in xarray_reduce.<locals>.wrapper(array, func, skipna, *by, **kwargs)
    288 if "nan" not in func and func not in ["all", "any", "count"]:
    289     func = f"nan{func}"
--> 291 result, groups = groupby_reduce(array, *by, func=func, **kwargs)
    292 return result

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/core.py:1553, in groupby_reduce(array, func, expected_groups, sort, isbin, axis, fill_value, min_count, split_out, method, engine, reindex, finalize_kwargs, *by)
   1550 agg = _initialize_aggregation(func, array.dtype, fill_value, min_count, finalize_kwargs)
   1552 if not has_dask:
-> 1553     results = _reduce_blockwise(
   1554         array, by, agg, expected_groups=expected_groups, reindex=reindex, **kwargs
   1555     )
   1556     groups = (results["groups"],)
   1557     result = results[agg.name]

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/core.py:1008, in _reduce_blockwise(array, by, agg, axis, expected_groups, fill_value, engine, sort, reindex)
   1005     finalize_kwargs = (finalize_kwargs,)
   1006 finalize_kwargs = finalize_kwargs + ({},) + ({},)
-> 1008 results = chunk_reduce(
   1009     array,
   1010     by,
   1011     func=agg.numpy,
   1012     axis=axis,
   1013     expected_groups=expected_groups,
   1014     # This fill_value should only apply to groups that only contain NaN observations
   1015     # BUT there is funkiness when axis is a subset of all possible values
   1016     # (see below)
   1017     fill_value=agg.fill_value["numpy"],
   1018     dtype=agg.dtype["numpy"],
   1019     kwargs=finalize_kwargs,
   1020     engine=engine,
   1021     sort=sort,
   1022     reindex=reindex,
   1023 )  # type: ignore
   1025 if _is_arg_reduction(agg):
   1026     results["intermediates"][0] = np.unravel_index(results["intermediates"][0], array.shape)[-1]

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/core.py:677, in chunk_reduce(array, by, func, expected_groups, axis, fill_value, dtype, reindex, engine, kwargs, sort)
    675     result = reduction(group_idx, array, **kwargs)
    676 else:
--> 677     result = generic_aggregate(
    678         group_idx, array, axis=-1, engine=engine, func=reduction, **kwargs
    679     ).astype(dt, copy=False)
    680 if np.any(props.nanmask):
    681     # remove NaN group label which should be last
    682     result = result[..., :-1]

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/aggregations.py:49, in generic_aggregate(group_idx, array, engine, func, axis, size, fill_value, dtype, **kwargs)
     44 else:
     45     raise ValueError(
     46         f"Expected engine to be one of ['flox', 'numpy', 'numba']. Received {engine} instead."
     47     )
---> 49 return method(
     50     group_idx, array, axis=axis, size=size, fill_value=fill_value, dtype=dtype, **kwargs
     51 )

File /opt/miniconda3/envs/xclim-pip/lib/python3.9/site-packages/flox/aggregate_flox.py:86, in nanlen(group_idx, array, *args, **kwargs)
     85 def nanlen(group_idx, array, *args, **kwargs):
---> 86     return sum(group_idx, (~np.isnan(array)).astype(int), *args, **kwargs)

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
```
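The final `TypeError` can be reproduced in isolation: NumPy's `isnan` ufunc rejects object-dtype arrays, which is exactly what a cftime time coordinate is under the hood. A minimal sketch, using stdlib `datetime` objects as a stand-in for cftime objects (cftime itself is not assumed to be installed):

```python
import datetime

import numpy as np

# An object-dtype array of datetime-like objects, mimicking the values of a
# cftime-backed time coordinate.
times = np.array(
    [datetime.datetime(2009, 1, 1), datetime.datetime(2009, 1, 2)], dtype=object
)

# flox's nanlen calls np.isnan on the array being counted; on object dtype
# this raises the TypeError seen at the bottom of the traceback.
try:
    np.isnan(times)
    raised = False
except TypeError:
    raised = True

print(raised)  # True: np.isnan rejects object-dtype input
```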

Anything else we need to know?

I was able to resolve this by modifying `xarray.core.utils.contains_only_dask_or_numpy` to return False when the input's dtype is object (`'O'`). This check appears to be used only when choosing between flox and the older reduction algorithms. Does this make sense?

Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.6 | packaged by conda-forge | (default, Jul 11 2021, 03:39:48) [GCC 9.3.0]
python-bits: 64
OS: Linux
OS-release: 5.17.5-arch1-2
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: fr_CA.utf8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: 1.12.0
libnetcdf: 4.7.4

xarray: 2022.3.1.dev16+g3ead17ea
pandas: 1.4.2
numpy: 1.21.6
scipy: 1.7.1
netCDF4: 1.5.7
pydap: None
h5netcdf: 0.11.0
h5py: 3.4.0
Nio: None
zarr: 2.10.0
cftime: 1.5.0
nc_time_axis: 1.3.1
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2022.04.1
distributed: 2022.4.1
matplotlib: 3.4.3
cartopy: None
seaborn: None
numbagg: None
fsspec: 2021.07.0
cupy: None
pint: 0.18
sparse: None
flox: 0.5.1
numpy_groupies: 0.9.16
setuptools: 57.4.0
pip: 21.2.4
conda: None
pytest: 6.2.5
IPython: 8.2.0
sphinx: 4.1.2
