issues: 158958801
id | node_id | number | title | user | state | locked | assignee | milestone | comments | created_at | updated_at | closed_at | author_association | active_lock_reason | draft | pull_request | body | reactions | performed_via_github_app | state_reason | repo | type |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
158958801 | MDU6SXNzdWUxNTg5NTg4MDE= | 873 | Broadcast error when dataset is recombined after a stack/groupby/apply/unstack sequence | 1328158 | closed | 0 | 11 | 2016-06-07T15:50:43Z | 2016-09-20T19:55:55Z | 2016-09-20T19:55:55Z | NONE | I have code which performs the split-apply-combine pattern on a dataset, and it appears to work as expected until it reaches the point where the dataset is recombined. At that point there seems to be a dimensional mismatch between arrays, which causes numpy to raise a broadcasting error (below). The code which can cause this error is in a gist here. When I run the code I see the following errors/traceback: ``` Traceback (most recent call last): File "H:\git\climate_indices\src\scripts\xarray_groupby_example.py", line 34, in <module> dataset = dataset.groupby('grid_cells').apply(double_data) File "C:\Anaconda\lib\site-packages\xarray\core\groupby.py", line 469, in apply combined = self._concat(applied) File "C:\Anaconda\lib\site-packages\xarray\core\groupby.py", line 476, in _concat combined = concat(applied, concat_dim, positions=positions) File "C:\Anaconda\lib\site-packages\xarray\core\combine.py", line 114, in concat return f(objs, dim, data_vars, coords, compat, positions) File "C:\Anaconda\lib\site-packages\xarray\core\combine.py", line 268, in _dataset_concat combined = Variable.concat(vars, dim, positions) File "C:\Anaconda\lib\site-packages\xarray\core\variable.py", line 919, in concat variables = list(variables) File "C:\Anaconda\lib\site-packages\xarray\core\combine.py", line 262, in ensure_common_dims var = var.expand_dims(common_dims, common_shape) File "C:\Anaconda\lib\site-packages\xarray\core\variable.py", line 717, in expand_dims expanded_data = ops.broadcast_to(self.data, tmp_shape) File "C:\Anaconda\lib\site-packages\xarray\core\ops.py", line 67, in f return getattr(module, name)(*args, **kwargs) File "C:\Anaconda\lib\site-packages\numpy\lib\stride_tricks.py", line 115, in broadcast_to return _broadcast_to(array, shape, subok=subok, readonly=True) File "C:\Anaconda\lib\site-packages\numpy\lib\stride_tricks.py", line 70, in _broadcast_to op_flags=[op_flag], itershape=shape, order='C').itviews[0] ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,) and requested shape (1,) ``` I get the above error when I use NetCDF input files which contain three dimensions (time, lon, lat), a simple example of which is described below: ``` Dataset type: Hierarchical Data Format, version 5 netcdf file:/C:/home/tmp/toy.nc { dimensions: lat = 2; lon = 2; time = 3; variables: int prcp(time=3, lon=2, lat=2); double lat(lat=2); double lon(lon=2); long time(time=3); :calendar = "proleptic_gregorian"; :units = "days since 2014-06-09 00:00:00"; } ``` |
{ "url": "https://api.github.com/repos/pydata/xarray/issues/873/reactions", "total_count": 0, "+1": 0, "-1": 0, "laugh": 0, "hooray": 0, "confused": 0, "heart": 0, "rocket": 0, "eyes": 0 } |
completed | 13221727 | issue |
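
The final `ValueError` in the traceback comes from `numpy.broadcast_to`, which can stretch a length-1 axis but cannot shrink a length-2 axis down to length 1. The sketch below isolates that numpy behaviour only; it does not reproduce the xarray groupby code path from the issue. The helper name `try_broadcast` is hypothetical, introduced here for illustration.

```python
import numpy as np


def try_broadcast(shape_from, shape_to):
    """Return None if broadcasting succeeds, else the ValueError message.

    Hypothetical helper for illustration; not part of numpy or xarray.
    """
    array = np.zeros(shape_from)
    try:
        np.broadcast_to(array, shape_to)
    except ValueError as err:
        return str(err)
    return None


# A length-1 axis can be stretched to length 2.
stretched = try_broadcast((1,), (2,))

# A length-2 axis cannot be squeezed into length 1; this is the same
# shape pair, (2,) -> (1,), that the traceback above reports.
squeezed = try_broadcast((2,), (1,))

print("(1,) -> (2,):", "ok" if stretched is None else stretched)
print("(2,) -> (1,):", "ok" if squeezed is None else squeezed)
```

If the second call fails as expected, that suggests one of the per-group results carried a length-2 dimension where the combine step expected length 1, which is consistent with the mismatch described in the issue body.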