id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 1888573893,I_kwDOAMm_X85wkVnF,8161,groupby bug,6420873,closed,0,,,2,2023-09-09T04:38:48Z,2023-09-13T20:03:52Z,2023-09-13T20:03:52Z,NONE,,,,"### What happened? Sometimes a groupby operation on a multidimensional data array returns unexpected results. A copy of the test data can be found [here](https://miami.box.com/s/8cl45s01sbp7x503dyjhmi2wz96suzj9). Code to reproduce the bug: ``` import xarray as xr ds = xr.open_dataarray('test1.nc').load() ``` ds is a 100x86x25x66 array ![CleanShot 2023-09-09 at 00 26 51](https://github.com/pydata/xarray/assets/6420873/a0d49cec-02dd-46a3-801c-18aa778af112) ``` amoc1 = ds.isel(member_id=range(50)).stack(mb_time=['member_id', 'time']) amoc1 = amoc1.groupby('mb_time').max(...) amoc1 = amoc1.unstack() amoc1 ``` When performing groupby on the first 50 members, the results look fine. ![CleanShot 2023-09-09 at 00 28 24](https://github.com/pydata/xarray/assets/6420873/8581c226-1b0c-450f-9a4b-32fc99505582) ``` amoc2 = ds.isel(member_id=range(50, 100)).stack(mb_time=['member_id', 'time']) amoc2 = amoc2.groupby('mb_time').max(...) amoc2 = amoc2.unstack() amoc2 ``` When performing groupby on the last 50 members, the results look fine as well. ![CleanShot 2023-09-09 at 00 30 22](https://github.com/pydata/xarray/assets/6420873/a74242a1-e9a3-48f5-9adf-b1e5c9f146ab) ``` amoc = ds.isel(member_id=range(0, 100)).stack(mb_time=['member_id', 'time']) amoc = amoc.groupby('mb_time').max(...) amoc = amoc.unstack() amoc ``` When performing groupby on all 100 members, the **results look weird**. ![CleanShot 2023-09-09 at 00 31 14](https://github.com/pydata/xarray/assets/6420873/d14c0292-517a-414b-8967-4f1be44e0bbe) > ### What did you expect to happen? 
_No response_ ### Minimal Complete Verifiable Example _No response_ ### MVCE confirmation - [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [ ] Complete example — the example is self-contained, including all data and the text of any traceback. - [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. ### Relevant log output _No response_ ### Anything else we need to know? _No response_ ### Environment
INSTALLED VERSIONS ------------------ commit: None python: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0] python-bits: 64 OS: Linux OS-release: 3.10.0-1127.18.2.el7.x86_64 machine: x86_64 processor: x86_64 byteorder: little LC_ALL: en_US.UTF-8 LANG: en_US.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.1 libnetcdf: 4.8.1 xarray: 2023.7.0 pandas: 1.5.3 numpy: 1.24.4 scipy: 1.10.1 netCDF4: 1.6.2 pydap: installed h5netcdf: 1.0.0 h5py: 3.7.0 Nio: None zarr: 2.12.0 cftime: 1.6.2 nc_time_axis: 1.4.1 PseudoNetCDF: None iris: None bottleneck: 1.3.7 dask: 2023.7.1 distributed: 2023.7.1 matplotlib: 3.4.3 cartopy: 0.20.2 seaborn: 0.11.2 numbagg: None fsspec: 2022.11.0 cupy: None pint: 0.19.2 sparse: None flox: None numpy_groupies: None setuptools: 68.0.0 pip: 22.1.2 conda: 23.3.1 pytest: None mypy: None IPython: 7.33.0 sphinx: 5.0.1
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8161/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,not_planned,13221727,issue 187393785,MDU6SXNzdWUxODczOTM3ODU=,1081,Transpose some but not all dimensions,6420873,closed,0,,,17,2016-11-04T17:31:38Z,2019-10-29T19:16:58Z,2019-10-29T19:16:58Z,NONE,,,,"Hi, all Sorry to bother. Maybe it is a kind of stupid question for others, but I cannot figure it out at this moment. I want to swap dims in xarray, like swapaxes in numpy. I found both dataarray and dataset has method ```swap_dims```, but I don't understand its arguments: ```dims_dict : dict-like Dictionary whose keys are current dimension names and whose values are new names. Each value must already be a coordinate on this array.``` Here is my example: ``` data = np.random.rand(4,3) lon = [1,2,3] lat = [4,3,2,1] foo = xr.DataArray(data,coords=[lat,lon]) foo foo = xr.DataArray(data,coords=[lat,lon],dims=['lat','lon']) foo foo.swap_dims({'lat':'lon'}) ``` The error message: ``` --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 foo.swap_dims({'lat':'lon'}) /glade/u/home/che43/miniconda2/lib/python2.7/site-packages/xarray/core/dataarray.pyc in swap_dims(self, dims_dict) 794 Dataset.swap_dims 795 """""" --> 796 ds = self._to_temp_dataset().swap_dims(dims_dict) 797 return self._from_temp_dataset(ds) 798 /glade/u/home/che43/miniconda2/lib/python2.7/site-packages/xarray/core/dataset.pyc in swap_dims(self, dims_dict, inplace) 1293 raise ValueError('replacement dimension %r is not a 1D ' 1294 'variable along the old dimension %r' -> 1295 % (v, k)) 1296 1297 result_dims = set(dims_dict.get(dim, dim) for dim in self.dims) ValueError: replacement dimension 'lon' is not a 1D variable along the old dimension 'lat' ``` Sorry to bother.","{""url"": 
""https://api.github.com/repos/pydata/xarray/issues/1081/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 227858207,MDU6SXNzdWUyMjc4NTgyMDc=,1403,Lost coords after multiplication,6420873,closed,0,,,4,2017-05-11T02:00:59Z,2019-04-11T17:53:16Z,2019-04-11T17:53:16Z,NONE,,,,"Recently, I occurred a bug: multiplication discards coords of dsarray. ``` In [1]: import xarray as xr In [2]: xr.__version__ Out[2]: '0.9.5' In [3]: tarea = xr.open_dataarray('tarea.nc') In [4]: tarea Out[4]: [122880 values with dtype=float64] Coordinates: TLAT (nlat, nlon) float64 -79.22 -79.22 -79.22 -79.22 -79.22 -79.22 ... TLONG (nlat, nlon) float64 320.6 321.7 322.8 323.9 325.1 326.2 327.3 ... Dimensions without coordinates: nlat, nlon Attributes: long_name: area of T cells units: centimeter^2 In [6]: advt = xr.open_dataarray('advt.nc') In [7]: advt Out[7]: [122880 values with dtype=float64] Coordinates: TLAT (nlat, nlon) float64 -79.22 -79.22 -79.22 -79.22 -79.22 -79.22 ... TLONG (nlat, nlon) float64 320.6 321.7 322.8 323.9 325.1 326.2 327.3 ... Dimensions without coordinates: nlat, nlon In [8]: advt * tarea Out[8]: array([[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ 8091417.091781, 15948194.682816, -49201736.790674, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]]) Dimensions without coordinates: nlat, nlon ``` TLAT and TLONG are gone. Any suggestion? 
Here I provide my test [data](https://github.com/Yefee/bugs-fix/tree/master/lost_coords).","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1403/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 221855729,MDU6SXNzdWUyMjE4NTU3Mjk=,1374,indexing error in sel subsets,6420873,closed,0,,,6,2017-04-14T17:45:01Z,2017-06-04T07:03:48Z,2017-06-04T07:03:48Z,NONE,,,,"``` import xarray as xr xr.__version__ '0.9.1' ds = xr.open_dataset('lgm2co2.nc') ds Dimensions: (lat_aux_grid: 105, moc_comp: 1, moc_z: 26, time: 2204, transport_reg: 2) Coordinates: * time (time) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ... * lat_aux_grid (lat_aux_grid) float32 -80.2602 -78.7338 -77.2176 ... * moc_z (moc_z) float32 0.0 800.0 1644.05 2573.71 3627.36 ... moc_components (moc_comp) |S512 b'Eulerian Mean' transport_regions (transport_reg) |S512 b'Global Ocean - Marginal Seas' ... Dimensions without coordinates: moc_comp, transport_reg Data variables: MOC (time, transport_reg, moc_comp, moc_z, lat_aux_grid) float64 0.0 ... 
moc = ds.MOC.isel(transport_reg=1,moc_comp=0) moc --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /Users/Yefee/miniconda3/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj) 670 type_pprinters=self.type_printers, 671 deferred_pprinters=self.deferred_printers) --> 672 printer.pretty(obj) 673 printer.flush() 674 return stream.getvalue() /Users/Yefee/miniconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj) 381 if callable(meth): 382 return meth(obj, self, cycle) --> 383 return _default_pprint(obj, self, cycle) 384 finally: 385 self.end_group() /Users/Yefee/miniconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in _default_pprint(obj, p, cycle) 501 if _safe_getattr(klass, '__repr__', None) not in _baseclass_reprs: 502 # A user-provided repr. Find newlines and replace them with p.break_() --> 503 _repr_pprint(obj, p, cycle) 504 return 505 p.begin_group(1, '<') /Users/Yefee/miniconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle) 699 """"""A pprint that just redirects to the normal repr function."""""" 700 # Find newlines and replace them with p.break_() --> 701 output = repr(obj) 702 for idx,output_line in enumerate(output.splitlines()): 703 if idx: /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/common.py in __repr__(self) 152 153 def __repr__(self): --> 154 return formatting.array_repr(self) 155 156 def _iter(self): /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/formatting.py in array_repr(arr) 380 if hasattr(arr, 'coords'): 381 if arr.coords: --> 382 summary.append(repr(arr.coords)) 383 384 unindexed_dims_str = unindexed_dims_repr(arr.dims, arr.coords) /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/formatting.py in __repr__(self) 58 """"""Mixin that defines __repr__ for a class that already has __unicode__."""""" 59 def __repr__(self): ---> 60 return 
ensure_valid_repr(self.__unicode__()) 61 62 /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/coordinates.py in __unicode__(self) 44 45 def __unicode__(self): ---> 46 return formatting.coords_repr(self) 47 48 @property /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/formatting.py in coords_repr(coords, col_width) 309 col_width = _calculate_col_width(_get_col_items(coords)) 310 return _mapping_repr(coords, title=u'Coordinates', --> 311 summarizer=summarize_coord, col_width=col_width) 312 313 /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/formatting.py in _mapping_repr(mapping, title, summarizer, col_width) 291 summary = [u'%s:' % title] 292 if mapping: --> 293 summary += [summarizer(k, v, col_width) for k, v in mapping.items()] 294 else: 295 summary += [EMPTY_REPR] /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/formatting.py in (.0) 291 summary = [u'%s:' % title] 292 if mapping: --> 293 summary += [summarizer(k, v, col_width) for k, v in mapping.items()] 294 else: 295 summary += [EMPTY_REPR] /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/formatting.py in summarize_coord(name, var, col_width) 251 [_summarize_coord_multiindex(coord, col_width, marker), 252 _summarize_coord_levels(coord, col_width)]) --> 253 return _summarize_var_or_coord(name, var, col_width, show_values, marker) 254 255 /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/formatting.py in _summarize_var_or_coord(name, var, col_width, show_values, marker, max_width) 205 front_str = u'%s%s%s ' % (first_col, dims_str, var.dtype) 206 if show_values: --> 207 values_str = format_array_flat(var, max_width - len(front_str)) 208 else: 209 values_str = u'...' 
/Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/formatting.py in format_array_flat(items_ndarray, max_width) 178 # print at least one item 179 max_possibly_relevant = max(int(np.ceil(max_width / 2.0)), 1) --> 180 relevant_items = first_n_items(items_ndarray, max_possibly_relevant) 181 pprint_items = format_items(relevant_items) 182 /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/formatting.py in first_n_items(x, n_desired) 86 if n_desired < x.size: 87 indexer = _get_indexer_at_least_n_items(x.shape, n_desired) ---> 88 x = x[indexer] 89 return np.asarray(x).flat[:n_desired] 90 /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/dataarray.py in __getitem__(self, key) 467 else: 468 # orthogonal array indexing --> 469 return self.isel(**self._item_key_to_dict(key)) 470 471 def __setitem__(self, key, value): /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/dataarray.py in isel(self, drop, **indexers) 655 DataArray.sel 656 """""" --> 657 ds = self._to_temp_dataset().isel(drop=drop, **indexers) 658 return self._from_temp_dataset(ds) 659 /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/dataset.py in isel(self, drop, **indexers) 1117 for name, var in iteritems(self._variables): 1118 var_indexers = dict((k, v) for k, v in indexers if k in var.dims) -> 1119 new_var = var.isel(**var_indexers) 1120 if not (drop and name in var_indexers): 1121 variables[name] = new_var /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/variable.py in isel(self, **indexers) 545 if dim in indexers: 546 key[i] = indexers[dim] --> 547 return self[tuple(key)] 548 549 def squeeze(self, dim=None): /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/variable.py in __getitem__(self, key) 375 dims = tuple(dim for k, dim in zip(key, self.dims) 376 if not isinstance(k, (int, np.integer))) --> 377 values = self._indexable_data[key] 378 # orthogonal indexing should ensure the dimensionality is consistent 379 if 
hasattr(values, 'ndim'): /Users/Yefee/miniconda3/lib/python3.6/site-packages/xarray/core/indexing.py in __getitem__(self, key) 419 420 def __getitem__(self, key): --> 421 return type(self)(self.array[key]) 422 423 def __setitem__(self, key, value): TypeError: byte indices must be integers or slices, not tuple ``` But using copy method makes it work. ``` moc = ds.MOC.isel(transport_reg=1,moc_comp=0).copy() moc array([[[ 2.859555e-03, 2.859555e-03, ..., 3.184585e-06, -1.938138e-07], [ 7.209966e-01, 7.209966e-01, ..., 5.836686e-03, -2.183406e-07], ..., [ 0.000000e+00, 0.000000e+00, ..., 8.159353e-08, 8.159353e-08], [ 0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00]], [[ -4.233219e-03, -4.233219e-03, ..., -4.192515e-06, 2.099500e-07], [ 7.536786e-01, 7.536786e-01, ..., 4.770853e-03, 2.859786e-07], ..., [ 0.000000e+00, 0.000000e+00, ..., -1.668220e-07, -1.668220e-07], [ 0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00]], ..., [[ 1.523036e-03, 1.523036e-03, ..., -1.674448e-06, -3.071424e-08], [ 7.738025e-01, 7.738025e-01, ..., 2.440764e-03, 2.238331e-08], ..., [ 0.000000e+00, 0.000000e+00, ..., -4.054318e-08, -4.054318e-08], [ 0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00]], [[ 1.113985e-03, 1.113985e-03, ..., -2.358672e-06, -4.464008e-07], [ 6.900834e-01, 6.900834e-01, ..., 8.989943e-04, -5.926298e-07], ..., [ 0.000000e+00, 0.000000e+00, ..., -6.891400e-10, -6.891400e-10], [ 0.000000e+00, 0.000000e+00, ..., 0.000000e+00, 0.000000e+00]]]) Coordinates: * time (time) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ... * lat_aux_grid (lat_aux_grid) float32 -80.2602 -78.7338 -77.2176 ... * moc_z (moc_z) float32 0.0 800.0 1644.05 2573.71 3627.36 ... moc_components |S13 b'Eulerian Mean' transport_regions |S54 b'Atlantic Ocean + Labrador Sea + GIN Sea + Arctic Ocean' ... 
Attributes: long_name: Meridional Overturning Circulation units: Sverdrups ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1374/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 154924822,MDU6SXNzdWUxNTQ5MjQ4MjI=,848,Decode time error in CESM POP output,6420873,closed,0,,,5,2016-05-15T19:18:50Z,2016-05-16T17:11:53Z,2016-05-16T17:11:53Z,NONE,,,,"Hi all, I recently found an error in time decoding. The .nc file is POP output. ``` Python In [29]: ds = xr.open_mfdataset('EXAMPLE_CASE.pop.h.0001-01.nc') --------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () ----> 1 ds = xr.open_mfdataset('EXAMPLE_CASE.pop.h.0001-01.nc') /Users/Yefee/miniconda2/lib/python2.7/site-packages/xarray/backends/api.pyc in open_mfdataset(paths, chunks, concat_dim, preprocess, engine, lock, **kwargs) 300 lock = _default_lock(paths[0], engine) 301 datasets = [open_dataset(p, engine=engine, chunks=chunks or {}, lock=lock, --> 302 **kwargs) for p in paths] 303 file_objs = [ds._file_obj for ds in datasets] 304 /Users/Yefee/miniconda2/lib/python2.7/site-packages/xarray/backends/api.pyc in open_dataset(filename_or_obj, group, decode_cf, mask_and_scale, decode_times, concat_characters, decode_coords, engine, chunks, lock, drop_variables) 225 lock = _default_lock(filename_or_obj, engine) 226 with close_on_error(store): --> 227 return maybe_decode_store(store, lock) 228 else: 229 if engine is not None and engine != 'scipy': /Users/Yefee/miniconda2/lib/python2.7/site-packages/xarray/backends/api.pyc in maybe_decode_store(store, lock) 156 store, mask_and_scale=mask_and_scale, decode_times=decode_times, 157 concat_characters=concat_characters, decode_coords=decode_coords, --> 158 drop_variables=drop_variables) 159 160 if chunks is not None: 
/Users/Yefee/miniconda2/lib/python2.7/site-packages/xarray/conventions.pyc in decode_cf(obj, concat_characters, mask_and_scale, decode_times, decode_coords, drop_variables) 888 vars, attrs, coord_names = decode_cf_variables( 889 vars, attrs, concat_characters, mask_and_scale, decode_times, --> 890 decode_coords, drop_variables=drop_variables) 891 ds = Dataset(vars, attrs=attrs) 892 ds = ds.set_coords(coord_names.union(extra_coords)) /Users/Yefee/miniconda2/lib/python2.7/site-packages/xarray/conventions.pyc in decode_cf_variables(variables, attributes, concat_characters, mask_and_scale, decode_times, decode_coords, drop_variables) 823 new_vars[k] = decode_cf_variable( 824 v, concat_characters=concat, mask_and_scale=mask_and_scale, --> 825 decode_times=decode_times) 826 if decode_coords: 827 var_attrs = new_vars[k].attrs /Users/Yefee/miniconda2/lib/python2.7/site-packages/xarray/conventions.pyc in decode_cf_variable(var, concat_characters, mask_and_scale, decode_times, decode_endianness) 764 units = pop_to(attributes, encoding, 'units') 765 calendar = pop_to(attributes, encoding, 'calendar') --> 766 data = DecodedCFDatetimeArray(data, units, calendar) 767 elif attributes['units'] in TIME_UNITS: 768 # timedelta /Users/Yefee/miniconda2/lib/python2.7/site-packages/xarray/conventions.pyc in __init__(self, array, units, calendar) 389 if not PY3: 390 msg += ' Full traceback:\n' + traceback.format_exc() --> 391 raise ValueError(msg) 392 else: 393 self._dtype = getattr(result, 'dtype', np.dtype('object')) ValueError: unable to decode time units u'days since 0000-01-01 00:00:00' with the default calendar. Try opening your dataset with decode_times=False.` ``` The actual time is: ``` 31 double time(time) ; 32 time:long_name = ""time"" ; 33 time:units = ""days since 0000-01-01 00:00:00"" ; 34 time:bounds = ""time_bound"" ; 35 time:calendar = ""noleap"" ; ``` I can set 'decode_times=False' to open the file but the time is not right. 
``` Coordinates: transport_components (transport_comp) |S256 'Total' ... transport_regions (transport_reg) |S256 'Global Ocean - Marginal Seas' ... * time (time) float64 396.0 ``` Are there any suggestions? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/848/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue