id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 2163675672,PR_kwDOAMm_X85obI_8,8803,missing chunkmanager: update error message,10194086,open,0,,,4,2024-03-01T15:48:00Z,2024-03-15T11:02:45Z,,MEMBER,,0,pydata/xarray/pulls/8803,"When dask is missing we get the following error message: ```python-traceback ValueError: unrecognized chunk manager dask - must be one of: [] ``` this could be confusing - the error message seems geared towards a typo in the requested manager. However, I think it's much more likely that a chunk manager is just not installed. I tried to update the error message - happy to get feedback.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8803/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 2105703882,I_kwDOAMm_X859gn3K,8679,Dataset.weighted along a dimension not on weights errors,10194086,open,0,,,2,2024-01-29T15:03:39Z,2024-02-04T11:24:54Z,,MEMBER,,,,"### What happened? `ds.weighted(weights).mean(dims)` errors when reducing over a dimension that is neither on the `weights` nor on the variable. ### What did you expect to happen? This used to work and was ""broken"" by #8606. However, we may want to fix this by ignoring (?) those data vars instead (#7027). ### Minimal Complete Verifiable Example ```Python import xarray as xr ds = xr.Dataset({""a"": ((""y"", ""x""), [[1, 2]]), ""scalar"": 1}) weights = xr.DataArray([1, 2], dims=""x"") ds.weighted(weights).mean(""y"") ``` ### MVCE confirmation - [ ] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [ ] Complete example — the example is self-contained, including all data and the text of any traceback. 
- [ ] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [ ] New issue — a search of GitHub Issues suggests this is not a duplicate. - [ ] Recent environment — the issue occurs with the latest version of xarray and its dependencies. ### Relevant log output ```Python ValueError Traceback (most recent call last) Cell In[1], line 6 3 ds = xr.Dataset({""a"": ((""y"", ""x""), [[1, 2]]), ""scalar"": 1}) 4 weights = xr.DataArray([1, 2], dims=""x"") ----> 6 ds.weighted(weights).mean(""y"") File ~/code/xarray/xarray/util/deprecation_helpers.py:115, in _deprecate_positional_args.._decorator..inner(*args, **kwargs) 111 kwargs.update({name: arg for name, arg in zip_args}) 113 return func(*args[:-n_extra_args], **kwargs) --> 115 return func(*args, **kwargs) File ~/code/xarray/xarray/core/weighted.py:497, in Weighted.mean(self, dim, skipna, keep_attrs) 489 @_deprecate_positional_args(""v2023.10.0"") 490 def mean( 491 self, (...) 
495 keep_attrs: bool | None = None, 496 ) -> T_Xarray: --> 497 return self._implementation( 498 self._weighted_mean, dim=dim, skipna=skipna, keep_attrs=keep_attrs 499 ) File ~/code/xarray/xarray/core/weighted.py:558, in DatasetWeighted._implementation(self, func, dim, **kwargs) 555 def _implementation(self, func, dim, **kwargs) -> Dataset: 556 self._check_dim(dim) --> 558 return self.obj.map(func, dim=dim, **kwargs) File ~/code/xarray/xarray/core/dataset.py:6924, in Dataset.map(self, func, keep_attrs, args, **kwargs) 6922 if keep_attrs is None: 6923 keep_attrs = _get_keep_attrs(default=False) -> 6924 variables = { 6925 k: maybe_wrap_array(v, func(v, *args, **kwargs)) 6926 for k, v in self.data_vars.items() 6927 } 6928 if keep_attrs: 6929 for k, v in variables.items(): File ~/code/xarray/xarray/core/dataset.py:6925, in (.0) 6922 if keep_attrs is None: 6923 keep_attrs = _get_keep_attrs(default=False) 6924 variables = { -> 6925 k: maybe_wrap_array(v, func(v, *args, **kwargs)) 6926 for k, v in self.data_vars.items() 6927 } 6928 if keep_attrs: 6929 for k, v in variables.items(): File ~/code/xarray/xarray/core/weighted.py:286, in Weighted._weighted_mean(self, da, dim, skipna) 278 def _weighted_mean( 279 self, 280 da: T_DataArray, 281 dim: Dims = None, 282 skipna: bool | None = None, 283 ) -> T_DataArray: 284 """"""Reduce a DataArray by a weighted ``mean`` along some dimension(s)."""""" --> 286 weighted_sum = self._weighted_sum(da, dim=dim, skipna=skipna) 288 sum_of_weights = self._sum_of_weights(da, dim=dim) 290 return weighted_sum / sum_of_weights File ~/code/xarray/xarray/core/weighted.py:276, in Weighted._weighted_sum(self, da, dim, skipna) 268 def _weighted_sum( 269 self, 270 da: T_DataArray, 271 dim: Dims = None, 272 skipna: bool | None = None, 273 ) -> T_DataArray: 274 """"""Reduce a DataArray by a weighted ``sum`` along some dimension(s)."""""" --> 276 return self._reduce(da, self.weights, dim=dim, skipna=skipna) File ~/code/xarray/xarray/core/weighted.py:231, in 
Weighted._reduce(da, weights, dim, skipna) 227 da = da.fillna(0.0) 229 # `dot` does not broadcast arrays, so this avoids creating a large 230 # DataArray (if `weights` has additional dimensions) --> 231 return dot(da, weights, dim=dim) File ~/code/xarray/xarray/util/deprecation_helpers.py:140, in deprecate_dims..wrapper(*args, **kwargs) 132 emit_user_level_warning( 133 ""The `dims` argument has been renamed to `dim`, and will be removed "" 134 ""in the future. This renaming is taking place throughout xarray over the "" (...) 137 PendingDeprecationWarning, 138 ) 139 kwargs[""dim""] = kwargs.pop(""dims"") --> 140 return func(*args, **kwargs) File ~/code/xarray/xarray/core/computation.py:1885, in dot(dim, *arrays, **kwargs) 1883 dim = tuple(d for d, c in dim_counts.items() if c > 1) 1884 else: -> 1885 dim = parse_dims(dim, all_dims=tuple(all_dims)) 1887 dot_dims: set[Hashable] = set(dim) 1889 # dimensions to be parallelized File ~/code/xarray/xarray/core/utils.py:1046, in parse_dims(dim, all_dims, check_exists, replace_none) 1044 dim = (dim,) 1045 if check_exists: -> 1046 _check_dims(set(dim), set(all_dims)) 1047 return tuple(dim) File ~/code/xarray/xarray/core/utils.py:1131, in _check_dims(dim, all_dims) 1129 if wrong_dims: 1130 wrong_dims_str = "", "".join(f""'{d!s}'"" for d in wrong_dims) -> 1131 raise ValueError( 1132 f""Dimension(s) {wrong_dims_str} do not exist. Expected one or more of {all_dims}"" 1133 ) ValueError: Dimension(s) 'y' do not exist. Expected one or more of {'x'} ``` ### Anything else we need to know? _No response_ ### Environment Newest main (i.e. 
2024.01) ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8679/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 748684119,MDU6SXNzdWU3NDg2ODQxMTk=,4601,Don't type check __getattr__?,10194086,open,0,,,8,2020-11-23T10:41:21Z,2023-09-25T05:33:09Z,,MEMBER,,,,"In #4592 I had the issue that mypy did not raise an error on a missing method: ```python from xarray.core.common import DataWithCoords hasattr(xr.core.common.DataWithCoords, ""reduce"") # -> False def test(x: ""DataWithCoords""): x.reduce() # mypy does not error ``` This is because `DataWithCoords` implements `__getattr__`: ```python class A: pass class B: def __getattr__(self, name): ... def testA(x: ""A""): x.reduce() # mypy errors def testB(x: ""B""): x.reduce() # mypy does not error ``` The solution seems to be to not typecheck `__getattr__` (see https://github.com/python/mypy/issues/6251#issuecomment-457287161): ```python from typing import no_type_check class C: @no_type_check def __getattr__(self, name): ... def testC(x: ""C""): x.reduce() # mypy errors ``` The only `__getattr__` within xarray is here: https://github.com/pydata/xarray/blob/17358922d480c038e66430735bf4c365a7677df8/xarray/core/common.py#L221 Using `@no_type_check` leads to 24 errors and not all of them can be trivially solved. E.g. `DataWithCoords` [wants of use](https://github.com/pydata/xarray/blob/17358922d480c038e66430735bf4c365a7677df8/xarray/core/common.py#L368) `self.isel` but does not implement the method. The solution is probably to add `isel` to `DataWithCoords` as an `ABC` or using `NotImplemented`. Thoughts? **All errors**
```python-traceback xarray/core/common.py:370: error: ""DataWithCoords"" has no attribute ""isel"" xarray/core/common.py:374: error: ""DataWithCoords"" has no attribute ""dims"" xarray/core/common.py:378: error: ""DataWithCoords"" has no attribute ""indexes"" xarray/core/common.py:381: error: ""DataWithCoords"" has no attribute ""sizes"" xarray/core/common.py:698: error: ""DataWithCoords"" has no attribute ""_groupby_cls"" xarray/core/common.py:761: error: ""DataWithCoords"" has no attribute ""_groupby_cls"" xarray/core/common.py:866: error: ""DataWithCoords"" has no attribute ""_rolling_cls""; maybe ""_rolling_exp_cls""? xarray/core/common.py:977: error: ""DataWithCoords"" has no attribute ""_coarsen_cls"" xarray/core/common.py:1108: error: ""DataWithCoords"" has no attribute ""dims"" xarray/core/common.py:1109: error: ""DataWithCoords"" has no attribute ""dims"" xarray/core/common.py:1133: error: ""DataWithCoords"" has no attribute ""indexes"" xarray/core/common.py:1144: error: ""DataWithCoords"" has no attribute ""_resample_cls""; maybe ""resample""? 
xarray/core/common.py:1261: error: ""DataWithCoords"" has no attribute ""isel"" xarray/core/alignment.py:278: error: ""DataAlignable"" has no attribute ""copy"" xarray/core/alignment.py:283: error: ""DataAlignable"" has no attribute ""dims"" xarray/core/alignment.py:286: error: ""DataAlignable"" has no attribute ""indexes"" xarray/core/alignment.py:288: error: ""DataAlignable"" has no attribute ""sizes"" xarray/core/alignment.py:348: error: ""DataAlignable"" has no attribute ""dims"" xarray/core/alignment.py:351: error: ""DataAlignable"" has no attribute ""copy"" xarray/core/alignment.py:353: error: ""DataAlignable"" has no attribute ""reindex"" xarray/core/alignment.py:356: error: ""DataAlignable"" has no attribute ""encoding"" xarray/core/weighted.py:157: error: ""DataArray"" has no attribute ""notnull"" xarray/core/dataset.py:3792: error: ""Dataset"" has no attribute ""virtual_variables"" xarray/core/dataset.py:6135: error: ""DataArray"" has no attribute ""isnull"" ```
Edit: one problem is certainly the method injection, as mypy cannot detect those types.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4601/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1371397741,I_kwDOAMm_X85Rvd5t,7027,"don't apply `weighted`, `groupby`, etc. to `DataArray` without `dims`?",10194086,open,0,,,1,2022-09-13T12:44:34Z,2023-08-26T19:13:39Z,,MEMBER,,,,"### What is your issue? Applying e.g. `ds.weighted(weights).mean()` applies the operation over all `DataArray` objects - even if they don't have the dimensions over which it is applied (or is a scalar variable). I don't think this is wanted. ```python import xarray as xr air = xr.tutorial.open_dataset(""air_temperature"") air.attrs = {} # add variable without dims air[""foo""] = 5 print(""resample"") print(air.resample(time=""MS"").mean(dim=""time"").foo.dims) print(""groupby"") print(air.groupby(""time.year"").mean(dim=""time"").foo.dims) print(""weighted"") print(air.weighted(weights=air.time.dt.year).mean(""lat"").foo.dims) print(""where"") print(air.where(air.air > 5).foo.dims) ``` Results ``` resample ('time',) groupby ('year',) weighted ('time',) ``` Related #6952 - I am sure there are other issues, but couldn't find them quickly... `rolling` and `coarsen` don't seem to do this. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7027/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1719805837,I_kwDOAMm_X85mgieN,7860,diff of cftime.Datetime,10194086,open,0,,,3,2023-05-22T14:21:06Z,2023-08-04T12:01:33Z,,MEMBER,,,,"### What happened? A cftime variable returns a timedelta64[ns] when calling `diff` / `+` / `-` and it can then not be added/ subtracted from the original data. ### What did you expect to happen? We can `add` cftime timedeltas. 
### Minimal Complete Verifiable Example ```Python import xarray as xr air = xr.tutorial.open_dataset(""air_temperature"", use_cftime=True) air.time + air.time.diff(""time"") / 2 ``` ### MVCE confirmation - [x] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [x] Complete example — the example is self-contained, including all data and the text of any traceback. - [x] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [x] New issue — a search of GitHub Issues suggests this is not a duplicate. ### Relevant log output ```Python air.time.variable.values[1:] - air.time.variable.values[:-1] ``` returns `array([datetime.timedelta(seconds=21600), ...])` but then ```Python xr.Variable((""time"",), np.array([datetime.timedelta(0)])) ``` returns a `dtype='timedelta64[ns]'` array. ### Anything else we need to know? - See upstream PR: xarray-contrib/cf-xarray#441 - Similar to #7381 (but I don't think it's the same issue, feel free to close if you disagree) - That might need a special data type for timedeltas of cftime.Datetime objects, or allowing to add `'timedelta64[ns]'` to cftime.Datetime objects - The casting comes from https://github.com/pydata/xarray/blob/d8ec3a3f6b02a8b941b484b3d254537af84b5fde/xarray/core/variable.py#L366 https://github.com/pydata/xarray/blob/d8ec3a3f6b02a8b941b484b3d254537af84b5fde/xarray/core/variable.py#L272 ### Environment
INSTALLED VERSIONS ------------------ commit: d8ec3a3f6b02a8b941b484b3d254537af84b5fde python: 3.10.9 | packaged by conda-forge | (main, Feb 2 2023, 20:20:04) [GCC 11.3.0] python-bits: 64 OS: Linux OS-release: 5.14.21-150400.24.63-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: ('en_US', 'UTF-8') libhdf5: 1.12.2 libnetcdf: 4.9.1 xarray: 2023.2.1.dev20+g06a87062 pandas: 1.5.3 numpy: 1.23.5 scipy: 1.10.1 netCDF4: 1.6.2 pydap: installed h5netcdf: 1.1.0 h5py: 3.8.0 Nio: None zarr: 2.13.6 cftime: 1.6.2 nc_time_axis: 1.4.1 PseudoNetCDF: 3.2.2 iris: 3.4.1 bottleneck: 1.3.6 dask: 2023.2.1 distributed: 2023.2.1 matplotlib: 3.7.0 cartopy: 0.21.1 seaborn: 0.12.2 numbagg: 0.2.2 fsspec: 2023.1.0 cupy: None pint: 0.20.1 sparse: 0.14.0 flox: 0.6.8 numpy_groupies: 0.9.20 setuptools: 67.4.0 pip: 23.0.1 conda: None pytest: 7.2.1 mypy: None IPython: 8.11.0 sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7860/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 594669577,MDU6SXNzdWU1OTQ2Njk1Nzc=,3937,"compose weighted with groupby, coarsen, resample, rolling etc.",10194086,open,0,,,7,2020-04-05T22:00:40Z,2023-07-27T18:10:10Z,,MEMBER,,,,"It would be nice to make `weighted` work with `groupby` - e.g. [#3935 (comment)](https://github.com/pydata/xarray/pull/3935/files#r403742055) However, it is not entirely clear to me how that should be done. One way would be to do: ```python da.groupby(...).weighted(weights).mean() ``` this would require that the `groupby` operation is applied over the `weights` (how would this be done?) Or should it be ```python da.weighted(weights).groupby(...).mean() ``` but this seems less intuitive to me. Or ```python da.groupby(..., weights=weights).mean() ``` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3937/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1094725752,I_kwDOAMm_X85BQDB4,6142,dimensions: type as `str | Iterable[Hashable]`?,10194086,open,0,,,14,2022-01-05T20:39:00Z,2022-06-26T11:57:40Z,,MEMBER,,,,"### What happened? We generally type dimensions as: ```python dims: Hashable | Iterable[Hashable] ``` However, this is in conflict with passing a tuple of independent dimensions to a method - e.g. `da.mean((""x"", ""y""))` because a tuple is also hashable. Also mypy requires an `isinstance(dims, Hashable)` check when typing a function. We use an `isinstance(dims, str)` check in many places to wrap a single dimension in a list. Changing this to `isinstance(dims, Hashable)` will change the behavior for tuples. ### What did you expect to happen? 
In the community call today we discussed changing this to ```python dims: str | Iterable[Hashable] ``` i.e. if a single dim is passed it has to be a string and wrapping it in a list is a convenience function. Special use cases with `Hashable` types should be wrapped in an `Iterable` by the user. This probably best reflects the current state of the repo (`dims = [dims] if isinstance(dims, str) else dims`). The disadvantage could be that it is a bit more difficult to explain in the docstrings? @shoyer - did I get this right from the discussion? --- Other options 1. Require `str` as dimension names. This could be too restrictive. @keewis mentioned that tuple dimension names are already used somewhere in the xarray repo. Also we discussed in another issue or PR (which I cannot find right now) that we want to keep allowing `Hashable`. 2. Disallow passing tuples (only allow tuples if a dimension is a tuple), require lists to pass several dimensions. This is too restrictive in the other direction and will probably lead to a lot of downstream troubles. Naming a single dimension with a tuple will be a very rare case, in contrast to passing several dimension names as a tuple. 3. Special case tuples. We could potentially check if `dims` is a tuple and if there are any dimension names consisting of a tuple. Seems more complicated and potentially brittle for probably small gains (IMO). ### Minimal Complete Verifiable Example _No response_ ### Relevant log output _No response_ ### Anything else we need to know? * We need to check carefully where general `Hashable` are really allowed. E.g. 
`dims` of a `DataArray` are typed as https://github.com/pydata/xarray/blob/e056cacdca55cc9d9118c830ca622ea965ebcdef/xarray/core/dataarray.py#L380 but tuples are not actually allowed: ```python import xarray as xr xr.DataArray([1], dims=(""x"", ""y"")) # ValueError: different number of dimensions on data and dims: 1 vs 2 xr.DataArray([1], dims=[(""x"", ""y"")]) # TypeError: dimension ('x', 'y') is not a string ``` * We need to be careful typing functions where only one dim is allowed, e.g. `xr.concat`, which should probably set `dim: Hashable` (and make sure it works). * Do you have examples for other real-world hashable types except for `str` and `tuple`? (Would be good for testing purposes). ### Environment N/A","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6142/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 685739084,MDU6SXNzdWU2ODU3MzkwODQ=,4375,allow using non-dimension coordinates in polyfit,10194086,open,0,,,1,2020-08-25T19:40:55Z,2022-04-09T02:58:48Z,,MEMBER,,,," `polyfit` currently only allows fitting along a dimension and not along a non-dimension coordinate (or a virtual coordinate) Example: ```python da = xr.DataArray( [1, 3, 2], dims=[""x""], coords=dict(x=[""a"", ""b"", ""c""], y=(""x"", [0, 1, 2])) ) print(da) da.polyfit(""y"", 1) ``` Output: ```python array([1, 3, 2]) Coordinates: * x (x) <U1 'a' 'b' 'c' y (x) int64 0 1 2 ``` ```python-traceback KeyError Traceback (most recent call last) <ipython-input> in <module> 5 print(da) 6 ----> 7 da.polyfit(""y"", 1) ~/.conda/envs/ipcc_ar6/lib/python3.7/site-packages/xarray/core/dataarray.py in polyfit(self, dim, deg, skipna, rcond, w, full, cov) 3507 """""" 3508 return self._to_temp_dataset().polyfit( -> 3509 dim, deg, skipna=skipna, rcond=rcond, w=w, full=full, cov=cov 3510 ) 3511 ~/.conda/envs/ipcc_ar6/lib/python3.7/site-packages/xarray/core/dataset.py in polyfit(self, dim, deg, skipna, rcond, w, full, cov) 6005 skipna_da = skipna 6006 -> 6007 x = get_clean_interp_index(self, dim, strict=False) 6008 
xname = ""{}_"".format(self[dim].name) 6009 order = int(deg) + 1 ~/.conda/envs/ipcc_ar6/lib/python3.7/site-packages/xarray/core/missing.py in get_clean_interp_index(arr, dim, use_coordinate, strict) 246 247 if use_coordinate is True: --> 248 index = arr.get_index(dim) 249 250 else: # string ~/.conda/envs/ipcc_ar6/lib/python3.7/site-packages/xarray/core/common.py in get_index(self, key) 378 """""" 379 if key not in self.dims: --> 380 raise KeyError(key) 381 382 try: KeyError: 'y' ``` **Describe the solution you'd like** Would be nice if that worked. **Describe alternatives you've considered** One could just set the non-dimension coordinate as index, e.g.: `da = da.set_index(x=""y"")` **Additional context** Allowing this *may* be as easy as replacing https://github.com/pydata/xarray/blob/9c85dd5f792805bea319f01f08ee51b83bde0f3b/xarray/core/missing.py#L248 by ``` index = arr[dim] ``` but I might be missing something. Or probably a `use_coordinate` must be threaded through to `get_clean_interp_index` (although I am a bit confused by this argument). ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4375/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 310833761,MDU6SXNzdWUzMTA4MzM3NjE=,2037,to_netcdf -> _fill_value without NaN,10194086,open,0,,,8,2018-04-03T13:20:19Z,2022-03-10T10:59:17Z,,MEMBER,,,,"#### Code Sample, a copy-pastable example if possible ```python # Your code here import xarray as xr import numpy as np x = np.arange(10.) da = xr.Dataset(data_vars=dict(data=('dim1', x)), coords=dict(dim1=('dim1', x))) da.to_netcdf('tst.nc') ``` #### Problem description Apologies if this was discussed somwhere and it probably does not matter much, but `tst.nc` has `_FillValue` although it is not really necessary. #### Output of ``xr.show_versions()``
# Paste the output here xr.show_versions() here
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2037/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 1150251120,I_kwDOAMm_X85Ej3Bw,6304,add join argument to xr.broadcast?,10194086,open,0,,,1,2022-02-25T09:52:14Z,2022-02-25T21:50:16Z,,MEMBER,,,,"### Is your feature request related to a problem? `xr.broadcast` always does an outer join: https://github.com/pydata/xarray/blob/de965f342e1c9c5de92ab135fbc4062e21e72453/xarray/core/alignment.py#L702 https://github.com/pydata/xarray/blob/de965f342e1c9c5de92ab135fbc4062e21e72453/xarray/core/alignment.py#L768 This is not how the (default) broadcasting (arithmetic join) works, e.g. the following first does an inner join and then broadcasts: ```python import xarray as xr da1 = xr.DataArray([[0, 1, 2]], dims=(""y"", ""x""), coords={""x"": [0, 1, 2]}) da2 = xr.DataArray([0, 1, 2, 3, 4], dims=""x"", coords={""x"": [0, 1, 2, 3, 4]}) da1 + da2 ``` ``` array([[0, 2, 4]]) Coordinates: * x (x) int64 0 1 2 Dimensions without coordinates: y ``` ### Describe the solution you'd like Add a `join` argument to `xr.broadcast`. I would propose to leave the default as is ```python def broadcast(*args, exclude=None, join=""outer""): args = align(*args, join=join, copy=False, exclude=exclude) ``` ### Describe alternatives you've considered - We could make `broadcast` respect `options -> arithmetic_join` but that would be a breaking change and I am not sure how the deprecation should/ would be handled... - We could leave it as is. ### Additional context - `xr.broadcast` should not be used often because this is should happen automatically in most cases - in #6059 I use `broadcast` because I couldn't get it to work otherwise (maybe there is a better way?). However, the ""outer elements"" are immediately discarded again - so it's kind of pointless to do an outer join. 
```python import numpy as np import xarray as xr da = xr.DataArray(np.arange(6).reshape(3, 2), coords={""dim_0"": [0, 1, 2]}) w = xr.DataArray([1, 1, 1, 1, 1, 1], coords={""dim_0"": [0, 1, 2, 4, 5, 6]}) da.weighted(w).quantile(0.5) ```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6304/reactions"", ""total_count"": 4, ""+1"": 4, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 307783090,MDU6SXNzdWUzMDc3ODMwOTA=,2007,rolling: allow control over padding,10194086,open,0,,,20,2018-03-22T19:27:07Z,2021-07-14T19:10:47Z,,MEMBER,,,,"#### Code Sample, a copy-pastable example if possible ```python import numpy as np import xarray as xr x = np.arange(1, 366) y = np.random.randn(365) ds = xr.DataArray(y, dims=dict(dayofyear=x)) ds.rolling(center=True, dayofyear=31).mean() ``` #### Problem description `rolling` cannot directly handle periodic boundary conditions (lon, dayofyear, ...), but could be very helpful to e.g. calculate climate indices. Also I cannot really think of an easy way to append the first elements to the end of the dataset and then calculate rolling. Is there a way to do this? Should xarray support this feature? This might also belong to SO... ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2007/reactions"", ""total_count"": 3, ""+1"": 3, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 788534915,MDU6SXNzdWU3ODg1MzQ5MTU=,4824,combine_by_coords can succeed when it shouldn't,10194086,open,0,,,15,2021-01-18T20:39:29Z,2021-07-08T17:44:38Z,,MEMBER,,,,"**What happened**: `combine_by_coords` can succeed when it should not - depending on the name of the dimensions (which determines the order of operations in `combine_by_coords`). **What you expected to happen**: * I think it should throw an error in both cases. 
**Minimal Complete Verifiable Example**: ```python import numpy as np import xarray as xr data = np.arange(5).reshape(1, 5) x = np.arange(5) x_name = ""lat"" da0 = xr.DataArray(data, dims=(""t"", x_name), coords={""t"": [1], x_name: x}).to_dataset(name=""a"") x = x + 1e-6 da1 = xr.DataArray(data, dims=(""t"", x_name), coords={""t"": [2], x_name: x}).to_dataset(name=""a"") ds = xr.combine_by_coords((da0, da1)) ds ``` returns: ```python Dimensions: (lat: 10, t: 2) Coordinates: * lat (lat) float64 0.0 1e-06 1.0 1.0 2.0 2.0 3.0 3.0 4.0 4.0 * t (t) int64 1 2 Data variables: a (t, lat) float64 0.0 nan 1.0 nan 2.0 nan ... 2.0 nan 3.0 nan 4.0 ``` Thus lat is interlaced - I don't think `combine_by_coords` should do this. If you set ```python x_name = ""x"" ``` and run the example again, it returns: ```python-traceback ValueError: Resulting object does not have monotonic global indexes along dimension x ``` **Anything else we need to know?**: * this is vaguely related to #4077 but I think it is separate * `combine_by_coords` concatenates over all dimensions where the coords are different - therefore `compat=""override""` doesn't actually do anything? Or does it? https://github.com/pydata/xarray/blob/ba42c08af9afbd9e79d47bda404bf4a92a7314a0/xarray/core/combine.py#L69 cc @dcherian @TomNicholas **Environment**:
Output of xr.show_versions()
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4824/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 773750763,MDU6SXNzdWU3NzM3NTA3NjM=,4727,xr.testing.assert_equal does not test for dtype,10194086,open,0,,,5,2020-12-23T13:14:41Z,2021-07-04T04:08:51Z,,MEMBER,,,,"In #4622 @toddrjen points out that `xr.testing.assert_equal` does not test for the `dtype`, only for the value. Therefore the following does not raise an error: ```python import numpy as np import xarray as xr import pandas as pd xr.testing.assert_equal( xr.DataArray(np.array(1, dtype=int)), xr.DataArray(np.array(1, dtype=float)) ) xr.testing.assert_equal( xr.DataArray(np.array(1, dtype=int)), xr.DataArray(np.array(1, dtype=object)) ) xr.testing.assert_equal( xr.DataArray(np.array(""a"", dtype=str)), xr.DataArray(np.array(""a"", dtype=object)) ) ``` This comes back to numpy, i.e. the following is True: ```python np.array(1, dtype=int) == np.array(1, dtype=float) ``` Depending on the situation one or the other is desirable or not. Thus, I would suggest to add a `check_dtype` argument to `xr.testing.assert_equal` and also to `DataArray.equals` (and `Dataset` and `Variable` and `identical`). I have not seen such an option in numpy, but pandas has it (e.g. `pd.testing.assert_series_equal(left, right, check_dtype=True, ...) `. I would _not_ change `__eq__`. * Thoughts? * What should the default be? We could try `True` first and see how many failures this creates? * What to do with coords and indexes? `pd.testing.assert_series_equal` has a `check_index_type` keyword. Probably we need `check_coords_type` as well? This makes the whole thing much more complicated... 
Also #4543 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4727/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 559217441,MDU6SXNzdWU1NTkyMTc0NDE=,3744,Contour with vmin/ vmax differs from matplotlib,10194086,open,0,,,0,2020-02-03T17:11:24Z,2021-07-04T02:03:02Z,,MEMBER,,,,"#### MCVE Code Sample ```python import numpy as np import xarray as xr import matplotlib as mpl import matplotlib.pyplot as plt data = xr.DataArray(np.arange(24).reshape(4, 6)) data.plot.contour(vmax=10, add_colorbar=True) ``` ![contour_xarray](https://user-images.githubusercontent.com/10194086/73672575-6d952980-46ad-11ea-8b3f-c78967c4776e.png) #### Expected Output ```python h = plt.contour(data.values, vmax=10) plt.colorbar(h) ``` ![contour_matplotlib](https://user-images.githubusercontent.com/10194086/73672722-adf4a780-46ad-11ea-8a44-4f5660ffe95c.png) #### Problem Description A `contour(vmax=vmax)` plot differs between xarray and matplotlib. I *think* the problem is here: https://github.com/pydata/xarray/blob/95e4f6c7a636878c94b892ee8d49866823d0748f/xarray/plot/utils.py#L265 xarray calculates the levels from `vmax` while matplotlib (probably) calculates the `levels` from `data.max()` and uses `vmax` only for the `norm`. For `contourf` and `pcolormesh` this is not so relevant as the capped values are then drawn with the `over` color. However, there may also be a good reason for this behavior. #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: 4c96d53e6caa78d56b785f4edee49bbd4037a82f python: 3.7.6 | packaged by conda-forge | (default, Jan 7 2020, 22:33:48) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 4.12.14-lp151.28.36-default machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_GB.UTF-8 LOCALE: en_US.UTF-8 libhdf5: 1.10.5 libnetcdf: 4.6.2 xarray: 999 (master) pandas: 0.25.3 numpy: 1.17.3 scipy: 1.4.1 netCDF4: 1.5.1.2 pydap: installed h5netcdf: 0.7.4 h5py: 2.10.0 Nio: 1.5.5 zarr: 2.4.0 cftime: 1.0.4.2 nc_time_axis: 1.2.0 PseudoNetCDF: installed rasterio: 1.1.0 cfgrib: 0.9.7.6 iris: 2.2.0 bottleneck: 1.3.1 dask: 2.9.2 distributed: 2.9.2 matplotlib: 3.1.2 cartopy: 0.17.0 seaborn: 0.9.0 numbagg: installed setuptools: 45.0.0.post20200113 pip: 19.3.1 conda: None pytest: 5.3.3 IPython: 7.11.1 sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3744/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 587048587,MDU6SXNzdWU1ODcwNDg1ODc=,3883,weighted operations: performance optimisations,10194086,open,0,,,3,2020-03-24T15:31:54Z,2021-07-04T02:01:28Z,,MEMBER,,,,"There was a discussion on the performance of the weighted mean/ sum in terms of memory footprint but also speed, and there may indeed be some things that can be optimized. See the [posts at the end of the PR](https://github.com/pydata/xarray/pull/2922#issuecomment-601496897). However, the optimal implementation will probably depend on the use case and some profiling will be required. I'll just open an issue to keep track of this. @seth-p ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3883/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 806218687,MDU6SXNzdWU4MDYyMTg2ODc=,4892,disallow boolean coordinates?,10194086,open,0,,,2,2021-02-11T09:33:17Z,2021-03-31T10:30:49Z,,MEMBER,,,,"Today I stumbled over a small pitfall, which I think could be avoided: I am working with arrays that have axes labeled with categorical values and I ended up using True/False as labels for some binary categories: ```python test = xarray.DataArray( numpy.ones((3,2)), dims=[""binary"",""ternary""], coords={""ternary"":[3,7,9],""binary"":[False,True]} ) ``` now came the big surprise, when I wanted to reduce over selections of the data: ``` test.sel(ternary=[9,3,7]) # does exactly what I expect and gives me the correctly permuted 3x2 array test.sel(binary=[True,False]) # does not do what I expect ``` Instead of using the coordinate values like with the ternary category, it uses the list as boolean mask and hence I get a 3x1 array at the binary=False coordinate. 
I assume that this behavior is reasonable in most cases, and I will certainly stop using bools as binary category labels. That said, in the above case the conceptually identical call results in a completely different outcome. My (radical) proposal would be: forbid boolean coordinates in general to avoid such confusion. Curious about your thoughts! HTH, Martin _Originally posted by @martinitus in https://github.com/pydata/xarray/discussions/4861_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4892/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 683777199,MDU6SXNzdWU2ODM3NzcxOTk=,4364,plt.pcolormesh will infer interval breaks by default,10194086,open,0,,,3,2020-08-21T19:15:57Z,2021-03-19T14:09:52Z,,MEMBER,,,,"Looking at some warnings in #3266 I saw that matplotlib will deprecate the old behaviour of `pcolormesh` when the shape of the data and the coordinates are equal (it silently cut a row and a column of the data). With the new behaviour the coordinates are interpolated. ```python import numpy as np import matplotlib.pyplot as plt x = np.array([1, 2, 3]) y = np.array([1, 2, 3, 4, 5]) data = np.random.randn(*y.shape + x.shape) f, axes = plt.subplots(1, 2) for ax, shading, behavior in zip(axes, [""flat"", ""nearest""], [""old"", ""new""]): ax.pcolormesh(x, y, data, shading=shading, vmin=-0.75, vmax=0.75) ax.set_title(f""{behavior}: shading='{shading}'"") ``` ![shading_old_new](https://user-images.githubusercontent.com/10194086/90925968-71429080-e3f2-11ea-8ad5-4249b94faf96.png) This is a good thing in general - we have already done this for a long time with the `infer_intervals` keyword. Unfortunately, matplotlib does not check whether the data is monotonic (matplotlib/matplotlib#18317), which can lead to problems for maps (scitools/cartopy#1638). I don't think there is a need to do something right now - let's see what they think upstream.
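For reference, the midpoint interpolation behind the `infer_intervals` keyword can be sketched in plain numpy (a simplified illustration only; the function name `interval_breaks` is made up here and this is not xarray's actual implementation, which has to deal with more edge cases):

```python
import numpy as np

def interval_breaks(coord):
    # midpoints between neighbouring cell centers ...
    mid = 0.5 * (coord[:-1] + coord[1:])
    # ... plus linearly extrapolated outer edges
    first = coord[0] - 0.5 * (coord[1] - coord[0])
    last = coord[-1] + 0.5 * (coord[-1] - coord[-2])
    return np.concatenate([[first], mid, [last]])

# n cell centers become n + 1 cell edges
print(interval_breaks(np.array([1.0, 2.0, 3.0])))  # [0.5 1.5 2.5 3.5]
```

Note that this naive version happily produces edges for non-monotonic input, which is exactly why the missing monotonicity check upstream can bite for maps.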
This change was introduced in mpl 3.3.0 ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4364/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 802992417,MDU6SXNzdWU4MDI5OTI0MTc=,4875,assigning values with incompatible dtype,10194086,open,0,,,0,2021-02-07T16:28:24Z,2021-02-07T16:28:24Z,,MEMBER,,,,"The behavior of xarray when assigning values with incompatible dtypes is a bit arbitrary. This is partly due to the behavior of numpy... numpy 1.20 got a bit cleverer but still seems inconsistent at times. I am not sure what to do about this (or whether we should actually try to be clever here). 1. Direct assignment (dupe of #4612) ```python import xarray as xr import numpy as np arr = np.array([2]) arr[0] = np.nan # ValueError (since numpy 1.20) arr[0:1] = np.array([np.nan]) # -> array([-9223372036854775808]) da = xr.DataArray([5], dims=""x"") da[0] = np.nan # array([-9223372036854775808]) # Dimensions without coordinates: x # (because this gets converted to da.variable._data[0:1] = np.array([np.nan]), approximately) da[0] = 1.2345 # casts 1.2345 to int ``` 2. Via a numpy function (`pad`, `shift`, `rolling`) **pad** ```python da.pad(x=1, constant_values=np.nan) # ValueError: cannot convert float NaN to integer da.pad(x=1, constant_values=None) # casts da to float da.pad(x=1, constant_values=1.5) # casts constant_values to int ```
**shift** ```python da.shift(x=1, fill_value=np.nan) # ValueError: cannot convert float NaN to integer # da.shift(x=1, fill_value=None) # None not allowed by shift da.shift(x=1, fill_value=1.5) # casts fill_value to int ``` **rolling** ```python da.rolling(x=1).construct(""new_axis"", stride=3, fill_value=np.nan) # ValueError: cannot convert float NaN to integer # da.rolling(x=1).construct(""new_axis"", stride=3, fill_value=None) # None not allowed by rolling da.rolling(x=3).construct(""new_axis"", stride=3, fill_value=1.5) # casts fill_value to int ```
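A lot of this traces back to numpy's casting rules. A numpy-only sketch of checking castability up front and promoting the dtype before assigning NaN (an illustration of one possible approach, not what xarray currently does internally):

```python
import numpy as np

arr = np.array([5])  # integer dtype

# NaN cannot be safely cast to an integer dtype
print(np.can_cast(np.float64, arr.dtype))  # False

# one option: promote to the common dtype before assigning
arr = arr.astype(np.result_type(arr.dtype, np.float64))
arr[0] = np.nan

print(arr.dtype)          # float64
print(np.isnan(arr[0]))   # True
```

Whether such an implicit upcast is desirable is of course part of the question here - it changes the dtype of the array behind the user's back.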
--- To check: * What does dask do in these cases? * What does pandas do? * What about `str` dtypes? ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4875/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue