id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1732874789,I_kwDOAMm_X85nSZIl,7885,drop_indexes is reversed by assign_coords of unrelated coord,4502,closed,0,,,2,2023-05-30T19:35:48Z,2023-08-29T14:23:31Z,2023-08-29T14:23:31Z,NONE,,,,"### What happened?
I dropped an index on one coord, then later called assign_coords to change another, unrelated coord. The index on the original coord was silently created again.
### What did you expect to happen?
I expected the index on the original coord to stay dropped.
### Minimal Complete Verifiable Example
```Python
import xarray
import numpy as np
ds = xarray.Dataset(
    {'foo': (('x', 'y'), np.ones((3, 5)))},
    coords={'x': [1, 2, 3], 'y': [4, 5, 6, 7, 8]})
ds = ds.drop_indexes('x')
assert 'x' not in ds.indexes
ds = ds.assign_coords(y=ds.y+1)
assert 'x' not in ds.indexes # Fails
```
### MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
### Relevant log output
_No response_
### Anything else we need to know?
In general it would be nice if xarray made it easier to avoid indexes being automatically created.
E.g. right now, as far as I can tell, there's no way to avoid an index being created when you construct a DataArray or Dataset with a coordinate of the same name as a dimension.
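For example (a minimal demonstration of the construction-time behaviour just described):
```python
import numpy as np
import xarray

da = xarray.DataArray(np.ones(3), dims=('x',), coords={'x': [1, 2, 3]})
assert 'x' in da.indexes  # created unconditionally; no option to opt out
```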
Admittedly I have a slightly niche use case: I'm using xarray with wrapped JAX arrays, which can't be converted into pandas indexes. Indexes being (re-)created in these cases isn't just an inconvenience; it actually causes a crash.
### Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.10.9 (main, Dec 7 2022, 13:47:07) [GCC 12.2.0]
python-bits: 64
OS: Linux
OS-release: 6.1.20-2rodete1-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.10.8
libnetcdf: 4.9.0
xarray: 999
pandas: 1.5.3
numpy: 1.24.2
scipy: 1.10.0
netCDF4: 1.6.2
pydap: None
h5netcdf: 1.1.0
h5py: 3.7.0
Nio: None
zarr: 2.13.6+ds
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.4
cfgrib: 0.9.10.3
iris: None
bottleneck: 1.3.5
dask: None
distributed: None
matplotlib: 3.6.3
cartopy: None
seaborn: None
numbagg: None
fsspec: 2022.11.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.6.3
pip3: None
conda: None
pytest: 7.2.1
mypy: None
IPython: 8.5.0
sphinx: 5.3.0
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7885/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1188262115,I_kwDOAMm_X85G03Dj,6429,FacetGrid padding goes very bad when cartopy projection specified,4502,closed,0,,,2,2022-03-31T15:26:26Z,2023-04-28T13:06:14Z,2023-04-28T13:06:14Z,NONE,,,,"### What happened?
When doing a faceted plot and specifying a projection (via e.g. `subplot_kws=dict(projection=ccrs.PlateCarree())`), the padding becomes very weird (often unusable) and strangely unstable under changes to `size` and `aspect`. For example, this produces very bad results:
```python
data = xarray.DataArray(
    dims=('lat', 'lon', 'row', 'col'),
    data=np.ones((180, 360, 2, 2)),
    coords={'lon': np.arange(360), 'lat': np.arange(-90, 90)},
)
xarray.plot.pcolormesh(
    data,
    row='row',
    col='col',
    size=5,
    aspect=1.5,
    subplot_kws=dict(projection=ccrs.PlateCarree()),
)
```

whereas if you change `size` from 5 to 4, you suddenly get a much better (although still not quite right) layout:

### What did you expect to happen?
I expected a layout closer to what you get if you comment out `subplot_kws=dict(projection=ccrs.PlateCarree()),` above:

### Minimal Complete Verifiable Example
```Python
import xarray
import cartopy.crs as ccrs
import numpy as np
data = xarray.DataArray(
    dims=('lat', 'lon', 'row', 'col'),
    data=np.ones((180, 360, 2, 2)),
    coords={'lon': np.arange(360), 'lat': np.arange(-90, 90)},
)
xarray.plot.pcolormesh(
    data,
    row='row',
    col='col',
    size=5,
    aspect=1.5,
    subplot_kws=dict(projection=ccrs.PlateCarree()),
)
```
### Relevant log output
In the case where the layout isn't (as) broken, I see a warning:
`.../xarray/plot/facetgrid.py:394: UserWarning: Tight layout not applied. tight_layout cannot make axes width small enough to accommodate all axes decorations`
It seems that when tight_layout does manage to get applied, things go badly wrong.
Perhaps, at a minimum, we could have a way to disable tight_layout (which currently seems to be mandatory)?
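In the meantime, a rough workaround sketch that avoids FacetGrid (and hence its tight_layout call) entirely, building the grid directly with matplotlib and reusing `data`, `np` and `ccrs` from the example above:
```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(
    2, 2, figsize=(15, 10),
    subplot_kw=dict(projection=ccrs.PlateCarree()),
)
for (r, c), ax in np.ndenumerate(axes):
    data.isel(row=r, col=c).plot.pcolormesh(
        ax=ax, transform=ccrs.PlateCarree(), add_colorbar=False)
```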
### Anything else we need to know?
_No response_
### Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.10 (stable, redacted, redacted)
[Clang google3-trunk (e5b1b9edb8b6f6cd926c2ba3e1ad1b6f767021d6)]
python-bits: 64
OS: Linux
OS-release: 4.15.0-smp-920.39.0.0
machine: x86_64
processor:
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: 4.6.1
xarray: 0.18.2
pandas: 1.1.5
numpy: 1.21.5
scipy: 1.2.1
netCDF4: 1.4.1
pydap: None
h5netcdf: 0.11.0
h5py: 3.2.1
Nio: None
zarr: 2.7.0
cftime: 1.0.0
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.3.4
cartopy: 0+unknown
seaborn: 0.11.1
numbagg: None
pint: None
setuptools: unknown
pip: None
conda: None
pytest: None
IPython: 3.2.3
sphinx: None","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6429/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1415592761,I_kwDOAMm_X85UYDs5,7189,"combine_by_coords allows one overlapping coordinate value, but not more than one",4502,open,0,,,1,2022-10-19T21:13:28Z,2022-10-20T16:40:57Z,,NONE,,,,"### What happened?
combine_by_coords fails to reject the overlap between coordinates `[0, 1]` and `[1, 2]`, despite rejecting cases where the overlap is bigger (e.g. `[0, 1, 2]` and `[1, 2, 3]`). Instead it returns a DataArray with duplicate coordinates. (See example below.)
### What did you expect to happen?
I expected combine_by_coords to explicitly reject all cases where the coordinates overlap, producing a ValueError in cases like the following with coords `[0, 1]` and `[1, 2]`:
```python
a = xarray.DataArray(dims=('x',), data=np.ones((2,)), coords={'x': [0, 1]})
b = xarray.DataArray(dims=('x',), data=np.ones((2,)), coords={'x': [1, 2]})
xarray.combine_by_coords([a, b])
```
as well as cases with larger overlaps, e.g. `[0, 1, 2]` and `[1, 2, 3]`.
### Minimal Complete Verifiable Example
```Python
import numpy as np
import xarray

# This overlap is caught, as expected:
a = xarray.DataArray(dims=('x',), data=np.ones((3,)), coords={'x': [0, 1, 2]})
b = xarray.DataArray(dims=('x',), data=np.ones((3,)), coords={'x': [1, 2, 3]})
xarray.combine_by_coords([a, b])
# => ValueError: Resulting object does not have monotonic global indexes along dimension x

# This overlap is not caught:
a = xarray.DataArray(dims=('x',), data=np.ones((2,)), coords={'x': [0, 1]})
b = xarray.DataArray(dims=('x',), data=np.ones((2,)), coords={'x': [1, 2]})
xarray.combine_by_coords([a, b])
# =>
# <xarray.DataArray (x: 4)>
# array([1., 1., 1., 1.])
# Coordinates:
#   * x        (x) int64 0 1 1 2
```
### MVCE confirmation
- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
### Relevant log output
_No response_
### Anything else we need to know?
As far as I can tell this happens because the check `indexes.is_monotonic_increasing or indexes.is_monotonic_decreasing` is non-strict: pandas' monotonicity properties allow consecutive values to be equal.
I assume it wasn't intentional to allow overlaps like this. If so, do you think anyone is depending on this (I'd hope not...) and would you take a PR to fix it to produce a ValueError in this case?
If this behaviour is intentional or relied upon, could we have an option to do a strict check instead?
Also, for performance reasons, I'd propose adding some upfront checks to catch index overlap early (e.g. in `_infer_concat_order_from_coords`), rather than doing a potentially large concat and only detecting duplicates afterwards.
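For concreteness, a strict version of the check might look like the following sketch (`is_strictly_monotonic` is a name made up here, assuming `index` is a `pandas.Index`):
```python
import pandas as pd

def is_strictly_monotonic(index: pd.Index) -> bool:
    # pandas' is_monotonic_increasing / is_monotonic_decreasing are
    # non-strict (repeated values are allowed), so additionally require
    # uniqueness to rule out the single-value overlap shown above.
    return index.is_unique and (
        index.is_monotonic_increasing or index.is_monotonic_decreasing
    )

assert not is_strictly_monotonic(pd.Index([0, 1, 1, 2]))
```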
### Environment
INSTALLED VERSIONS
------------------
commit: None
python: 3.9.15 (stable, redacted, redacted)
[Clang google3-trunk (11897708c0229c92802e747564e7c34b722f045f)]
python-bits: 64
OS: Linux
OS-release: 5.18.16-1rodete1-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.1
libnetcdf: None
xarray: 2022.06.0
pandas: 1.1.5
numpy: 1.23.2
scipy: 1.8.1
netCDF4: None
pydap: None
h5netcdf: None
h5py: 3.2.1
Nio: None
zarr: 2.7.0
cftime: None
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: 3.3.4
cartopy: None
seaborn: 0.11.2
numbagg: None
fsspec: 0.7.4
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: None
pip: None
conda: None
pytest: None
IPython: 3.2.3
sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7189/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1266308714,I_kwDOAMm_X85LelZq,6680,Datatype for a 'shape specification' of a Dataset / DataArray,4502,open,0,,,2,2022-06-09T15:25:36Z,2022-07-13T18:17:06Z,,NONE,,,,"### Is your feature request related to a problem?
Often with xarray I find myself having to create a template Dataset or DataArray with dummy data in it just to specify the dimensions/sizes/coordinates/variable names that are required in some situation.
### Describe the solution you'd like
It would be very useful to have a datatype that represents a shape specification (dimensions, sizes and coordinates) independently of the data, so that we can do things like the following (a rough sketch of such a datatype follows the list):
* Implement xarray equivalents of functions like `np.ones`, `np.zeros` and `np.random.normal(size=...)` that are given a shape specification which the return value should conform to. (I have less trivial examples of this too: functions which currently need to be given templates for the return value but only depend on the shape of the template.)
* Test if two DataArrays / Datasets have the same shape
* Memoize or cache things based on shape (this implies the shape spec would need to be hashable)
* Make it easier to use xarray with libraries like tree / PyTree, which flatten a Dataset into its underlying arrays plus a specification of the structure that can be used to unflatten it again. (Right now I have to implement my own shape-specification objects to do this.)
* Manipulate shape specifications, e.g. by adding or removing dimensions, without having to manipulate dummy template data in slightly arbitrary ways (e.g. `template.isel(dim_to_be_dropped=0, drop=True)`).
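To make this concrete, here is a rough sketch of what such a datatype and one consumer might look like (`ShapeSpec`, `from_dataarray` and `ones` are all hypothetical names; this assumes 1-D coordinates):
```python
import numpy as np
import xarray

class ShapeSpec:
    # Hashable description of dims, sizes and coordinates, with no data.
    def __init__(self, sizes, coords=None):
        self.sizes = dict(sizes)
        self.coords = {k: tuple(v) for k, v in (coords or {}).items()}

    @classmethod
    def from_dataarray(cls, da):
        return cls(da.sizes, {k: v.values for k, v in da.coords.items()})

    def __eq__(self, other):
        return self.sizes == other.sizes and self.coords == other.coords

    def __hash__(self):  # enables memoizing / caching on shape
        return hash((tuple(sorted(self.sizes.items())),
                     tuple(sorted(self.coords.items()))))

def ones(spec):
    # xarray analogue of np.ones, driven by a spec instead of a template.
    dims = tuple(spec.sizes)
    return xarray.DataArray(
        np.ones([spec.sizes[d] for d in dims]),
        dims=dims,
        coords={k: list(v) for k, v in spec.coords.items()},
    )
```
With something like this, `ones(ShapeSpec.from_dataarray(template))` replaces the dummy-data dance, and specs can be compared, hashed and manipulated directly.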
### Describe alternatives you've considered
I realise that using lazy dask arrays largely removes the performance overhead of manipulating fake data, but (A) it still feels kinda ugly and adds boilerplate to construct the fake data, and (B) not everyone wants to depend on dask.
### Additional context
_No response_","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6680/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1074303184,I_kwDOAMm_X85ACJDQ,6053,A broadcasting sum for xarray.Dataset,4502,open,0,,,3,2021-12-08T11:24:21Z,2022-07-08T11:49:24Z,,NONE,,,,"I've found it useful to have a version of Dataset.sum which sums variables in a way that's consistent with what would happen if they were broadcast to the full Dataset dimensions.
The difference is in what it does with variables that don't contain some of the dimensions it's asked to sum over: a standard sum just ignores the summation over these dimensions for these variables, whereas a broadcasting sum multiplies the variable by the product of the sizes of the missing dimensions, like so:
```python
import numpy as np

def broadcast_sum(dataset, dims):
    def broadcast_sum_var(var):
        present_sum_dims = [dim for dim in dims if dim in var.dims]
        non_present_sum_dims = [dim for dim in dims if dim not in var.dims]
        # Multiply by the sizes of the summed-over dims the variable lacks,
        # as if it had first been broadcast to the full Dataset dimensions.
        return var.sum(present_sum_dims) * np.prod(
            [dataset.sizes[dim] for dim in non_present_sum_dims])
    return dataset.map(broadcast_sum_var)
```
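For example, a small sanity check of the definition (the sizes here are arbitrary):
```python
import numpy as np
import xarray

ds = xarray.Dataset({
    'a': (('x', 'y'), np.ones((2, 3))),
    'b': (('y',), np.ones(3)),
})
result = broadcast_sum(ds, ['x', 'y'])
# 'a' sums normally to 6.0; 'b' lacks 'x', so its sum (3.0) is multiplied
# by sizes['x'] == 2, also giving 6.0 -- matching a sum after broadcasting.
assert float(result['b']) == 6.0
```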
This is consistent with mathematical sum notation, where the sum doesn't become a no-op just because the summand doesn't reference the index being summed over. E.g.:
$\sum_{n=1}^N x = N x$
I've found it useful when you need to do some broadcasting operations across different variables after the sum, and you want the summation done in a way that's consistent with the broadcasting logic that will be applied later.
Would you be open to adding this, and if so, do you have a preference for how? (A separate method, or an option to `.sum`?)
1078656323,I_kwDOAMm_X85ASv1D,6075,Broadcasting doesn't respect scalar coordinates,4502,open,0,,,1,2021-12-13T15:18:22Z,2021-12-13T22:10:14Z,,NONE,,,,"Usually if I apply a broadcasting operation to two arrays, the result only includes values for coordinates present in both. A simple example:
```python
In [160]: data_array = xarray.DataArray(dims=(""x"",), data=[1,2,3], coords={""x"": [1,2,3]})
In [161]: data_array.sel(x=[2]) * data_array
Out[161]:
<xarray.DataArray (x: 1)>
array([4])
Coordinates:
  * x        (x) int64 2
```
However if I do the same thing but select a scalar value for the x coordinate (`.sel(x=2)`), yielding a scalar coordinate for x:
```python
In [164]: data_array.sel(x=2)
Out[164]:
<xarray.DataArray ()>
array(2)
Coordinates:
    x        int64 2
In [165]: data_array.sel(x=2) * data_array
Out[165]:
<xarray.DataArray (x: 3)>
array([2, 4, 6])
Coordinates:
  * x        (x) int64 1 2 3
```
Here the result includes values at all the coordinates [1,2,3], all of which have been broadcast against the scalar value taken at coordinate 2. This doesn't seem correct in general -- values from different coordinates shouldn't be broadcast against each other by default, even if one of them is a scalar.
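For illustration, here is a sketch of the selection behaviour I describe just below (`checked_mul` is a hypothetical helper; it only handles one direction and is not a general solution):
```python
def checked_mul(a, b):
    # Before multiplying, line up any scalar coordinate of `a` with the
    # matching position in `b`, rather than broadcasting across all of `b`.
    for name in set(a.coords) & set(b.coords):
        if a.coords[name].ndim == 0 and b.coords[name].ndim == 1:
            b = b.sel({name: a.coords[name].item()})
    return a * b

checked_mul(data_array.sel(x=2), data_array)  # => array(4), coordinate x = 2
```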
I would expect this either to result in an error or warning, or (as the sketch above does) to select only the corresponding value at the scalar coordinate in question to broadcast against, resulting in e.g.:
```python
<xarray.DataArray ()>
array(4)
Coordinates:
    x        int64 2
```","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6075/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue