id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
715168959,MDExOlB1bGxSZXF1ZXN0NDk4MTIzNTQ2,4489,Alignment with tolerance2,10563614,open,0,,,4,2020-10-05T21:17:53Z,2023-12-14T19:22:57Z,,CONTRIBUTOR,,0,pydata/xarray/pulls/4489,"<!-- Feel free to remove check-list items aren't relevant to your change -->

 - [x] Closes #2217
 - [x] Tests added
 - [ ] Passes `isort . && black . && mypy . && flake8`
 - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
 - [ ] New functions/methods are listed in `api.rst`

Following #2217, I've implemented fast algorithms for the union and intersection of arrays with a numerical tolerance.
This works fine in the ""normal"" case, where all values within each array differ by more than the tolerance and the two arrays share some values within the tolerance. The behavior is not well defined when an array contains values that are equal or within the tolerance of each other, but union and intersection are ill-defined in that case anyway. These ""bad"" cases (duplicate values within the tolerance) could be checked and made to raise an exception, but this is not implemented yet.
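
To make the normal case concrete, here is a minimal pure-Python sketch of a tolerance-aware intersection and union. It is not the code in this PR: the function names are hypothetical, and it assumes both inputs are sorted ascending and that values within each input differ by more than the tolerance.

```python
def intersect_with_tolerance(a, b, tol):
    # Hypothetical helper, not the PR's implementation.
    # Assumes a and b are sorted ascending, with no two values
    # inside the same list closer than tol.
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol:
            out.append(a[i])  # keep the value from the first list
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def union_with_tolerance(a, b, tol):
    # Same assumptions as above; matched pairs are emitted once.
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])  # leftovers from either list
    out.extend(b[j:])
    return out


first = [1.0, 2.0, 3.0]
second = [2.0 + 1e-9, 3.0 - 1e-9, 4.0]
common = intersect_with_tolerance(first, second, tol=1e-6)
merged = union_with_tolerance(first, second, tol=1e-6)
```

Both functions run in a single pass over the two lists. When a list contains duplicates within the tolerance, the result depends on the order in which pairs are matched, which is the ill-defined case described above.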

I've also implemented a function to test index equality within the tolerance.

Finally, the logic of xarray.align has been changed to handle the tolerance. Some changes to align could not be avoided.

I'd appreciate some tests and code review, because there are certainly some rough corners I've not thought about.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4489/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1415430795,I_kwDOAMm_X85UXcKL,7188,efficiently set values in a xarray using dask,10563614,closed,0,,,1,2022-10-19T18:44:44Z,2023-11-06T06:07:08Z,2023-11-06T06:07:08Z,CONTRIBUTOR,,,,"### What is your issue?

I have a quite large dataset (data) with three coords, band=21, y=5000, x=5000, and I want to set the values of a few bands at some points (x, y) given by a boolean dataset. The chunk size is band=1, y=16, x=5000. I have 4 workers with 4 GB of memory each and 1 thread per worker. The most compact form I found is this one:

band = dict(band=[17, 18, 19, 20])
data['somevar'].loc[band] = data['somevar'].loc[band].where(~points, some_complex_calculation)

points and some_complex_calculation are DataArrays with the same shape as data (in fact, points is only a DataArray of x, y). They typically have a HighLevelGraph with 106 layers and 142610 keys across all layers. These arrays depend on data, which also has a HighLevelGraph with a hundred layers. I cannot use ""compute()"" because it blows up the memory; I want to call data.to_zarr directly to exploit the chunks.
Unfortunately, this calculation blocks the workers, which end up being killed.

I tried many forms, and I found this one:

for b in [17, 18, 19, 20]:
     data['somevar'] = data['somevar'].where(~((data.band == b) & points), some_complex_calculation)

It works, but it is very inefficient and I find it difficult to read.
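
For reference, the loop can also be written as a single `where` call by building the band mask with `isin`. The sketch below uses small in-memory arrays with made-up values (the names `var`, `replacement` and `mask` are illustrative only). It shows only that the result is equivalent; whether dask schedules this form any better on the real dataset is an assumption that has not been checked.

```python
import numpy as np
import xarray as xr

bands = [1, 2]

# Tiny stand-ins for data['somevar'], points and some_complex_calculation.
var = xr.DataArray(
    np.zeros((3, 2, 2)),
    dims=('band', 'y', 'x'),
    coords={'band': [0, 1, 2]},
)
points = xr.DataArray([[True, False], [False, True]], dims=('y', 'x'))
replacement = xr.full_like(var, 9.0)

# True where the band is selected AND the (y, x) point is flagged.
mask = var.band.isin(bands) & points

# Keep the original values where the mask is False, replace them elsewhere.
result = var.where(~mask, replacement)
```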

It seems that my objective is quite simple, set a few values in a large dataset at a given dimension, and this dimension is outer and has chunksize=1. It seems very easy from a C / Fortran perspective.

Do you have any suggestions on how to perform such operations?



","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7188/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,not_planned,13221727,issue
1388326248,I_kwDOAMm_X85SwC1o,7093,xarray allows several types for netcdf attributes. Is it expected ?,10563614,open,0,,,3,2022-09-27T20:20:46Z,2022-10-04T20:46:32Z,,CONTRIBUTOR,,,,"### What is your issue?

Xarray is permissive regarding the type of the attributes. When an invalid type is used, the error message lists the valid ones: For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple

Using a non-iterable type used to raise an exception when reading the saved netCDF file, but this is now solved by #7085.

The pending question is whether it is valid to save netCDF attributes with a type other than string.
The following lines work (in a notebook):

```python
xr.DataArray([1, 2, 3], attrs={'units': 1}, name='x').to_netcdf(""tmp.nc"")
!ncdump tmp.nc

xr.DataArray([1, 2, 3], attrs={'units': np.nan}, name='x').to_netcdf(""tmp.nc"")
!ncdump tmp.nc

xr.DataArray([1, 2, 3], attrs={'units': ['xarray', 'is', 'very', 'permissive', ]}, name='x').to_netcdf(""tmp.nc"")
!ncdump tmp.nc
```
On the other hand, the following line raises an error:

```python
xr.DataArray([1, 2, 3], attrs={'units': None}, name='x').to_netcdf(""tmp.nc"")

```

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7093/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
1386596170,PR_kwDOAMm_X84_olQw,7085,solve a bug when the units attribute is not a string ,10563614,closed,0,,,2,2022-09-26T19:27:08Z,2022-09-28T19:13:11Z,2022-09-28T19:13:11Z,CONTRIBUTOR,,0,pydata/xarray/pulls/7085,"
<!-- Feel free to remove check-list items aren't relevant to your change -->

- [ ] Closes #xxxx
- [x] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`

A colleague and I ran into a sort of bug. It seems to be legal to set a numeric value for the units attribute in an xarray object or a netCDF file. xarray agrees to save such an array to netCDF: xr.DataArray([1, 2, 3], attrs={'units': 1}, name='x').to_netcdf('tmp.nc'). However, reading this netCDF file with xarray.open_dataset raises an error.

It is unlikely to have a scalar for the units, but it happened to us (the value was NaN), and the resulting exception was very difficult to understand.

The exception is raised because ```""since"" in attrs[""units""]``` is evaluated in two places in the xarray codebase (in coding_times.py and in conventions.py) without checking the type of the attribute.

This PR solves this improbable bug
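
The kind of guard involved can be sketched as follows. This is an illustration only: `looks_like_time_units` is a hypothetical helper, not the code added to coding_times.py or conventions.py.

```python
def looks_like_time_units(attrs):
    # Hypothetical helper, not xarray's actual code.
    # Check the type first, so that a numeric units value (for example
    # 1 or NaN) is rejected instead of raising a TypeError on the
    # membership test.
    units = attrs.get('units')
    return isinstance(units, str) and 'since' in units


# Without the isinstance check, the membership test fails on a number.
try:
    'since' in 1
    unguarded_error = None
except TypeError as err:
    unguarded_error = err
```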
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7085/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
709795317,MDExOlB1bGxSZXF1ZXN0NDkzNzYxOTQy,4467,Tolerance,10563614,open,0,35968931,,1,2020-09-27T18:57:34Z,2022-06-09T14:50:17Z,,CONTRIBUTOR,,0,pydata/xarray/pulls/4467,"<!-- Feel free to remove check-list items aren't relevant to your change -->

 - [x] Closes #4465
 - [x] Tests added
 - [ ] Passes `isort . && black . && mypy . && flake8`
 - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
 - [ ] New functions/methods are listed in `api.rst`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4467/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
705182835,MDExOlB1bGxSZXF1ZXN0NDg5OTU2NDQ2,4442,Fix DataArray.to_dataframe when the array has MultiIndex,10563614,closed,0,,,4,2020-09-20T20:45:12Z,2021-02-20T00:08:42Z,2021-02-20T00:08:42Z,CONTRIBUTOR,,0,pydata/xarray/pulls/4442,"<!-- Feel free to remove check-list items aren't relevant to your change -->

 - [X] Closes #3008
 - [x] Tests added
 - [x] Passes `isort . && black . && mypy . && flake8`
 - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4442/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
657466413,MDU6SXNzdWU2NTc0NjY0MTM=,4228,to_dataframe: no valid index for a 0-dimensional object,10563614,closed,0,,,5,2020-07-15T15:58:43Z,2020-10-26T08:42:35Z,2020-10-26T08:42:35Z,CONTRIBUTOR,,,,"**What happened**: 
`xr.DataArray([1], coords=[('onecoord', [2])]).sel(onecoord=2).to_dataframe(name='name')` raises an exception: `ValueError: no valid index for a 0-dimensional object`

**What you expected to happen**:

the same behavior as: `xr.DataArray([1], coords=[('onecoord', [2])]).to_dataframe(name='name')`

**Anything else we need to know?**:

I see that the array after the selection has no ""dims"" anymore, and this is what causes the error. However, it still has one entry in ""coords"", which is confusing. Is there any documentation about this difference?
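
The difference can be seen in the short sketch below, which also shows two possible workarounds: selecting with a list label, or promoting the scalar coordinate back to a dimension with `expand_dims`. The variable names are illustrative, and this is only a sketch of the observed behavior, not a statement about how `to_dataframe` should handle 0-dimensional arrays.

```python
import xarray as xr

da = xr.DataArray([1], coords=[('onecoord', [2])])

# Selecting with a scalar label drops the dimension but keeps
# 'onecoord' as a scalar (non-index) coordinate.
scalar = da.sel(onecoord=2)
scalar_dims = scalar.dims
has_coord = 'onecoord' in scalar.coords

# Selecting with a list label keeps the dimension.
kept = da.sel(onecoord=[2])
df_kept = kept.to_dataframe(name='name')

# A scalar coordinate can be turned back into a length-1 dimension.
restored = scalar.expand_dims('onecoord')
df_restored = restored.to_dataframe(name='name')
```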

**Environment**:

<details>
INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jun  1 2020, 18:57:50) 
[GCC 7.5.0]
python-bits: 64
OS: Linux
OS-release: 4.19.0-9-amd64
machine: x86_64
processor: 
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.4

xarray: 0.15.1
pandas: 1.0.4
numpy: 1.18.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: 2.10.0
Nio: None
zarr: 2.4.0
cftime: 1.1.3
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.3.2
dask: 2.18.1
distributed: 2.18.0
matplotlib: 3.2.1
cartopy: None
seaborn: 0.10.1
numbagg: None
setuptools: 47.3.1.post20200616
pip: 20.1.1
conda: 4.8.3
pytest: 5.4.3
IPython: 7.15.0
sphinx: 3.1.1

</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4228/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
709503596,MDU6SXNzdWU3MDk1MDM1OTY=,4465,combine_by_coords could use allclose instead of equal to compare coordinates,10563614,open,0,,,4,2020-09-26T09:26:05Z,2020-09-26T21:30:35Z,,CONTRIBUTOR,,,,"<!-- Please do a quick search of existing issues to make sure that this has not been asked before. -->

**Is your feature request related to a problem? Please describe.**

When a coordinate in different datasets / netCDF files has slightly different values, combine_by_coords considers the coordinates to be different and attempts to concatenate them.

Concretely, I produce netCDF files with (lat, lon, time) coordinates, one per year. Apparently the lat is not the same in all the files (the difference is 1e-14), which I suspect is due to different pyproj versions being used to produce the lon, lat grid. Reprocessing all the annual netCDF files is not an option. When using open_mfdataset on these files, the lat coordinate is concatenated, which leads to a MemoryError in my case.

**Describe the solution you'd like**
Two options:
- add a coord_tolerance argument to xr.combine_by_coords and use np.allclose to compare the coordinates. At line 69 of combine.py, the comparison uses strict equality: ""if not all(index.equals(indexes[0]) for index in indexes[1:]):"". This would not break compatibility, because coord_tolerance=0 would be the default.

- add an argument to explicitly list the coordinates NOT to concatenate. I tried to play with the coords argument to solve my problem, but was not successful.
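
The first option could look roughly like the sketch below. It is only an illustration of the idea: `indexes_match` is a hypothetical helper, `coord_tolerance` is the proposed (not existing) argument, and the inputs are plain NumPy arrays rather than the pandas indexes used in combine.py.

```python
import numpy as np


def indexes_match(indexes, coord_tolerance=0.0):
    # Hypothetical helper: compare every index to the first one.
    # With coord_tolerance=0.0 this reduces to strict equality,
    # which keeps the current behavior as the default.
    first = np.asarray(indexes[0])
    for index in indexes[1:]:
        other = np.asarray(index)
        if other.shape != first.shape:
            return False
        if not np.allclose(other, first, rtol=0.0, atol=coord_tolerance):
            return False
    return True


lat_a = np.array([10.0, 20.0])
lat_b = lat_a + 1e-14  # same grid, tiny numerical difference

strict = indexes_match([lat_a, lat_b])
tolerant = indexes_match([lat_a, lat_b], coord_tolerance=1e-12)
```

Setting rtol to zero makes the tolerance purely absolute, so coord_tolerance has the same meaning regardless of the magnitude of the coordinate values.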

**Describe alternatives you've considered**

- I could certainly find a workaround for this specific case, but I have often had issues with the magic in combine_by_coords, and imho giving the user more control would be useful in general.

**Additional context**

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4465/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue