id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 945226829,MDExOlB1bGxSZXF1ZXN0NjkwNTg1ODI4,5607,Add option to pass callable assertion failure message generator,167164,open,0,,,10,2021-07-15T10:17:42Z,2022-10-12T20:03:32Z,,FIRST_TIME_CONTRIBUTOR,,0,pydata/xarray/pulls/5607,"It is nice to be able to write custom assertion error messages on failure sometimes. This allows that with the array comparison assertions, by allowing a `fail_func(a, b)` callable to be passed in to each assertion function. Not tested yet, but I'm happy to add tests if this is something that would be appreciated. - [ ] Tests added - [ ] Passes `pre-commit run --all-files` - [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst` - [ ] New functions/methods are listed in `api.rst` ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5607/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 60303760,MDU6SXNzdWU2MDMwMzc2MA==,364,pd.Grouper support?,167164,open,0,,,24,2015-03-09T06:25:14Z,2022-04-09T01:48:48Z,,NONE,,,,"In pandas, you can pass a `pandas.TimeGrouper` object to a `.groupby()` call, and it allows you to group by month, year, day, or other times, without manually creating a new index with those values first. It would be great if you could do this with `xray`, but at the moment, I get: ``` /usr/local/lib/python3.4/dist-packages/xray/core/groupby.py in __init__(self, obj, group, squeeze) 66 if the dimension is squeezed out. 67 """""" ---> 68 if group.ndim != 1: 69 # TODO: remove this limitation?
70 raise ValueError('`group` must be 1 dimensional') AttributeError: 'TimeGrouper' object has no attribute 'ndim' ``` Not sure how this will work though, because pandas.TimeGrouper doesn't appear to work with multi-index dataframes yet anyway, so maybe there needs to be a feature request over there too, or maybe it's better to implement something from scratch... ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/364/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 429572364,MDU6SXNzdWU0Mjk1NzIzNjQ=,2868,"netCDF4: support for structured arrays as attribute values; serialize as ""compound types""",167164,open,0,,,3,2019-04-05T03:54:17Z,2022-04-07T15:23:25Z,,NONE,,,,"#### Code Sample, a copy-pastable example if possible A ""Minimal, Complete and Verifiable Example"" will make it much easier for maintainers to help you: http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports ```python ds.attrs = dict(a=dict(b=2)) ds.to_netcdf(outfile) ... ~/miniconda3/envs/ana/lib/python3.6/site-packages/xarray/backends/api.py in check_attr(name, value) 158 'a string, an ndarray or a list/tuple of ' 159 'numbers/strings for serialization to netCDF ' --> 160 'files'.format(value)) 161 162 # Check attrs on the dataset itself TypeError: Invalid value for attr: {'b': 2} must be a number, a string, an ndarray or a list/tuple of numbers/strings for serialization to netCDF files ``` #### Problem description I'm not entirely sure if this should be possible, but it seems like it should be from this email: https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg10502.html Nested attributes would be nice as a way to namespace metadata. #### Expected Output Netcdf with nested global attributes. #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) [GCC 7.3.0] python-bits: 64 OS: Linux OS-release: 4.18.0-16-lowlatency machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_AU.UTF-8 LOCALE: en_AU.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.12.0 pandas: 0.24.2 numpy: 1.16.2 scipy: 1.2.1 netCDF4: 1.4.3.2 pydap: None h5netcdf: None h5py: None Nio: None zarr: None cftime: 1.0.3.4 nc_time_axis: None PseudonetCDF: None rasterio: None cfgrib: None iris: None bottleneck: None dask: None distributed: None matplotlib: 3.0.3 cartopy: 0.17.0 seaborn: None setuptools: 40.8.0 pip: 19.0.3 conda: None pytest: 4.3.1 IPython: 7.3.0 sphinx: None
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2868/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 446933504,MDU6SXNzdWU0NDY5MzM1MDQ=,2979,Reading single grid cells from a multi-file netcdf dataset?,167164,open,0,,,1,2019-05-22T05:01:50Z,2019-05-23T16:15:54Z,,NONE,,,,"I have a multi-file dataset made up of month-long 8-hourly netcdf datasets over nearly 30 years. The files are available from `ftp://ftp.ifremer.fr/ifremer/ww3/HINDCAST/GLOBAL/`, and I'm specifically looking at e.g. `1990_CFSR/hs/ww3.199001_hs.nc` for each year and month. Each file is about 45 MB, for about 15 GB total. I want to calculate some lognormal distribution parameters of the Hs variable at each grid point (actually, only a smallish subset of points, using a mask). However, if I load the data with `open_mfdataset` and try to read a single lat/lon grid cell, my computer tanks, and python gets killed due to running out of memory (I have 16 GB, but even if I only try to open 1 year of data - ~500 MB - python ends up using 27% of my memory). Is there a way in xarray/dask to force dask to only read single sub-arrays at a time? I have tried using lat/lon chunking, e.g. ```python mfdata_glob = '/home/nedcr/cr/data/wave/*1990*.nc' global_ds = xr.open_mfdataset( mfdata_glob, chunks={'latitude': 1, 'longitude': 1}) ``` but that doesn't seem to improve things. Is there any way around this problem? I guess I could try using `preprocess=` to sub-select grid cells, and loop over that, but that seems like it would require opening and reading each file 317*720 times, which sounds like a recipe for a long wait.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2979/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue