id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
1975574237,I_kwDOAMm_X851wN7d,8409,Task graphs on `.map_blocks` with many chunks can be huge,5635139,closed,0,,,6,2023-11-03T07:14:45Z,2024-01-03T04:10:16Z,2024-01-03T04:10:16Z,MEMBER,,,,"### What happened?

I'm getting task graphs > 1GB, I think possibly because the full indexes are being included in every task?

### What did you expect to happen?

Only the relevant sections of the index would be included.

### Minimal Complete Verifiable Example

```Python
da = xr.tutorial.load_dataset('air_temperature')

# Dropping the index doesn't generally matter that much...
len(cloudpickle.dumps(da.chunk(lat=1, lon=1)))  # 15569320
len(cloudpickle.dumps(da.chunk().drop_vars(da.indexes)))  # 15477313

# But with `.map_blocks`, it really matters — it's really big with the indexes, and the same size without:
len(cloudpickle.dumps(da.chunk(lat=1, lon=1).map_blocks(lambda x: x)))  # 79307120
len(cloudpickle.dumps(da.chunk(lat=1, lon=1).drop_vars(da.indexes).map_blocks(lambda x: x)))  # 16016173
```

### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.
- [X] Recent environment — the issue occurs with the latest version of xarray and its dependencies.

### Relevant log output

_No response_

### Anything else we need to know?

_No response_

### Environment

INSTALLED VERSIONS
------------------
commit: None
python: 3.9.18 (main, Aug 24 2023, 21:19:58) [Clang 14.0.3 (clang-1403.0.22.14.1)]
python-bits: 64
OS: Darwin
OS-release: 22.6.0
machine: arm64
processor: arm
byteorder: little
LC_ALL: en_US.UTF-8
LANG: None
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: None
xarray: 2023.10.1
pandas: 2.1.1
numpy: 1.26.1
scipy: 1.11.1
netCDF4: None
pydap: None
h5netcdf: 1.1.0
h5py: 3.8.0
Nio: None
zarr: 2.16.0
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
iris: None
bottleneck: 1.3.7
dask: 2023.5.0
distributed: 2023.5.0
matplotlib: 3.6.0
cartopy: None
seaborn: 0.12.2
numbagg: 0.6.0
fsspec: 2022.8.2
cupy: None
pint: 0.22
sparse: 0.14.0
flox: 0.7.2
numpy_groupies: 0.9.22
setuptools: 68.1.2
pip: 23.2.1
conda: None
pytest: 7.4.0
mypy: 1.6.1
IPython: 8.14.0
sphinx: 5.2.1","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8409/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
988158051,MDU6SXNzdWU5ODgxNTgwNTE=,5764,Implement __sizeof__ on objects?,5635139,open,0,,,6,2021-09-03T23:36:53Z,2023-12-19T18:23:08Z,,MEMBER,,,,"**Is your feature request related to a problem? Please describe.**

Currently `ds.nbytes` returns the size of the data. But `sys.getsizeof(ds)` returns a very small number.

**Describe the solution you'd like**

If we implement `__sizeof__` on DataArrays & Datasets, this would work. I think that would be something like `ds.nbytes` plus the size of the `ds` container, plus maybe attrs if those aren't handled by `.nbytes`?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5764/reactions"", ""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,reopened,13221727,issue
866826033,MDU6SXNzdWU4NjY4MjYwMzM=,5215,"Add a Cumulative aggregation, similar to Rolling",5635139,closed,0,,,6,2021-04-24T19:59:49Z,2023-12-08T22:06:53Z,2023-12-08T22:06:53Z,MEMBER,,,,"**Is your feature request related to a problem? Please describe.**

Pandas has a [`.expanding` aggregation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.expanding.html), which is basically rolling with a full lookback. I often end up supplying rolling with the length of the dimension, and this is some nice sugar for that.

**Describe the solution you'd like**

Basically the same as pandas — a `.expanding` method that returns an `Expanding` class, which implements the same methods as a `Rolling` class.

**Describe alternatives you've considered**

Some options:
– This
– Don't add anything; the sugar isn't worth the additional API.
– Go all out and write specialized expanding algos — which will be faster since they don't have to keep track of the window. But not that much faster, and likely not worth the effort.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5215/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1878288525,PR_kwDOAMm_X85ZYos5,8139,Fix pandas' `interpolate(fill_value=)` error,5635139,closed,0,,,6,2023-09-02T02:41:45Z,2023-09-28T16:48:51Z,2023-09-04T18:05:14Z,MEMBER,,0,pydata/xarray/pulls/8139,"Pandas no longer has a `fill_value` parameter for `interpolate`. Weirdly I wasn't getting this locally on pandas 2.1.0, only in CI on https://github.com/pydata/xarray/actions/runs/6054400455/job/16431747966?pr=8138.

Removing it passes locally; let's see whether this works in CI.

Would close #8125","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8139/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
967854972,MDExOlB1bGxSZXF1ZXN0NzEwMDA1NzY4,5694,Ask PRs to annotate tests,5635139,closed,0,,,6,2021-08-12T02:19:28Z,2023-09-28T16:46:19Z,2023-06-19T05:46:36Z,MEMBER,,0,pydata/xarray/pulls/5694,"- [x] Passes `pre-commit run --all-files`
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`

As discussed in https://github.com/pydata/xarray/pull/5690#issuecomment-897280353","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/5694/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
1874148181,I_kwDOAMm_X85vtTtV,8123,`.rolling_exp` arguments could be clearer,5635139,open,0,,,6,2023-08-30T18:09:04Z,2023-09-01T00:25:08Z,,MEMBER,,,,"### Is your feature request related to a problem?

Currently we call `.rolling_exp` like:

```
da.rolling_exp(date=20).mean()
```

`20` refers to a ""standard"" window type — broadly ""the same average distance as a simple rolling window"". That works well, and matches the `.rolling(date=20).mean()` format.

But we also have different window types, and this makes it a bit incongruent:

```
da.rolling_exp(date=0.5, window_type=""alpha"").mean()
```

...since the `window_type` completely changes the meaning of the value we pass to the dimension argument. A bit like someone asking ""how many apples would you like to buy"", and replying ""5"", and then separately saying ""when I said 5, I meant 5 _tonnes_"".

### Describe the solution you'd like

One option would be:

```
.rolling_exp(date={""alpha"": 0.5})
```

We pass a dict if we want a non-standard window type — so the value is attached to its type. We could still have the original form for `da.rolling_exp(date=20).mean()`.

### Describe alternatives you've considered

_No response_

### Additional context

(I realize I wrote this originally; all criticism directed at me! This is based on feedback from a colleague, which on reflection I agree with.)

Unless anyone disagrees, I'll try to do this soon-ish™","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/8123/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
729208432,MDExOlB1bGxSZXF1ZXN0NTA5NzM0NTM2,4540,numpy_groupies,5635139,closed,0,,,6,2020-10-26T03:37:19Z,2022-02-05T22:24:12Z,2021-10-24T00:18:52Z,MEMBER,,0,pydata/xarray/pulls/4540,"- [x] Closes https://github.com/pydata/xarray/issues/4473
- [ ] Tests added
- [x] Passes `isort . && black . && mypy . && flake8`
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed in `api.rst`

Very early effort — I found this harder than I expected. I was trying to use the existing groupby infra, but think I maybe should start afresh: the result of the `numpy_groupies` operation is a fully formed array, whereas we're used to handling an iterable of results which need to be concatenated.

I also added some type signatures / notes as I was going through the existing code, mostly for my own understanding.

If anyone has any thoughts, feel free to comment — otherwise I'll resume this soon.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4540/reactions"", ""total_count"": 4, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 2, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
399164733,MDExOlB1bGxSZXF1ZXN0MjQ0NjU3NTk5,2674,Skipping variables in datasets that don't have the core dim,5635139,closed,0,,,6,2019-01-15T02:43:11Z,2021-05-13T22:02:19Z,2021-05-13T22:02:19Z,MEMBER,,0,pydata/xarray/pulls/2674,"ref https://github.com/pydata/xarray/pull/2650#issuecomment-454164295

This seems an ugly way of accomplishing the goal; any ideas for a better way of doing this? And stepping back, do others think a) it's helpful to skip variables in a dataset, and b) `apply_ufunc` should do this?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/2674/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
298421965,MDU6SXNzdWUyOTg0MjE5NjU=,1923,Local test failure in test_backends,5635139,closed,0,,,6,2018-02-19T22:53:37Z,2020-09-05T20:32:17Z,2020-09-05T20:32:17Z,MEMBER,,,,"I'm happy to debug this further, but before I do: is this an issue people have seen before? I'm running tests on master and hit an issue very early on. FWIW I don't use netCDF, and don't think I've got that installed.

#### Code Sample, a copy-pastable example if possible

```python
========================================================================== FAILURES ==========================================================================
_________________________________________________________ ScipyInMemoryDataTest.test_bytesio_pickle __________________________________________________________

self = 

    @pytest.mark.skipif(PY2, reason='cannot pickle BytesIO on Python 2')
    def test_bytesio_pickle(self):
        data = Dataset({'foo': ('x', [1, 2, 3])})
        fobj = BytesIO(data.to_netcdf())
        with open_dataset(fobj, autoclose=self.autoclose) as ds:
>           unpickled = pickle.loads(pickle.dumps(ds))
E           TypeError: can't pickle _thread.lock objects

xarray/tests/test_backends.py:1384: TypeError
```

#### Problem description

[this should explain **why** the current behavior is a problem and why the expected output is a better solution.]

#### Expected Output

Skip or pass backends tests

#### Output of ``xr.show_versions()``
INSTALLED VERSIONS
------------------
commit: d00721a3560f57a1b9226c5dbf5bf3af0356619d
python: 3.6.4.final.0
python-bits: 64
OS: Darwin
OS-release: 17.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
xarray: 0.7.0-38-g1005a9e  # not sure why this is tagged so early. I'm running on latest master
pandas: 0.22.0
numpy: 1.14.0
scipy: 1.0.0
netCDF4: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
bottleneck: 1.2.1
cyordereddict: None
dask: None
distributed: None
matplotlib: 2.1.2
cartopy: None
seaborn: 0.8.1
setuptools: 38.5.1
pip: 9.0.1
conda: None
pytest: 3.4.0
IPython: 6.2.1
sphinx: None","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1923/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
575088962,MDExOlB1bGxSZXF1ZXN0MzgzMzAwMjgw,3826,Allow ellipsis to be used in stack,5635139,closed,0,,,6,2020-03-04T02:21:21Z,2020-03-20T01:20:54Z,2020-03-19T22:55:09Z,MEMBER,,0,pydata/xarray/pulls/3826,"- [x] Closes https://github.com/pydata/xarray/issues/3814
- [x] Tests added
- [x] Passes `isort -rc . && black . && mypy . && flake8`
- [x] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3826/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
577283480,MDExOlB1bGxSZXF1ZXN0Mzg1MTA3OTU4,3846,Doctests fixes,5635139,closed,0,,,6,2020-03-07T05:44:27Z,2020-03-10T14:03:05Z,2020-03-10T14:03:00Z,MEMBER,,0,pydata/xarray/pulls/3846,"- [ ] Closes #xxxx
- [ ] Tests added
- [x] Passes `isort -rc . && black . && mypy . && flake8`
- [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API

Starting to get some fixes in. It's going to be a long journey though. I think maybe we whitelist some files and move gradually through before whitelisting the whole library.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3846/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
485437811,MDU6SXNzdWU0ODU0Mzc4MTE=,3265,Sparse tests failing on master,5635139,closed,0,,,6,2019-08-26T20:34:21Z,2019-08-27T00:01:18Z,2019-08-27T00:01:07Z,MEMBER,,,,"https://dev.azure.com/xarray/xarray/_build/results?buildId=695

```python
=================================== FAILURES ===================================
_______________________ TestSparseVariable.test_unary_op _______________________

self = 

    def test_unary_op(self):
>       sparse.utils.assert_eq(-self.var.data, -self.data)
E       AttributeError: module 'sparse' has no attribute 'utils'

xarray/tests/test_sparse.py:285: AttributeError
___________________ TestSparseVariable.test_univariate_ufunc ___________________

self = 

    def test_univariate_ufunc(self):
>       sparse.utils.assert_eq(np.sin(self.data), xu.sin(self.var).data)
E       AttributeError: module 'sparse' has no attribute 'utils'

xarray/tests/test_sparse.py:290: AttributeError
___________________ TestSparseVariable.test_bivariate_ufunc ____________________

self = 

    def test_bivariate_ufunc(self):
>       sparse.utils.assert_eq(np.maximum(self.data, 0), xu.maximum(self.var, 0).data)
E       AttributeError: module 'sparse' has no attribute 'utils'

xarray/tests/test_sparse.py:293: AttributeError
________________________ TestSparseVariable.test_pickle ________________________

self = 

    def test_pickle(self):
        v1 = self.var
        v2 = pickle.loads(pickle.dumps(v1))
>       sparse.utils.assert_eq(v1.data, v2.data)
E       AttributeError: module 'sparse' has no attribute 'utils'

xarray/tests/test_sparse.py:307: AttributeError
```

Any ideas?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3265/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
457080809,MDExOlB1bGxSZXF1ZXN0Mjg4OTY1MzQ4,3029,Fix pandas-dev tests,5635139,closed,0,,,6,2019-06-17T18:15:16Z,2019-06-28T15:31:33Z,2019-06-28T15:31:28Z,MEMBER,,0,pydata/xarray/pulls/3029,"Currently pandas-dev tests get 'stuck' on the conda install. The last instruction to run is the standard install:

```
$ if [[ ""$CONDA_ENV"" == ""docs"" ]]; then
    conda env create -n test_env --file doc/environment.yml;
  elif [[ ""$CONDA_ENV"" == ""lint"" ]]; then
    conda env create -n test_env --file ci/requirements-py37.yml;
  else
    conda env create -n test_env --file ci/requirements-$CONDA_ENV.yml;
  fi
```

And after installing the libraries, [it prints this and then stops](https://travis-ci.org/max-sixty/xarray/jobs/546491330):

```
Preparing transaction: - - done
Verifying transaction: | / \ | / - \ | / / done
Executing transaction: \ | / - \ | / - \ | / - \ | / - \ | / - \ | / / - \ | / - \ done
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
```

I'm not that familiar with conda. Anyone have any ideas as to why this would fail while the other builds would succeed?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3029/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull
168901028,MDU6SXNzdWUxNjg5MDEwMjg=,934,"Should indexing be possible on 1D coords, even if not dims?",5635139,closed,0,,,6,2016-08-02T14:33:43Z,2019-01-27T06:49:52Z,2019-01-27T06:49:52Z,MEMBER,,,,"``` python
In [1]: arr = xr.DataArray(np.random.rand(4, 3),
   ...:                    [('time', pd.date_range('2000-01-01', periods=4)),
   ...:                     ('space', ['IA', 'IL', 'IN'])])

In [17]: arr.coords['space2'] = ('space', ['A','B','C'])

In [18]: arr
Out[18]:
array([[ 0.05187049,  0.04743067,  0.90329666],
       [ 0.59482538,  0.71014366,  0.86588207],
       [ 0.51893157,  0.49442107,  0.10697737],
       [ 0.16068189,  0.60756757,  0.31935279]])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
  * space    (space) |S2 'IA' 'IL' 'IN'
    space2   (space) |S1 'A' 'B' 'C'
```

Now try to select on the space2 coord:

``` python
In [19]: arr.sel(space2='A')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 arr.sel(space2='A')

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/xarray/core/dataarray.pyc in sel(self, method, tolerance, **indexers)
    601         """"""
    602         return self.isel(**indexing.remap_label_indexers(
--> 603             self, indexers, method=method, tolerance=tolerance))
    604
    605     def isel_points(self, dim='points', **indexers):

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/xarray/core/dataarray.pyc in isel(self, **indexers)
    588         DataArray.sel
    589         """"""
--> 590         ds = self._to_temp_dataset().isel(**indexers)
    591         return self._from_temp_dataset(ds)
    592

/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/xarray/core/dataset.pyc in isel(self, **indexers)
    908         invalid = [k for k in indexers if k not in self.dims]
    909         if invalid:
--> 910             raise ValueError(""dimensions %r do not exist"" % invalid)
    911
    912         # all indexers should be int, slice or np.ndarrays

ValueError: dimensions ['space2'] do not exist
```

Is there an easier way to do this? I couldn't think of anything...

CC @justinkuosixty ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/934/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue