id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
163267018,MDU6SXNzdWUxNjMyNjcwMTg=,893,'Warm start' for open_mfdataset?,743508,closed,0,,,3,2016-06-30T21:05:46Z,2023-05-29T13:35:32Z,2023-05-29T13:35:32Z,CONTRIBUTOR,,,,"I'm using xarray in ipython to do interactive/exploratory analysis on large multi-file datasets. To avoid having too many files open, I'm wrapping my file-open code in a `with` block. However, this means that every time I re-run the code the multi-file dataset is re-initialised, causing xarray to re-scan every input datafile to construct the Dataset.

It would be good to have some kind of 'warm start' or caching mechanism to make it easier to re-open multifile datasets without having to re-scan the input files, but equally without having to keep the dataset open which keeps all the file handles open (I've hit the OS max file limit because of this).

Not sure what API would suit this, since while it's a useful use case it's also a bit weird. Perhaps something like `open_cached_mfdataset`, which closes the input files after initialisation but caches the information collected and simply assumes that files don't move or change between accesses.
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/893/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
1445486904,I_kwDOAMm_X85WKGE4,7280,Support for Scipy Sparse Arrays,743508,open,0,,,4,2022-11-11T13:35:51Z,2022-11-11T16:39:53Z,,CONTRIBUTOR,,,,"### What happened?

Now that SciPy is moving to support sparse N-D arrays, we would expect xarray to work with them as with any other array-like data.

### What did you expect to happen?

Doesn't work. It seems that when trying to use a SciPy sparse array as the data, xarray wraps the sparse array in a 0-D dense array of dtype `object`. (There are likely more issues after this, but this was the first hurdle.)

With sparse array s:
```
print(s)
<4x4 sparse array of type '<class 'numpy.float64'>'
	with 4 stored elements in COOrdinate format>
```
```
print(xr.DataArray(s).data)
array(<4x4 sparse array of type '<class 'numpy.float64'>'
	with 4 stored elements in COOrdinate format>, dtype=object)
```
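
The 0-D wrapping can be reproduced with NumPy alone. This is a sketch under the assumption that the conversion goes through something like `np.asarray` (not confirmed from the xarray source): NumPy turns any object it cannot interpret as array-like into a 0-D object array. `Opaque` is a hypothetical stand-in class, not a SciPy type.

```python
import numpy as np

class Opaque:
    # Hypothetical stand-in for a container NumPy cannot interpret:
    # no __array__, no buffer protocol, no sequence protocol.
    shape = (4, 4)

wrapped = np.asarray(Opaque())
print(wrapped.shape, wrapped.dtype)  # () object
```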

### Minimal Complete Verifiable Example

```Python
import numpy as np
import xarray as xr
from scipy.sparse import coo_array

row  = np.array([0, 3, 1, 0])

col  = np.array([0, 3, 1, 2])

data = np.array([4, 5.4, 7, 9.2])

s= coo_array((data, (row, col)), shape=(4, 4))
da = xr.DataArray(s)
print(da._repr_html_())
```


### MVCE confirmation

- [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- [X] Complete example — the example is self-contained, including all data and the text of any traceback.
- [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result.
- [X] New issue — a search of GitHub Issues suggests this is not a duplicate.

### Relevant log output

```Python
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [4], in <cell line: 13>()
     11 s= coo_array((data, (row, col)), shape=(4, 4))
     12 da = xr.DataArray(s)
---> 13 print(da._repr_html_())

File ~/Scratch/.conda/envs/tessa-1/lib/python3.10/site-packages/xarray/core/common.py:167, in AbstractArray._repr_html_(self)
    165 if OPTIONS[""display_style""] == ""text"":
    166     return f""<pre>{escape(repr(self))}</pre>""
--> 167 return formatting_html.array_repr(self)

File ~/Scratch/.conda/envs/tessa-1/lib/python3.10/site-packages/xarray/core/formatting_html.py:311, in array_repr(arr)
    303 arr_name = f""'{arr.name}'"" if getattr(arr, ""name"", None) else """"
    305 header_components = [
    306     f""<div class='xr-obj-type'>{obj_type}</div>"",
    307     f""<div class='xr-array-name'>{arr_name}</div>"",
    308     format_dims(dims, indexed_dims),
    309 ]
--> 311 sections = [array_section(arr)]
    313 if hasattr(arr, ""coords""):
    314     sections.append(coord_section(arr.coords))

File ~/Scratch/.conda/envs/tessa-1/lib/python3.10/site-packages/xarray/core/formatting_html.py:219, in array_section(obj)
    213 collapsed = (
    214     ""checked""
    215     if _get_boolean_with_default(""display_expand_data"", default=True)
    216     else """"
    217 )
    218 variable = getattr(obj, ""variable"", obj)
--> 219 preview = escape(inline_variable_array_repr(variable, max_width=70))
    220 data_repr = short_data_repr_html(obj)
    221 data_icon = _icon(""icon-database"")

File ~/Scratch/.conda/envs/tessa-1/lib/python3.10/site-packages/xarray/core/formatting.py:274, in inline_variable_array_repr(var, max_width)
    272     return var._data._repr_inline_(max_width)
    273 if var._in_memory:
--> 274     return format_array_flat(var, max_width)
    275 dask_array_type = array_type(""dask"")
    276 if isinstance(var._data, dask_array_type):

File ~/Scratch/.conda/envs/tessa-1/lib/python3.10/site-packages/xarray/core/formatting.py:191, in format_array_flat(array, max_width)
    188 # every item will take up at least two characters, but we always want to
    189 # print at least first and last items
    190 max_possibly_relevant = min(max(array.size, 1), max(math.ceil(max_width / 2.0), 2))
--> 191 relevant_front_items = format_items(
    192     first_n_items(array, (max_possibly_relevant + 1) // 2)
    193 )
    194 relevant_back_items = format_items(last_n_items(array, max_possibly_relevant // 2))
    195 # interleave relevant front and back items:
    196 #     [a, b, c] and [y, z] -> [a, z, b, y, c]

File ~/Scratch/.conda/envs/tessa-1/lib/python3.10/site-packages/xarray/core/formatting.py:180, in format_items(x)
    177     elif np.logical_not(time_needed).all():
    178         timedelta_format = ""date""
--> 180 formatted = [format_item(xi, timedelta_format) for xi in x]
    181 return formatted

File ~/Scratch/.conda/envs/tessa-1/lib/python3.10/site-packages/xarray/core/formatting.py:180, in <listcomp>(.0)
    177     elif np.logical_not(time_needed).all():
    178         timedelta_format = ""date""
--> 180 formatted = [format_item(xi, timedelta_format) for xi in x]
    181 return formatted

File ~/Scratch/.conda/envs/tessa-1/lib/python3.10/site-packages/xarray/core/formatting.py:161, in format_item(x, timedelta_format, quote_strings)
    159     return repr(x) if quote_strings else x
    160 elif hasattr(x, ""dtype"") and np.issubdtype(x.dtype, np.floating):
--> 161     return f""{x.item():.4}""
    162 else:
    163     return str(x)

File ~/Scratch/.conda/envs/tessa-1/lib/python3.10/site-packages/scipy/sparse/_base.py:771, in spmatrix.__getattr__(self, attr)
    769     return self.getnnz()
    770 else:
--> 771     raise AttributeError(attr + "" not found"")

AttributeError: item not found
```


### Anything else we need to know?

_No response_

### Environment

<details>


INSTALLED VERSIONS
------------------
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.13.0-41-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1

xarray: 2022.11.0
pandas: 1.4.3
numpy: 1.22.4
scipy: 1.9.0
netCDF4: 1.6.0
pydap: None
h5netcdf: None
h5py: 3.7.0
Nio: None
zarr: 2.12.0
cftime: 1.6.1
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.3.2
cfgrib: 0.9.10.1
iris: None
bottleneck: 1.3.5
dask: 2022.8.1
distributed: 2022.8.1
matplotlib: 3.5.3
cartopy: 0.20.3
seaborn: 0.11.2
numbagg: None
fsspec: 2022.7.1
cupy: None
pint: 0.19.2
sparse: 0.13.0
flox: None
numpy_groupies: None
setuptools: 65.2.0
pip: 22.2.2
conda: 4.14.0
pytest: 7.1.2
IPython: 8.4.0
sphinx: None


</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/7280/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
561539035,MDU6SXNzdWU1NjE1MzkwMzU=,3761,to_dataframe fails if dataarray has dimension 1,743508,open,0,,,2,2020-02-07T10:05:47Z,2020-02-07T16:37:05Z,,CONTRIBUTOR,,,,"The `to_dataframe` method fails with ValueError if the dataarray has only one value.

#### MCVE Code Sample

```python
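import numpy as np
import xarray as xr

x = np.arange(10)
y = np.arange(10)

data = np.zeros((len(x), len(y)))

da = xr.DataArray(data, coords=[x, y], dims=['x', 'y'])

# Selecting scalars drops both dimensions; to_dataframe then raises ValueError
da.sel(x=1, y=1).to_dataframe(name='test')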
```

#### Expected Output

Expect a dataframe with one row

#### Problem Description

This happened when selecting a single value out of a gridded dataset - in cases where there was only one value output the to_dataframe failed.
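
A possible workaround sketch, assuming the failure comes from the result being 0-dimensional: selecting with one-element lists instead of scalars keeps `x` and `y` as length-1 dimensions, so there is still an index to build the frame from.

```python
import numpy as np
import xarray as xr

x = np.arange(10)
y = np.arange(10)
da = xr.DataArray(np.zeros((len(x), len(y))), coords=[x, y], dims=['x', 'y'])

# List selectors keep both dimensions (each of length 1)
df = da.sel(x=[1], y=[1]).to_dataframe(name='test')
print(df.shape)  # (1, 1)
```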

#### Output of ``xr.show_versions()``
<details>

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.6 | packaged by conda-forge | (default, Jan  7 2020, 22:33:48) 
[GCC 7.3.0]
python-bits: 64
OS: Linux
OS-release: 5.3.0-28-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
libhdf5: 1.10.5
libnetcdf: 4.7.1

xarray: 0.14.1
pandas: 0.25.3
numpy: 1.17.5
scipy: 1.4.1
netCDF4: 1.5.3
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.4.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.1.0
cfgrib: 0.9.7.6
iris: None
bottleneck: 1.3.1
dask: 2.9.2
distributed: 2.9.3
matplotlib: 3.1.2
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 45.1.0.post20200119
pip: 20.0.1
conda: None
pytest: None
IPython: 7.11.1
sphinx: None


</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3761/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
231061878,MDU6SXNzdWUyMzEwNjE4Nzg=,1424,Huge memory use when using FacetGrid,743508,closed,0,,,6,2017-05-24T14:35:16Z,2019-06-29T02:58:33Z,2019-06-29T02:58:33Z,CONTRIBUTOR,,,,"When plotting a time series of maps using faceting, my memory use jumps by over 3x, from about 4GB to 14GB.

Using macOS, Python 3.6, xarray 0.9.5, jupyter notebook.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1424/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
161068483,MDU6SXNzdWUxNjEwNjg0ODM=,887,Perf: use Scipy engine by default for netcdf3?,743508,closed,0,,,2,2016-06-19T11:27:56Z,2019-02-26T12:51:17Z,2019-02-26T12:51:17Z,CONTRIBUTOR,,,,"Not really a bug, but I'm finding that the scipy backend is considerably faster than the netCDF backend for netCDF 3 files (using dataset: http://rda.ucar.edu/datasets/ds093.1/). Using Anaconda python with MKL. Not sure if this is always faster, but if it is perhaps xarray should default to scipy backend for netCDF 3 files?
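
For illustration, a format-based default could be chosen by sniffing the file signature. This is only a sketch of the idea, not how xarray selects a backend: `guess_engine` is a hypothetical helper, it only looks at offset 0, and the demo files contain fake headers rather than valid datasets. The returned strings are the values accepted by the `engine` argument of `xr.open_dataset`.

```python
import os
import tempfile

def guess_engine(path):
    # Classic netCDF files start with b'CDF' plus a version byte:
    # 1 = classic, 2 = 64-bit offset. netCDF-4 files are HDF5 containers.
    with open(path, 'rb') as fh:
        magic = fh.read(8)
    if magic[:3] == b'CDF' and magic[3:4] in (b'\x01', b'\x02'):
        return 'scipy'
    if magic == b'\x89HDF\r\n\x1a\n':
        return 'netcdf4'
    return None

# Demo with fake headers only (these are not readable datasets)
with tempfile.TemporaryDirectory() as tmp:
    classic = os.path.join(tmp, 'classic.nc')
    with open(classic, 'wb') as fh:
        fh.write(b'CDF\x01' + b'\x00' * 28)
    hdf5 = os.path.join(tmp, 'nc4.nc')
    with open(hdf5, 'wb') as fh:
        fh.write(b'\x89HDF\r\n\x1a\n' + b'\x00' * 24)
    engines = (guess_engine(classic), guess_engine(hdf5))

print(engines)  # ('scipy', 'netcdf4')
```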
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/887/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
238990919,MDU6SXNzdWUyMzg5OTA5MTk=,1467,CF conventions for time doesn't support years,743508,open,0,,,10,2017-06-27T21:38:32Z,2019-02-20T21:25:01Z,,CONTRIBUTOR,,,,"CF conventions code supports: `{'microseconds': 'us', 'milliseconds': 'ms', 'seconds': 's', 'minutes': 'm', 'hours': 'h', 'days': 'D'}`, but not 'years'. See example file https://www.dropbox.com/s/34dcpliko928yaj/histsoc_population_0.5deg_1861-2005.nc4?dl=0
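
As a stopgap, such a file can be opened with `decode_times=False` and the offsets converted by hand. The sketch below assumes the time values are whole calendar years counted from a reference date; the reference date used here is a guess based on the file name, not read from the file, and `decode_years` is a hypothetical helper.

```python
from datetime import datetime

def decode_years(offsets, reference):
    # Assumes whole calendar years; fractional offsets and a 29 February
    # reference date are not handled in this sketch.
    return [reference.replace(year=reference.year + int(n)) for n in offsets]

dates = decode_years([0, 1, 144], datetime(1861, 1, 1))
print(dates[-1])  # 2005-01-01 00:00:00
```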


","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1467/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
157886730,MDU6SXNzdWUxNTc4ODY3MzA=,864,TypeError: invalid type promotion when reading multi-file dataset,743508,closed,0,,,3,2016-06-01T11:44:49Z,2019-01-27T21:54:49Z,2019-01-27T21:54:49Z,CONTRIBUTOR,,,,"I'm trying to select data from a collection of weather files. Xarray opens the multifile dataset perfectly, but when I try the following selection:

``` python
import numpy as np
import xarray as xr

cfsr_new = xr.open_mfdataset('*.grb2.nc')

# Pick a few grid points and a time window by position
lon_sel = np.array(cfsr_new.lon[np.array([3, 4, 8])])
lat_sel = np.array(cfsr_new.lat[np.array([2, 3, 4])])
time_sel = cfsr_new.time[100:200]

selection = cfsr_new.sel(lon=lon_sel, lat=lat_sel, time=time_sel)
selection.to_array()  # raises the TypeError shown below
```

I get:

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-38-3f04c6458da2> in <module>()
----> 1 selection.to_array()

/Users/<user>/anaconda/lib/python3.5/site-packages/xarray/core/dataset.py in to_array(self, dim, name)
   1847         data_vars = [self.variables[k] for k in self.data_vars]
   1848         broadcast_vars = broadcast_variables(*data_vars)
-> 1849         data = ops.stack([b.data for b in broadcast_vars], axis=0)
   1850 
   1851         coords = dict(self.coords)

/Users/<user>//anaconda/lib/python3.5/site-packages/xarray/core/ops.py in f(*args, **kwargs)
     65             else:
     66                 module = eager_module
---> 67             return getattr(module, name)(*args, **kwargs)
     68     else:
     69         def f(data, *args, **kwargs):

/Users/<user>//anaconda/lib/python3.5/site-packages/dask/array/core.py in stack(seq, axis)
   1754 
   1755     if all(a._dtype is not None for a in seq):
-> 1756         dt = reduce(np.promote_types, [a._dtype for a in seq])
   1757     else:
   1758         dt = None

/Users/<user>//anaconda/lib/python3.5/site-packages/toolz/functoolz.py in __call__(self, *args, **kwargs)
    217     def __call__(self, *args, **kwargs):
    218         try:
--> 219             return self._partial(*args, **kwargs)
    220         except TypeError:
    221             # If there was a genuine TypeError

TypeError: invalid type promotion


```
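
The traceback ends in `np.promote_types`, which raises when two dtypes have no common type. A small diagnostic sketch for finding the offending variables follows; `incompatible_pairs` is a hypothetical helper, and the variable names and dtypes in `example` are made up for illustration rather than taken from the CFSR files.

```python
import numpy as np
from itertools import combinations

def incompatible_pairs(dtypes):
    # dtypes: mapping of variable name -> NumPy dtype.
    # Returns the name pairs that np.promote_types refuses to combine.
    bad = []
    for (a, da), (b, db) in combinations(dtypes.items(), 2):
        try:
            np.promote_types(da, db)
        except TypeError:
            bad.append((a, b))
    return bad

# Hypothetical variable names and dtypes, for illustration only
example = {
    'temperature': np.dtype('float32'),
    'pressure': np.dtype('float64'),
    'valid_time': np.dtype('datetime64[ns]'),
}
pairs = incompatible_pairs(example)
print(pairs)
```

For a real dataset the mapping could be built with `{k: v.dtype for k, v in selection.data_vars.items()}`.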
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/864/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
142675134,MDU6SXNzdWUxNDI2NzUxMzQ=,799,Support for pathlib.Path,743508,closed,0,,,2,2016-03-22T14:53:48Z,2017-09-01T15:31:52Z,2017-09-01T15:31:52Z,CONTRIBUTOR,,,,"`pathlib.Path` IMHO is one of the best additions to Python. Would be nice if it were possible to open files from `Path` without having to cast to `str`
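
A sketch of the cast in question, kept in one place. `as_filename` is a hypothetical helper; `os.fspath` requires Python 3.6 or later, and on older versions `str(path)` does the same job for `Path` objects.

```python
import os
from pathlib import Path

def as_filename(path):
    # os.fspath returns str unchanged and converts Path objects to str
    return os.fspath(path)

name = as_filename(Path('data') / 'file.nc')
print(name == os.path.join('data', 'file.nc'))  # True
```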
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/799/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
195050684,MDU6SXNzdWUxOTUwNTA2ODQ=,1161,Generated Dask graph is huge - performance issue?,743508,closed,0,,,8,2016-12-12T18:35:12Z,2017-01-23T20:21:14Z,2017-01-23T20:21:14Z,CONTRIBUTOR,,,,"I've been trying to get around some performance issues when subsetting a set of netCDF files opened with `open_mfdataset`. I managed to print out the generated dask graph for one variable and it doesn't seem right: it's huge, with 5000 elements, and seems to have a getitem entry for every requested element of that variable.

The code that generates this selection looks roughly like:

```python

paths = WEATHER_MET['latlon'].glob('*_resampled.nc')
dataset = xr.open_mfdataset([str(p) for p in paths])
selection = dataset.sel(time=time_sel).sel_points(method='nearest', tolerance=0.1, lon=lon, lat=lat)
selection *= weights
```

and the graph for one variable in the selection (the irradiance value) looks like this:

[mydask.pdf](https://github.com/pydata/xarray/files/646830/mydask.pdf)
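
For comparison, here is the difference between one getitem per point and a single pointwise lookup, sketched with plain NumPy on a made-up grid (the array and index values are hypothetical). Dask arrays expose a similar pointwise indexer, `vindex`, which is not used here.

```python
import numpy as np

# Hypothetical grid: 4 latitudes x 5 longitudes
grid = np.arange(20).reshape(4, 5)
lat_idx = np.array([0, 2, 3])
lon_idx = np.array([1, 1, 4])

# One getitem per point (what the graph above appears to contain)
per_point = np.array([grid[i, j] for i, j in zip(lat_idx, lon_idx)])

# A single vectorized pointwise lookup
vectorized = grid[lat_idx, lon_idx]
print(vectorized)  # [ 1 11 19]
```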
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1161/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue
195125296,MDExOlB1bGxSZXF1ZXN0OTc2NjMxMTg=,1162,#1161 WIP to vectorize isel_points,743508,closed,0,,,15,2016-12-13T00:19:46Z,2017-01-23T20:20:51Z,2017-01-23T20:20:47Z,CONTRIBUTOR,,0,pydata/xarray/pulls/1162,WIP to use dask vindex for point-based selection,"{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1162/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull