id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
503711327,MDU6SXNzdWU1MDM3MTEzMjc=,3381,concat() fails when args have sparse.COO data and different fill values,1634164,open,0,,,4,2019-10-07T21:54:06Z,2021-07-08T17:43:57Z,,NONE,,,,"#### MCVE Code Sample

```python
import numpy as np
import pandas as pd
import sparse
import xarray as xr

# Indices and raw data
foo = [f'foo{i}' for i in range(6)]
bar = [f'bar{i}' for i in range(6)]
raw = np.random.rand(len(foo) // 2, len(bar))

# DataArray
a = xr.DataArray(
    data=sparse.COO.from_numpy(raw),
    coords=[foo[:3], bar],
    dims=['foo', 'bar'])

print(a.data.fill_value)  # 0.0


# Created from a pd.Series
b_series = pd.DataFrame(raw, index=foo[3:], columns=bar) \
             .stack() \
             .rename_axis(index=['foo', 'bar'])
b = xr.DataArray.from_series(b_series, sparse=True)

print(b.data.fill_value)  # nan


# Works despite inconsistent fill-values
a + b
a * b


# Fails: complains about inconsistent fill-values
# xr.concat([a, b], dim='foo')  # ***

# The fill_value argument doesn't help
# xr.concat([a, b], dim='foo', fill_value=np.nan)


def fill_value(da):
    """"""Try to coerce one argument to a consistent fill-value.""""""
    return xr.DataArray(
        data=sparse.as_coo(da.data, fill_value=np.nan),
        coords=da.coords,
        dims=da.dims,
        name=da.name,
        attrs=da.attrs,
        )


# Fails: ""Cannot provide a fill-value in combination with something that
# already has a fill-value""
# print(xr.concat([a.pipe(fill_value), b], dim='foo'))


# If we cheat by recreating 'a' from scratch, copying the fill value of the
# intended other argument, it works again:
a = xr.DataArray(
    data=sparse.COO.from_numpy(raw, fill_value=b.data.fill_value),
    coords=[foo[:3], bar],
    dims=['foo', 'bar'])
c = xr.concat([a, b], dim='foo')

print(c.data.fill_value)  # nan

# But simple operations again create objects with potentially incompatible
# fill-values
d = c.sum(dim='bar')
print(d.data.fill_value)  # 0.0
```

#### Expected

`concat()` can be used without having to create new objects; i.e. the line marked `***` just works.

#### Problem Description
Some basic xarray manipulations don't work on `sparse.COO`-backed objects.

xarray should automatically coerce objects into a compatible state, or at least provide users with methods to do so. Behaviour should also be documented, e.g. in this instance, which operations (here, `.sum()`) modify the underlying storage format in ways that necessitate some kind of (re-)conversion.

#### Output of ``xr.show_versions()``
<details>

INSTALLED VERSIONS                                                                                                                                                               
------------------                                                                                                                                                               
commit: None
python: 3.7.3 (default, Aug 20 2019, 17:04:43)
[GCC 8.3.0]
python-bits: 64
OS: Linux
OS-release: 5.0.0-32-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8
libhdf5: 1.10.4
libnetcdf: 4.6.2

xarray: 0.13.0
pandas: 0.25.0
numpy: 1.17.2
scipy: 1.2.1
netCDF4: 1.4.2
pydap: None
h5netcdf: 0.7.1
h5py: 2.8.0
Nio: None
zarr: None
cftime: 1.0.3.4
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: 1.2.1
dask: 2.1.0
distributed: None
matplotlib: 3.1.1
cartopy: 0.17.0
seaborn: 0.9.0
numbagg: None
setuptools: 40.8.0
pip: 19.2.3
conda: None
pytest: 5.0.1
IPython: 5.8.0
sphinx: 2.2.0
</details>
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3381/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
489825483,MDU6SXNzdWU0ODk4MjU0ODM=,3281,"[proposal] concatenate by axis, ignore dimension names",1200058,open,0,,,4,2019-09-05T15:06:22Z,2021-07-08T17:42:53Z,,NONE,,,,"Hi, I wrote a helper function which allows to concatenate arrays like `xr.combine_nested` with the difference that it only supports `xr.DataArrays`, concatenates them by axis position similar to `np.concatenate` and overwrites all dimension names.

I often need this to combine very different feature types.

```python
from typing import Union, Tuple, List
import numpy as np
import xarray as xr

def concat_by_axis(
        darrs: Union[List[xr.DataArray], Tuple[xr.DataArray]],
        dims: Union[List[str], Tuple[str]],
        axis: int = None,
        **kwargs
):
    """"""
    Concat arrays along some axis similar to `np.concatenate`. Automatically renames the dimensions to `dims`.
    Please note that this renaming happens by the axis position, therefore make sure to transpose all arrays
    to the correct dimension order.

    :param darrs: List or tuple of xr.DataArrays
    :param dims: The dimension names of the resulting array. Renames axes where necessary.
    :param axis: The axis which should be concatenated along
    :param kwargs: Additional arguments which will be passed to `xr.concat()`
    :return: Concatenated xr.DataArray with dimensions `dim`.
    """"""

    # Get depth of nested lists. Assumes `darrs` is correctly formatted as list of lists.
    if axis is None:
        axis = 0
        l = darrs
        # while l is a list or tuple and contains elements:
        while isinstance(l, List) or isinstance(l, Tuple) and l:
            # increase depth by one
            axis -= 1
            l = l[0]
        if axis == 0:
            raise ValueError(""`darrs` has to be a (possibly nested) list or tuple of xr.DataArrays!"")

    to_concat = list()
    for i, da in enumerate(darrs):
        # recursive call for nested arrays;
        # most inner call should have axis = -1,
        # most outer call should have axis = - depth_of_darrs
        if isinstance(da, list) or isinstance(da, tuple):
            da = concat_axis(da, dims=dims, axis=axis + 1, **kwargs)

        if not isinstance(da, xr.DataArray):
            raise ValueError(""Input %d must be a xr.DataArray"" % i)
        if len(da.dims) != len(dims):
            raise ValueError(""Input %d must have the same number of dimensions as specified in the `dims` argument!"" % i)

        # force-rename dimensions
        da = da.rename(dict(zip(da.dims, dims)))

        to_concat.append(da)

    return xr.concat(to_concat, dim=dims[axis], **kwargs)

```

Would it make sense to include this in xarray?","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3281/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue
223231729,MDU6SXNzdWUyMjMyMzE3Mjk=,1379,xr.concat consuming too much resources,7799184,open,0,,,4,2017-04-20T23:33:52Z,2021-07-08T17:42:18Z,,CONTRIBUTOR,,,,"Hi,
I am reading in several (~1000) small ascii files into Dataset objects and trying to concatenate them over one specific dimension but I eventually blow my memory up. The file glob is not huge (~700M, my computer has ~16G) and I can do it fine if I only read in the Datasets appending them to a list without concatenating them (my memory increases by 5% only or so by the time I had read them all).

However, when trying to concatenate each file into one single Dataset upon reading over a loop, the processing speeds drastically reduce before I have read 10% of the files or so and my memory usage keeps going up until it eventually blows up before I read and concatenate 30% of these files (the screenshot below was taken before it blew up, the memory usage was under 20% by the start of the processing).

I was wondering if this is expected, or if there something that could be improved to make that work more efficiently please. I'm changing my approach now by extracting numpy arrays from the individual Datasets, concatenating these numpy arrays and defining the final Dataset only at the end.

Thanks.

![screenshot from 2017-04-21 11-14-27](https://cloud.githubusercontent.com/assets/7799184/25256452/e7cdd4b4-2684-11e7-9c27-e28c76317a77.png)
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/1379/reactions"", ""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue