id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type 1316423844,I_kwDOAMm_X85Odwik,6822,RuntimeError when formatting sparse-backed DataArray in f-string,1634164,closed,0,,,2,2022-07-25T07:58:11Z,2022-08-09T09:17:39Z,2022-08-08T15:11:35Z,NONE,,,,"### What happened? On upgrading from xarray 2022.3.0 to 2022.6.0, f-string formatting of sparse-backed DataArray raises an exception. ### What did you expect to happen? - Code does not error, or - A breaking change is listed in the [“Breaking changes”](https://docs.xarray.dev/en/stable/whats-new.html#breaking-changes) section of the docs. ### Minimal Complete Verifiable Example ```Python import pandas as pd import xarray as xr s = pd.Series( range(4), index=pd.MultiIndex.from_product([list(""ab""), list(""cd"")]), ) da = xr.DataArray.from_series(s, sparse=True) print(f""{da}"") ``` ### MVCE confirmation - [X] Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray. - [X] Complete example — the example is self-contained, including all data and the text of any traceback. - [X] Verifiable example — the example copy & pastes into an IPython prompt or [Binder notebook](https://mybinder.org/v2/gh/pydata/xarray/main?urlpath=lab/tree/doc/examples/blank_template.ipynb), returning the result. - [X] New issue — a search of GitHub Issues suggests this is not a duplicate. ### Relevant log output ```Python # xarray 2022.3.0: Coordinates: * level_0 (level_0) object 'a' 'b' * level_1 (level_1) object 'c' 'd' # xarray 2022.6.0: Traceback (most recent call last): File ""/home/khaeru/bug.py"", line 11, in print(f""{da}"") File ""/home/khaeru/.local/lib/python3.10/site-packages/xarray/core/common.py"", line 168, in __format__ return self.values.__format__(format_spec) File ""/home/khaeru/.local/lib/python3.10/site-packages/xarray/core/dataarray.py"", line 685, in values return self.variable.values File ""/home/khaeru/.local/lib/python3.10/site-packages/xarray/core/variable.py"", line 527, in values return _as_array_or_item(self._data) File ""/home/khaeru/.local/lib/python3.10/site-packages/xarray/core/variable.py"", line 267, in _as_array_or_item data = np.asarray(data) File ""/home/khaeru/.local/lib/python3.10/site-packages/sparse/_sparse_array.py"", line 229, in __array__ raise RuntimeError( RuntimeError: Cannot convert a sparse array to dense automatically. To manually densify, use the todense method. ``` ### Anything else we need to know? Along with the versions below, I have confirmed the error occurs with both sparse 0.12 and sparse 0.13. ### Environment
INSTALLED VERSIONS ------------------ commit: None python: 3.10.4 (main, Jun 29 2022, 12:14:53) [GCC 11.2.0] python-bits: 64 OS: Linux OS-release: 5.15.0-41-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_CA.UTF-8 LOCALE: ('en_CA', 'UTF-8') libhdf5: 1.10.7 libnetcdf: 4.8.1 xarray: 2022.6.0 pandas: 1.4.2 numpy: 1.22.4 scipy: 1.8.0 netCDF4: 1.5.8 pydap: None h5netcdf: 0.12.0 h5py: 3.6.0 Nio: None zarr: None cftime: 1.5.2 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.3.2 dask: 2022.01.0+dfsg distributed: 2022.01.0+ds.1 matplotlib: 3.5.1 cartopy: 0.20.2 seaborn: 0.11.2 numbagg: None fsspec: 2022.01.0 cupy: None pint: 0.18 sparse: 0.13.0 flox: None numpy_groupies: None setuptools: 62.1.0 pip: 22.0.2 conda: None pytest: 6.2.5 IPython: 7.31.1 sphinx: 4.5.0
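### Possible workaround

Until this is fixed, a minimal sketch of a workaround, reusing the MVCE above (`DataArray.copy(data=...)` and `sparse.COO.todense()` are existing APIs; everything else here is just illustration): either densify explicitly before formatting, or format the repr rather than the values.

```python
import pandas as pd
import xarray as xr

s = pd.Series(
    range(4),
    index=pd.MultiIndex.from_product([list('ab'), list('cd')]),
)
da = xr.DataArray.from_series(s, sparse=True)

# Densify the sparse.COO data explicitly (as the error message suggests),
# then format the dense copy ...
print(f'{da.copy(data=da.data.todense())}')

# ... or format the repr, which handles sparse data without densifying.
print(f'{da!r}')
```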
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/6822/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 606846911,MDExOlB1bGxSZXF1ZXN0NDA4OTY0MTM3,4007,Allow DataArray.to_series() without invoking sparse.COO.todense(),1634164,open,0,,,1,2020-04-25T20:15:16Z,2022-06-09T14:50:17Z,,FIRST_TIME_CONTRIBUTOR,,0,pydata/xarray/pulls/4007,"This adds some code (from iiasa/ixmp#317) that allows DataArray.to_series() to be called without invoking sparse.COO.todense() when that is the backing data type. I'm aware this needs some improvement to meet the standard of the existing codebase, so I hope I could ask for some guidance on how to address the following points (including whom to ask about them): - [ ] Make the same improvement in {DataArray,Dataset}.to_dataframe(). - [ ] Possibly move the code out of dataarray.py to a more appropriate location (where?). - [ ] Possibly check for sparse.COO explicitly instead of xarray.core.pycompat.sparse_array_type. Other SparseArray subclasses, e.g. DOK, may not have the same attributes. Standard items: - [ ] Tests added. - [x] Passes `isort -rc . && black . && mypy . && flake8` (Sort of: these wanted to modify 7 files beyond the one I touched; didn't commit these changes.) - [ ] Fully documented, including `whats-new.rst` for all changes and `api.rst` for new API.","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/4007/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull 503711327,MDU6SXNzdWU1MDM3MTEzMjc=,3381,concat() fails when args have sparse.COO data and different fill values,1634164,open,0,,,4,2019-10-07T21:54:06Z,2021-07-08T17:43:57Z,,NONE,,,,"#### MCVE Code Sample ```python import numpy as np import pandas as pd import sparse import xarray as xr # Indices and raw data foo = [f'foo{i}' for i in range(6)] bar = [f'bar{i}' for i in range(6)] raw = np.random.rand(len(foo) // 2, len(bar)) # DataArray a = xr.DataArray( data=sparse.COO.from_numpy(raw), coords=[foo[:3], bar], dims=['foo', 'bar']) print(a.data.fill_value) # 0.0 # Created from a pd.Series b_series = pd.DataFrame(raw, index=foo[3:], columns=bar) \ .stack() \ .rename_axis(index=['foo', 'bar']) b = xr.DataArray.from_series(b_series, sparse=True) print(b.data.fill_value) # nan # Works despite inconsistent fill-values a + b a * b # Fails: complains about inconsistent fill-values # xr.concat([a, b], dim='foo') # *** # The fill_value argument doesn't help # xr.concat([a, b], dim='foo', fill_value=np.nan) def fill_value(da): """"""Try to coerce one argument to a consistent fill-value."""""" return xr.DataArray( data=sparse.as_coo(da.data, fill_value=np.nan), coords=da.coords, dims=da.dims, name=da.name, attrs=da.attrs, ) # Fails: ""Cannot provide a fill-value in combination with something that # already has a fill-value"" # print(xr.concat([a.pipe(fill_value), b], dim='foo')) # If we cheat by recreating 'a' from scratch, copying the fill value of the # intended other argument, it works again: a = xr.DataArray( data=sparse.COO.from_numpy(raw, fill_value=b.data.fill_value), coords=[foo[:3], bar], dims=['foo', 'bar']) c = xr.concat([a, b], dim='foo') print(c.data.fill_value) # nan # But simple operations again create objects with potentially incompatible # fill-values d = c.sum(dim='bar') print(d.data.fill_value) # 0.0 ``` #### Expected `concat()` can be used 
without having to create new objects; i.e. the line marked `***` just works. #### Problem Description Some basic xarray manipulations don't work on `sparse.COO`-backed objects. xarray should automatically coerce objects into a compatible state, or at least provide users with methods to do so. Behaviour should also be documented, e.g. in this instance, which operations (here, `.sum()`) change properties of the underlying sparse storage (in the example above, the fill value reverts to 0.0) in ways that necessitate some kind of (re-)conversion. #### Output of ``xr.show_versions()``
INSTALLED VERSIONS ------------------ commit: None python: 3.7.3 (default, Aug 20 2019, 17:04:43) [GCC 8.3.0] python-bits: 64 OS: Linux OS-release: 5.0.0-32-generic machine: x86_64 processor: x86_64 byteorder: little LC_ALL: None LANG: en_CA.UTF-8 LOCALE: en_CA.UTF-8 libhdf5: 1.10.4 libnetcdf: 4.6.2 xarray: 0.13.0 pandas: 0.25.0 numpy: 1.17.2 scipy: 1.2.1 netCDF4: 1.4.2 pydap: None h5netcdf: 0.7.1 h5py: 2.8.0 Nio: None zarr: None cftime: 1.0.3.4 nc_time_axis: None PseudoNetCDF: None rasterio: None cfgrib: None iris: None bottleneck: 1.2.1 dask: 2.1.0 distributed: None matplotlib: 3.1.1 cartopy: 0.17.0 seaborn: 0.9.0 numbagg: None setuptools: 40.8.0 pip: 19.2.3 conda: None pytest: 5.0.1 IPython: 5.8.0 sphinx: 2.2.0
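#### Possible workaround

A minimal sketch of a user-side helper — hypothetical, not an existing xarray API, and it assumes the backing data are `sparse.COO` — that rebuilds one operand with a matching fill value, without densifying, before concatenating as above.

```python
import numpy as np
import sparse
import xarray as xr

def with_fill_value(da, fill_value):
    # Hypothetical helper, not an xarray API: rebuild the sparse.COO data with
    # an explicit fill value, keeping only the stored elements. Note that this
    # changes what the *unstored* elements mean, which is a deliberate choice.
    coo = da.data
    rebuilt = sparse.COO(coo.coords, coo.data, shape=coo.shape, fill_value=fill_value)
    return da.copy(data=rebuilt)

# With `a` and `b` from the example above, this should then work, matching the
# 'cheat' shown there:
# c = xr.concat([with_fill_value(a, np.nan), b], dim='foo')
```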
","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3381/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,issue 143764621,MDU6SXNzdWUxNDM3NjQ2MjE=,805,pd.Period can't be used as a 1-element coord,1634164,closed,0,,,5,2016-03-27T00:45:52Z,2016-12-24T00:09:48Z,2016-12-24T00:09:48Z,NONE,,,,"With xarray 0.7.2, following [this basic example from the docs](http://xarray.pydata.org/en/stable/data-structures.html#creating-a-dataset), but with a modification in the last line to use `pd.Period` instead of `pd.Timestamp`: ``` python import numpy as np import xarray as xr temp = 15 + 8 * np.random.randn(2, 2, 3) precip = 10 * np.random.rand(2, 2, 3) lon = [[-99.83, -99.32], [-99.79, -99.23]] lat = [[42.25, 42.21], [42.63, 42.59]] ds = xr.Dataset({'temperature': (['x', 'y', 'time'], temp), 'precipitation': (['x', 'y', 'time'], precip)}, coords={'lon': (['x', 'y'], lon), 'lat': (['x', 'y'], lat), 'time': pd.date_range('2014-09-06', periods=3), 'reference_time': pd.Period('2014')}) ``` This raises: ``` ValueError: dimensions ('reference_time',) must have the same length as the number of data dimensions, ndim=0 ``` I noticed (#645) that there are other issues stemming from pandas' PeriodIndex & company, so if this is not a straightforward fix I will understand! ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/805/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue 70805273,MDExOlB1bGxSZXF1ZXN0MzQwODk5MDk=,401,Handle bool in NetCDF4 conversion,1634164,closed,0,,,9,2015-04-24T21:59:08Z,2016-05-26T18:51:06Z,2016-05-23T04:54:40Z,FIRST_TIME_CONTRIBUTOR,,0,pydata/xarray/pulls/401,"I am working on some code that creates `xray.Datasets` with a 'bool' dtype. Trying to call `Dataset.to_netcdf()` on this code causes `_nc4_values_and_dtype()` to raise a `ValueError`, so I added these few lines to force the storage of these variables as 1-byte integers. Perhaps it should be 'u1' instead; I can change that if need be. ","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/401/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,,13221727,pull