html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/3213#issuecomment-1534724554,https://api.github.com/repos/pydata/xarray/issues/3213,1534724554,IC_kwDOAMm_X85begnK,1197350,2023-05-04T12:51:59Z,2023-05-04T12:51:59Z,MEMBER,"> I suspect (but don't know, as I'm just a user of xarray, not a developer) that it's also not thoroughly _tested_.
Existing sparse testing is here: https://github.com/pydata/xarray/blob/main/xarray/tests/test_sparse.py
We would welcome enhancements to this!
","{""total_count"": 2, ""+1"": 2, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,479942077
https://github.com/pydata/xarray/issues/3213#issuecomment-1534001190,https://api.github.com/repos/pydata/xarray/issues/3213,1534001190,IC_kwDOAMm_X85bbwAm,1197350,2023-05-04T02:36:57Z,2023-05-04T02:36:57Z,MEMBER,"Hi @jdbutler and welcome! We would welcome this sort of contribution eagerly.
I would characterize our current support of sparse arrays as really just a proof of concept. When to use sparse and how to do it effectively is not well documented. Simply adding more documentation around the already-supported use cases would be a great place to start IMO.
My own exploration of this is described in [this Pangeo post](https://discourse.pangeo.io/t/conservative-region-aggregation-with-xarray-geopandas-and-sparse/2715). The use case is regridding. It touches on quite a few of the points you're interested in, in particular the integration with GeoDataFrame. Along similar lines, @dcherian has been working on using opt_einsum together with sparse in https://github.com/pangeo-data/xESMF/issues/222#issuecomment-1524041837 and https://github.com/pydata/xarray/issues/7764.
I'd also suggest catching up on what @martinfleis is doing with vector data cubes in [xvec](https://github.com/xarray-contrib/xvec). (See also [Pangeo post on this topic](https://discourse.pangeo.io/t/vector-data-cubes/2904).)
Of the three topics you enumerated, I'm most interested in the serialization one. However, I'd rather see serialization of sparse arrays prototyped in Zarr, as it's much more conducive to experimentation than NetCDF (which requires writing C to do anything custom). I would recommend exploring serialization from a sparse array in memory to a sparse format on disk via a [custom codec](https://numcodecs.readthedocs.io/). Zarr recently added support for a `meta_array` parameter that determines what array type is materialized by the codec pipeline (see https://github.com/zarr-developers/zarr-python/pull/1131). The use case there was loading data [direct to GPU](https://xarray.dev/blog/xarray-kvikio). In a way sparse is similar--it's an array container that is not numpy or dask.
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,479942077