**Comment thread on pydata/xarray issue #4156**

---

**user 2448579 (MEMBER)** · 2021-05-19T13:09:25Z · https://github.com/pydata/xarray/issues/4156#issuecomment-844090889

There is a more standards-compliant version here: https://github.com/pydata/xarray/issues/1077#issuecomment-644803374

This is still blocked on choosing which CF representation to use for sparse vs. which one to use for MultiIndex.

---

**user 5637662 (CONTRIBUTOR)** · 2021-05-19T10:33:08Z · https://github.com/pydata/xarray/issues/4156#issuecomment-843971807

I have hacked together something that supports reading and writing sparse arrays to a netCDF file; however, I didn't know how and where to put this within xarray.

```python
import numpy as np
import sparse
import xarray as xr


def ds_to_netcdf(ds, fn):
    """Write a dataset with sparse variables to netCDF, storing each sparse
    variable as its nonzero values plus a flattened (linear) index."""
    dsorg = ds
    ds = dsorg.copy()
    for v in ds:
        # Duck-type check for a sparse array (COO, or convertible to COO).
        if hasattr(ds[v].data, "nnz") and (
            hasattr(ds[v].data, "to_coo") or hasattr(ds[v].data, "linear_loc")
        ):
            coord = f"_{v}_xarray_index_"
            assert coord not in ds
            data = ds[v].data
            if hasattr(data, "to_coo"):
                data = data.to_coo()
            # Store the linearized indices of the nonzero entries, with the
            # CF-style "compress" attribute naming the original dimensions.
            ds[coord] = coord, data.linear_loc()
            dims = ds[v].dims
            ds[coord].attrs["compress"] = " ".join(dims)
            at = ds[v].attrs
            ds[v] = coord, data.data
            ds[v].attrs = at
            ds[v].attrs["_fill_value"] = str(data.fill_value)
            # Record the size of any dimension without a coordinate variable,
            # since it cannot be recovered from the file otherwise.
            # (sizes[d], because len(dsorg[d]) raises KeyError for such dims.)
            for d in dims:
                if d not in ds:
                    ds[f"_len_{d}"] = dsorg.sizes[d]
    print(ds)
    ds.to_netcdf(fn)
```

```python
def xr_open_dataset(fn):
    """Open a netCDF file written by ds_to_netcdf, rebuilding sparse.COO
    variables from their stored flat indices."""
    ds = xr.open_dataset(fn)

    def fromflat(shape, i):
        # Inverse of linear_loc: flat indices -> per-dimension indices.
        index = []
        for fac in shape[::-1]:
            index.append(i % fac)
            i //= fac
        return tuple(index[::-1])

    for c in ds.coords:
        if "compress" in ds[c].attrs:
            vs = c.split("_")
            if len(vs) < 5:
                continue
            if vs[-1] != "" or vs[-2] != "index" or vs[-3] != "xarray":
                continue
            v = "_".join(vs[1:-3])
            dat = ds[v].data
            fill = ds[v].attrs.pop("_fill_value", None)
            if fill:
                # The fill value was stored as a string; special-case values
                # that don't round-trip through the dtype constructor.
                knownfails = {"nan": np.nan, "False": False, "True": True}
                if fill in knownfails:
                    fill = knownfails[fill]
                else:
                    fill = dat.dtype.type(fill)
            dims = ds[c].attrs["compress"].split()
            shape = []
            for d in dims:
                try:
                    shape.append(len(ds[d]))
                except KeyError:
                    shape.append(int(ds[f"_len_{d}"].data))
                    ds = ds.drop_vars(f"_len_{d}")
            locs = fromflat(shape, ds[c].data)
            data = sparse.COO(locs, ds[v].data, shape, fill_value=fill)
            ds[v] = dims, data, ds[v].attrs, ds[v].encoding
    print(ds)
    return ds
```

Has there been any progress since last year?
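(For reference, a roundtrip with the helpers above might look like this — a minimal sketch with made-up data; `var`, the `x`/`y` dims, and the file name are all illustrative:)

```python
import numpy as np
import sparse
import xarray as xr

# Three nonzero values in a 4x5 sparse array, wrapped by xarray.
ix = np.array([[0, 1, 3], [2, 0, 4]])                 # (ndim, nnz) indices
arr = sparse.COO(ix, np.array([1.0, 2.0, 3.0]), shape=(4, 5))
ds = xr.Dataset({"var": (("x", "y"), arr)})

ds_to_netcdf(ds, "sparse.nc")        # stores flat indices + nonzero values
roundtripped = xr_open_dataset("sparse.nc")
assert isinstance(roundtripped["var"].data, sparse.COO)
```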
---

**user 6815844 (MEMBER)** · 2020-06-15T22:13:50Z · https://github.com/pydata/xarray/issues/4156#issuecomment-644417331

Do we already have a similar *encoding* (and decoding) scheme for writing (and reading) data? (Does CFTime use a similar scheme?) I think we don't have a scheme to save a MultiIndex yet; it has to be converted manually with `reset_index` (#1077). Maybe we can decide this encoding/decoding API before #1603.

---

**user 2448579 (MEMBER)** · 2020-06-15T20:32:01Z · https://github.com/pydata/xarray/issues/4156#issuecomment-644372749

Yes, I think we will have to "encode" to something like this example

```
dimensions:
  lat=73;
  lon=96;
  landpoint=2381;
  depth=4;
variables:
  int landpoint(landpoint);
    landpoint:compress="lat lon";
  float landsoilt(depth,landpoint);
    landsoilt:long_name="soil temperature";
    landsoilt:units="K";
  float depth(depth);
  float lat(lat);
  float lon(lon);
data:
  landpoint=363, 364, 365, ...;
```

and then write that "encoded" dataset to file.

---

**user 6815844 (MEMBER)** · 2020-06-15T20:27:37Z · https://github.com/pydata/xarray/issues/4156#issuecomment-644368878

@dcherian Though I have no experience with this gather compression, it looks like python-netCDF4 does not have it implemented. One thing we can do is `sparse -> multiindex -> reset_index -> netCDF`, or maybe we can even add a function that skips constructing a MultiIndex and just builds flattened index arrays from a sparse array.
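(A minimal sketch of that last idea — going straight from a `sparse.COO` array to CF-style "gathered" variables and back, without building a MultiIndex. The variable names follow the CDL example above; the sizes, values, and file name are made up:)

```python
import numpy as np
import sparse
import xarray as xr

# A small sparse 2-D array standing in for the lat x lon field above.
ix = np.array([[0, 1, 3], [2, 0, 1]])                 # (ndim, nnz) indices
arr = sparse.COO(ix, np.array([280.0, 281.5, 279.2]), shape=(4, 4))

# Encode: keep only the nonzero values plus a flat ("gathered") index,
# with the CF "compress" attribute naming the original dimensions.
flat = np.ravel_multi_index(arr.coords, arr.shape)
enc = xr.Dataset(
    {"landsoilt": ("landpoint", arr.data)},
    coords={
        "landpoint": ("landpoint", flat, {"compress": "lat lon"}),
        # Keep lat/lon as real dimensions so the shape survives the
        # roundtrip, as in the CDL example.
        "lat": np.arange(arr.shape[0]),
        "lon": np.arange(arr.shape[1]),
    },
)
enc.to_netcdf("gathered.nc")

# Decode: unravel the flat index and rebuild the sparse array directly.
dec = xr.open_dataset("gathered.nc")
shape = (dec.sizes["lat"], dec.sizes["lon"])
locs = np.unravel_index(dec["landpoint"].data, shape)
restored = sparse.COO(np.stack(locs), dec["landsoilt"].data, shape=shape)
assert np.array_equal(restored.todense(), arr.todense())
```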