issue_comments

5 rows where issue = 638947370 sorted by updated_at descending

user 3

  • dcherian 2
  • fujiisoup 2
  • dschwoerer 1

author_association 2

  • MEMBER 4
  • CONTRIBUTOR 1

issue 1

  • writing sparse to netCDF · 5 ✖
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
844090889 https://github.com/pydata/xarray/issues/4156#issuecomment-844090889 https://api.github.com/repos/pydata/xarray/issues/4156 MDEyOklzc3VlQ29tbWVudDg0NDA5MDg4OQ== dcherian 2448579 2021-05-19T13:09:25Z 2021-05-19T13:09:25Z MEMBER

There is a more standards-compliant version here: https://github.com/pydata/xarray/issues/1077#issuecomment-644803374

This is still blocked on choosing which CF representation to use for sparse vs which one to use for MultiIndex.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  writing sparse to netCDF 638947370
843971807 https://github.com/pydata/xarray/issues/4156#issuecomment-843971807 https://api.github.com/repos/pydata/xarray/issues/4156 MDEyOklzc3VlQ29tbWVudDg0Mzk3MTgwNw== dschwoerer 5637662 2021-05-19T10:33:08Z 2021-05-19T10:33:08Z CONTRIBUTOR

I have hacked together something that supports reading and writing sparse arrays to a netCDF file, but I didn't know how or where to put it within xarray.

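```
def ds_to_netcdf(ds, fn):
    """Write ds to fn, storing sparse variables as flat index + values."""
    dsorg = ds
    ds = dsorg.copy()
    # Iterate over a snapshot of the names, since variables are added below.
    for v in list(ds):
        if hasattr(ds[v].data, "nnz") and (
            hasattr(ds[v].data, "to_coo") or hasattr(ds[v].data, "linear_loc")
        ):
            # The naming must match what xr_open_dataset looks for.
            coord = f"_{v}_xarray_index_"
            assert coord not in ds
            data = ds[v].data
            if hasattr(data, "to_coo"):
                data = data.to_coo()
            # Flat (row-major) positions of the stored elements.
            ds[coord] = coord, data.linear_loc()
            dims = ds[v].dims
            ds[coord].attrs["compress"] = " ".join(dims)
            at = ds[v].attrs
            ds[v] = coord, data.data
            ds[v].attrs = at
            ds[v].attrs["_fill_value"] = str(data.fill_value)
            # Dimensions without a coordinate would otherwise lose their length.
            for d in dims:
                if d not in ds:
                    ds[f"_len_{d}"] = dsorg.sizes[d]

    print(ds)
    ds.to_netcdf(fn)
```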

```
import numpy as np
import sparse
import xarray as xr


def xr_open_dataset(fn):
    """Open fn and rebuild sparse variables written by ds_to_netcdf."""
    ds = xr.open_dataset(fn)

    def fromflat(shape, i):
        # Convert flat (row-major) positions into per-dimension indices.
        index = []
        for fac in shape[::-1]:
            index.append(i % fac)
            i = i // fac  # not in place, so the input array is left untouched
        return tuple(index[::-1])

    for c in list(ds.coords):
        if "compress" in ds[c].attrs:
            # Expect names of the form "_<var>_xarray_index_".
            vs = c.split("_")
            if len(vs) < 5:
                continue
            if vs[-1] != "" or vs[-2] != "index" or vs[-3] != "xarray":
                continue
            v = "_".join(vs[1:-3])
            dat = ds[v].data
            fill = ds[v].attrs.pop("_fill_value", None)
            if fill is not None:
                knownfails = {"nan": np.nan, "False": False, "True": True}
                if fill in knownfails:
                    fill = knownfails[fill]
                else:
                    # np.fromstring is deprecated; parse via the dtype's scalar type.
                    fill = dat.dtype.type(fill)
            dims = ds[c].attrs["compress"].split()
            shape = []
            for d in dims:
                try:
                    shape.append(len(ds[d]))
                except KeyError:
                    shape.append(int(ds[f"_len_{d}"].data))
                    ds = ds.drop_vars(f"_len_{d}")

            locs = fromflat(shape, ds[c].data)
            data = sparse.COO(locs, ds[v].data, tuple(shape), fill_value=fill)
            ds[v] = dims, data, ds[v].attrs, ds[v].encoding
    print(ds)
    return ds
```
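
For reference, the `fromflat` helper appears to do the same job as NumPy's `np.unravel_index` with the default C (row-major) order, and the flat positions it consumes are what `np.ravel_multi_index` produces. A minimal sketch to check that, assuming row-major flattening; the shape and the flat positions are made up for illustration and do not come from the issue:

```python
import numpy as np


def fromflat(shape, i):
    """Convert flat (row-major) positions into one index array per dimension."""
    index = []
    for fac in shape[::-1]:
        index.append(i % fac)
        i = i // fac
    return tuple(index[::-1])


shape = (3, 4, 5)                  # illustrative shape
flat = np.array([0, 7, 23, 59])    # illustrative flat positions

manual = fromflat(shape, flat)
builtin = np.unravel_index(flat, shape)

# Both approaches give the same per-dimension indices ...
same = all(np.array_equal(m, b) for m, b in zip(manual, builtin))
# ... and ravel_multi_index maps them back to the flat positions.
roundtrip = np.ravel_multi_index(builtin, shape)
```

If the two agree for the shapes in use, `np.unravel_index` could probably replace the hand-written helper.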

Has there been any progress since last year?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  writing sparse to netCDF 638947370
644417331 https://github.com/pydata/xarray/issues/4156#issuecomment-644417331 https://api.github.com/repos/pydata/xarray/issues/4156 MDEyOklzc3VlQ29tbWVudDY0NDQxNzMzMQ== fujiisoup 6815844 2020-06-15T22:13:50Z 2020-06-15T22:13:50Z MEMBER

Do we already have a similar encoding (and decoding) scheme for writing (and reading) data? (Does CFTime use a similar scheme?) I don't think we have a scheme for saving a multiindex yet; it has to be converted manually with reset_index.

#1077

Maybe we can decide this encoding-decoding API before #1603.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  writing sparse to netCDF 638947370
644372749 https://github.com/pydata/xarray/issues/4156#issuecomment-644372749 https://api.github.com/repos/pydata/xarray/issues/4156 MDEyOklzc3VlQ29tbWVudDY0NDM3Mjc0OQ== dcherian 2448579 2020-06-15T20:32:01Z 2020-06-15T20:32:01Z MEMBER

Yes, I think we will have to "encode" to something like this example:

    dimensions:
      lat=73;
      lon=96;
      landpoint=2381;
      depth=4;
    variables:
      int landpoint(landpoint);
        landpoint:compress="lat lon";
      float landsoilt(depth,landpoint);
        landsoilt:long_name="soil temperature";
        landsoilt:units="K";
      float depth(depth);
      float lat(lat);
      float lon(lon);
    data:
      landpoint=363, 364, 365, ...;

and then write that "encoded" dataset to file.
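
A rough sketch of what that encoding step could look like for a single 2-D field, using plain NumPy so it stays self-contained. It assumes the index variable holds zero-based, row-major positions into the flattened `lat` x `lon` grid and that missing points are marked with NaN; the grid size and values are invented for illustration:

```python
import numpy as np

# Illustrative 2-D field on a (lat, lon) grid; NaN marks points that are not stored.
lat, lon = 3, 4
field = np.full((lat, lon), np.nan)
field[0, 1] = 280.0
field[1, 3] = 285.5
field[2, 0] = 290.25

# Encode: "landpoint" holds flat row-major positions into the lat*lon grid,
# which is what the compress="lat lon" attribute describes.
landpoint = np.flatnonzero(~np.isnan(field))
landsoilt = field.ravel()[landpoint]

# Decode: scatter the stored values back onto the full grid.
decoded = np.full(lat * lon, np.nan)
decoded[landpoint] = landsoilt
decoded = decoded.reshape(lat, lon)
```

Only `landpoint`, `landsoilt`, and the lengths of `lat` and `lon` would need to be written to the file; the dense grid is recoverable from those.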

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  writing sparse to netCDF 638947370
644368878 https://github.com/pydata/xarray/issues/4156#issuecomment-644368878 https://api.github.com/repos/pydata/xarray/issues/4156 MDEyOklzc3VlQ29tbWVudDY0NDM2ODg3OA== fujiisoup 6815844 2020-06-15T20:27:37Z 2020-06-15T20:27:37Z MEMBER

@dcherian Though I have no experience with this gather compression, it looks like python-netcdf4 does not have this function implemented.

One thing we can do is sparse -> multiindex -> reset_index -> netCDF, or maybe we can even add a function that skips constructing a multiindex and just makes flattened index arrays from a sparse array.
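
The second option, skipping the multiindex and writing one index array per dimension, might look roughly like this. The sketch works on a dense NumPy array with zero as the fill value so that it runs without extra dependencies; with a `sparse.COO` array, its `coords` and `data` attributes would presumably take the place of the index arrays and `values`. All variable names here are hypothetical:

```python
import numpy as np

dense = np.array([[0.0, 1.5, 0.0],
                  [0.0, 0.0, 2.5],
                  [3.5, 0.0, 0.0]])

# One 1-D index array per dimension; all of them share a new "point" dimension.
index_x, index_y = np.nonzero(dense)
values = dense[index_x, index_y]

# Reading back: rebuild the dense array from the index arrays and the shape.
restored = np.zeros(dense.shape, dtype=values.dtype)
restored[index_x, index_y] = values
```

Compared with a single flat index, this layout stores one integer per dimension per element, but it does not need the original dimension lengths to interpret the indices.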

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  writing sparse to netCDF 638947370

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);