issue_comments

2 rows where issue = 479942077 and user = 47371188 sorted by updated_at descending

id: 527771975
html_url: https://github.com/pydata/xarray/issues/3213#issuecomment-527771975
issue_url: https://api.github.com/repos/pydata/xarray/issues/3213
node_id: MDEyOklzc3VlQ29tbWVudDUyNzc3MTk3NQ==
user: p-d-moore (47371188)
created_at: 2019-09-04T07:05:37Z
updated_at: 2019-09-04T07:05:37Z
author_association: NONE
body:

Thanks @crusaderky, appreciated. Might as well suggest it there.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: How should xarray use/support sparse arrays? (479942077)

id: 527762609
html_url: https://github.com/pydata/xarray/issues/3213#issuecomment-527762609
issue_url: https://api.github.com/repos/pydata/xarray/issues/3213
node_id: MDEyOklzc3VlQ29tbWVudDUyNzc2MjYwOQ==
user: p-d-moore (47371188)
created_at: 2019-09-04T06:32:21Z
updated_at: 2019-09-04T06:32:21Z
author_association: NONE
body:

I would like to add a request for sparse xarrays: support ffill and bfill operations along ordered dimensions (such as datetime coordinates) while preserving the sparsity of the data.

The challenge to overcome is that performing ffill on sparse data quickly produces data that is no longer "sparse" in practice, which makes working with it difficult.

My suggested implementation (and the way I have previously done this in another programming environment) is to represent the data as rows of contiguous regions with a single (non-sparse) value, rather than rows of single points. The contiguous dimensions could be any dimensions that are "ordered", such as datetime coordinates. That is, the data is then represented as a list of values + coordinate ranges rather than a list of values + coordinates.
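
A minimal sketch of this representation in Python (hypothetical names, not an existing xarray or sparse API): each stored entry is a value plus the half-open coordinate range it covers, and a forward fill simply extends each observation's run to the start of the next one, so the number of stored entries never grows.

from bisect import bisect_right
from dataclasses import dataclass
import math

@dataclass
class Run:
    start: float  # first coordinate the value applies to (inclusive)
    end: float    # end of the covered range (exclusive)
    value: float

def ffill_runs(points, horizon=math.inf):
    # points: (coordinate, value) pairs sorted by coordinate.
    # Each value is extended until the next observation (or `horizon`),
    # so three observations still produce exactly three stored runs.
    ends = [c for c, _ in points[1:]] + [horizon]
    return [Run(c, e, v) for (c, v), e in zip(points, ends)]

def value_at(runs, coord):
    # Binary-search for the run whose range contains `coord`;
    # returns None if no run covers it (e.g. before the first observation).
    i = bisect_right([r.start for r in runs], coord) - 1
    if i >= 0 and coord < runs[i].end:
        return runs[i].value
    return None

For example, ffill_runs([(0, 1.0), (5, 2.0), (20, 3.0)]) stores three runs covering [0, 5), [5, 20) and [20, ∞), and value_at(runs, 7) returns 2.0, the forward-filled value.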

The idea is that you can compute operations like ffill without changing the sparsity of the matrix, and thus support the typical aggregating functions you might want to apply before collapsing the data into a non-sparse form (e.g. take the lag difference between the most recent value and the most recent value 20 days earlier, or compute a cross-sectional mean along a given dimension using the most recent data at each point in time). These operations become more useful once the data is "fuller", such as after a forward fill, but are often of little use when the data is very sparsely populated (since cross-sectional operations are unlikely to hit the sparse data across the different dimensions).
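
With that lookup in place, the lag-difference example above becomes two range queries on the run list. This continues the hypothetical sketch (reusing value_at from above), not an existing API:

def lag_diff(runs, coord, lag):
    # Difference between the most recent value at `coord` and the most
    # recent value `lag` earlier (e.g. 20 days), both read directly from
    # the forward-filled runs without ever densifying the data.
    now, before = value_at(runs, coord), value_at(runs, coord - lag)
    if now is None or before is None:
        return None
    return now - before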

Care must be taken to avoid "collisions" between sparse blocks of data, that is, to ensure that the stored blocks do not accidentally overlap. The implementation can get tricky, but I believe the goal is worthwhile.
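
One way the collision handling could look, continuing the same sketch: sort the runs by start coordinate and clip each run at the start of its successor, so a later observation supersedes the filled-forward tail of an earlier one.

def resolve_overlaps(runs):
    # Assumes at most pairwise overlap between consecutive runs, as
    # forward filling produces; a full implementation would also need
    # to handle runs nested entirely inside another.
    resolved = []
    for run in sorted(runs, key=lambda r: r.start):
        if resolved and resolved[-1].end > run.start:
            resolved[-1].end = run.start  # truncate the older block
        if run.end > run.start:           # drop now-empty blocks
            resolved.append(run)
    return resolved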

I am happy to expand on the request if the idea is not well expressed.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
issue: How should xarray use/support sparse arrays? (479942077)

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
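
For reference, the two rows shown above can be reproduced against a local copy of this database with Python's built-in sqlite3 module. The filename github.db is an assumption (any github-to-sqlite export containing this schema would do):

import sqlite3

# Assumed filename: a local export containing the issue_comments table above.
conn = sqlite3.connect("github.db")

# Same filter as this page: issue = 479942077, user = 47371188,
# sorted by updated_at descending.
rows = conn.execute(
    """
    SELECT id, updated_at, author_association, body
    FROM issue_comments
    WHERE issue = ? AND [user] = ?
    ORDER BY updated_at DESC
    """,
    (479942077, 47371188),
).fetchall()

for comment_id, updated_at, assoc, body in rows:
    print(comment_id, updated_at, assoc, body[:60])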