issue_comments

7 rows where issue = 342180429 sorted by updated_at descending

user 4

  • shoyer 3
  • fujiisoup 2
  • mrocklin 1
  • max-sixty 1

issue 1

  • Making xarray math lazy · 7

author_association 1

  • MEMBER 7
id html_url issue_url node_id user created_at updated_at ▲ author_association body reactions performed_via_github_app issue
1094068301 https://github.com/pydata/xarray/issues/2298#issuecomment-1094068301 https://api.github.com/repos/pydata/xarray/issues/2298 IC_kwDOAMm_X85BNihN max-sixty 5635139 2022-04-09T15:31:31Z 2022-04-09T15:31:31Z MEMBER

Any thoughts on the current status of this?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Making xarray math lazy 342180429
406672721 https://github.com/pydata/xarray/issues/2298#issuecomment-406672721 https://api.github.com/repos/pydata/xarray/issues/2298 MDEyOklzc3VlQ29tbWVudDQwNjY3MjcyMQ== shoyer 1217238 2018-07-20T17:35:26Z 2018-07-20T17:35:26Z MEMBER

Indeed, I really like the look of https://github.com/dask/dask/issues/2538 and its implementation in https://github.com/dask/dask/pull/2608. It doesn't solve the indexing optimization yet but that could be pretty straightforward to add -- especially once we add a notion of explicit indexing types (basic vs outer vs vectorized) directly into dask.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Making xarray math lazy 342180429
406572020 https://github.com/pydata/xarray/issues/2298#issuecomment-406572020 https://api.github.com/repos/pydata/xarray/issues/2298 MDEyOklzc3VlQ29tbWVudDQwNjU3MjAyMA== mrocklin 306380 2018-07-20T11:20:59Z 2018-07-20T11:20:59Z MEMBER

Two thoughts:

  1. We can push some of this into Dask with https://github.com/dask/dask/issues/2538
  2. The full lazy ndarray solution would be a good application of the __array_function__ protocol
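
As a rough illustration of the second point, the sketch below shows how a class can intercept NumPy function calls through `__array_function__` (NEP 18) and record them instead of running them. The `LazyArray` class and its `wrap` and `compute` methods are hypothetical names invented for this example; they are not part of NumPy, dask, or xarray, and the sketch only handles lazy arguments passed at the top level of a call.

```python
import numpy as np


class LazyArray:
    """Hypothetical lazy wrapper: records NumPy calls, runs them on compute()."""

    def __init__(self, func, args=(), kwargs=None):
        self.func = func
        self.args = args
        self.kwargs = kwargs or {}

    @classmethod
    def wrap(cls, data):
        # A leaf node simply returns the concrete array when computed.
        array = np.asarray(data)
        return cls(lambda: array)

    def __array_function__(self, func, types, args, kwargs):
        # NumPy routes functions such as np.sum here instead of evaluating
        # them, so the call can be stored for later.
        return LazyArray(func, args, kwargs)

    def compute(self):
        # Evaluate nested lazy arguments first, then the recorded function.
        args = [a.compute() if isinstance(a, LazyArray) else a for a in self.args]
        return self.func(*args, **self.kwargs)


lazy = LazyArray.wrap([[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]])
expr = np.sum(np.transpose(lazy), axis=0)  # nothing is computed yet
result = expr.compute()                    # transpose, then sum, run here
```

Arithmetic operators and ufuncs such as `np.add` are dispatched through the separate `__array_ufunc__` hook, so a fuller lazy array would need to implement that as well.
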
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Making xarray math lazy 342180429
406449411 https://github.com/pydata/xarray/issues/2298#issuecomment-406449411 https://api.github.com/repos/pydata/xarray/issues/2298 MDEyOklzc3VlQ29tbWVudDQwNjQ0OTQxMQ== shoyer 1217238 2018-07-20T00:02:34Z 2018-07-20T00:02:34Z MEMBER

> Therefore, personally, I'd like to see this lazy math implemented as a lazy array. The API I have in mind is .to_lazy(), which converts the backend to a lazy array, similar to how .chunk() converts the backend to a dask array.

This is not a bad idea, but the version of lazy arithmetic that I have been contemplating (see https://github.com/pydata/xarray/pull/2302) is not yet complete. For example, it doesn't have any way to represent a lazy aggregation.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Making xarray math lazy 342180429
406128252 https://github.com/pydata/xarray/issues/2298#issuecomment-406128252 https://api.github.com/repos/pydata/xarray/issues/2298 MDEyOklzc3VlQ29tbWVudDQwNjEyODI1Mg== fujiisoup 6815844 2018-07-19T01:47:36Z 2018-07-19T01:47:36Z MEMBER

Thanks, @shoyer

> we have enough related functionality (e.g., for lazy and explicit indexing)

Agreed. Actually, it sounds like a lot of fun to code the lazy arithmetic.

> Ideally, this logic would live outside xarray.

Yes, I am concerned about this. We have discussed supporting more kinds of array-likes (e.g. dask, sparse, cupy) in #1938, and I thought the lazy array could (ideally) be one of them.

But in practice, it will likely take a long time to realize any-array-like support, so it might be a good idea to natively support lazy mathematics for now. If we are heading toward any-array-like support, I think the implementation of the lazy array should be as isolated from xarray core logic as possible so that we can move smoothly to any-array-like support in the future.

Therefore, personally, I'd like to see this lazy math implemented as a lazy array. The API I have in mind is .to_lazy(), which converts the backend to a lazy array, similar to how .chunk() converts the backend to a dask array.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Making xarray math lazy 342180429
406104928 https://github.com/pydata/xarray/issues/2298#issuecomment-406104928 https://api.github.com/repos/pydata/xarray/issues/2298 MDEyOklzc3VlQ29tbWVudDQwNjEwNDkyOA== shoyer 1217238 2018-07-18T23:26:57Z 2018-07-18T23:26:57Z MEMBER

The main practical difference is that it allows us to reliably guarantee that expressions like f(x, y)[i] always get evaluated like f(x[i], y[i]). Dask doesn't have this optimization yet (https://github.com/dask/dask/issues/746), so indexing operations still compute the function f() on each block of an array. This issue provides full context from the xarray side: https://github.com/pydata/xarray/issues/1725
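
A minimal pure-Python sketch of that guarantee, assuming plain lists stand in for arrays: the index is pushed down to the operands before the function runs, so f(x, y)[i] only ever calls f on the selected elements. `LazyElementwise` is a hypothetical name for illustration and is not xarray's or dask's implementation.

```python
class LazyElementwise:
    """Deferred elementwise operation over equal-length sequences."""

    def __init__(self, func, *operands):
        self.func = func
        self.operands = operands

    def __len__(self):
        return len(self.operands[0])

    def __getitem__(self, key):
        # Index the operands first, then apply func to what was selected,
        # so f(x, y)[i] is evaluated as f(x[i], y[i]).
        picked = [operand[key] for operand in self.operands]
        if isinstance(key, slice):
            return [self.func(*values) for values in zip(*picked)]
        return self.func(*picked)


calls = []


def add(a, b):
    # Record every invocation to show how little work is actually done.
    calls.append((a, b))
    return a + b


x = list(range(1_000))
y = list(range(1_000, 2_000))
lazy_sum = LazyElementwise(add, x, y)

value = lazy_sum[10]          # add() runs once, on x[10] and y[10] only
calls_after_scalar = len(calls)
first_three = lazy_sum[0:3]   # add() runs three more times
```

Constructing `lazy_sum` does no arithmetic at all; the cost is paid per element requested rather than per block or per array.
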

The typical example is spatially referenced imagery, e.g., a 2D satellite photo of the surface of the Earth with 2D latitude/longitude coordinates associated with each point. It would be very expensive to store full latitude and longitude arrays, but fortunately they can usually be computed cheaply from row and column indices.
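
To make the imagery example concrete, here is a small sketch that computes a coordinate value from row and column indices on demand instead of storing a full 2D array. The class name `LazyAffineCoordinate`, the affine form, and the origin and step values are all assumptions chosen for illustration; real georeferenced imagery may need a more general transform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyAffineCoordinate:
    """Hypothetical 2D coordinate: origin + row_step * row + col_step * col."""

    shape: tuple
    origin: float
    row_step: float
    col_step: float

    def __getitem__(self, key):
        row, col = key
        n_rows, n_cols = self.shape
        if not (0 <= row < n_rows and 0 <= col < n_cols):
            raise IndexError(f"index {key} is out of bounds for shape {self.shape}")
        # Computed on demand; no (n_rows, n_cols) array is ever allocated.
        return self.origin + self.row_step * row + self.col_step * col


# Made-up grid: latitude decreases down the rows, longitude increases across columns.
latitude = LazyAffineCoordinate(shape=(100, 200), origin=50.0, row_step=-0.25, col_step=0.0)
longitude = LazyAffineCoordinate(shape=(100, 200), origin=-120.0, row_step=0.0, col_step=0.25)

lat_value = latitude[4, 8]
lon_value = longitude[4, 8]
```

Each coordinate object stores four small fields regardless of the image size, which is the saving the paragraph above describes.
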

Ideally, this logic would live outside xarray. But it's important enough to some xarray users (especially geoscience + astronomy) and we have enough related functionality (e.g., for lazy and explicit indexing) that it probably makes sense to add it.

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Making xarray math lazy 342180429
405817290 https://github.com/pydata/xarray/issues/2298#issuecomment-405817290 https://api.github.com/repos/pydata/xarray/issues/2298 MDEyOklzc3VlQ29tbWVudDQwNTgxNzI5MA== fujiisoup 6815844 2018-07-18T05:53:18Z 2018-07-18T05:53:18Z MEMBER

This sounds interesting. I am curious what the practical difference from dask is. Does it mean some math is lazy by default (without any external library)?

{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
  Making xarray math lazy 342180429

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
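
The listing on this page can be approximated against the schema above with Python's built-in `sqlite3` module. This is a sketch under assumptions: the exact SQL the page runs is not shown here, so the `SELECT` below is a guess at an equivalent query, and only three of the seven rows are loaded, with a subset of their columns.

```python
import sqlite3

SCHEMA = """
CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);
"""

# (id, user, created_at, updated_at, author_association, issue), taken from the rows above.
SAMPLE_ROWS = [
    (405817290, 6815844, "2018-07-18T05:53:18Z", "2018-07-18T05:53:18Z", "MEMBER", 342180429),
    (1094068301, 5635139, "2022-04-09T15:31:31Z", "2022-04-09T15:31:31Z", "MEMBER", 342180429),
    (406672721, 1217238, "2018-07-20T17:35:26Z", "2018-07-20T17:35:26Z", "MEMBER", 342180429),
]

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO issue_comments "
    "(id, user, created_at, updated_at, author_association, issue) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    SAMPLE_ROWS,
)

# ISO 8601 timestamps stored as TEXT sort chronologically, so ORDER BY works directly.
comment_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM issue_comments WHERE issue = ? ORDER BY updated_at DESC",
        (342180429,),
    )
]
conn.close()
```

The filter on `issue` can use the `idx_issue_comments_issue` index defined in the schema.
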