issue_comments

12 rows where author_association = "MEMBER" and issue = 295270362, sorted by updated_at descending

Commenters: mrocklin (6), shoyer (4), jhamman (1), TomNicholas (1)
Issue: Avoid Adapters in task graphs? (295270362) · 12 comments
author_association: MEMBER (all 12)

TomNicholas (MEMBER) · 2022-05-02T19:41:01Z
https://github.com/pydata/xarray/issues/1895#issuecomment-1115288678

> Maybe we add an option to from_array to have it inline the array into the task, rather than create an explicit dependency.

dask.array.from_array does now have an inline_array option, which I've just exposed in open_dataset in #6566. I think that would be a reasonable way to close this issue?

Reactions: +1 × 1
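
(For context, a minimal sketch of what using this option looks like, assuming xarray new enough to carry the inline_array keyword added by #6566; the store path here is hypothetical.)

import xarray as xr

# chunks={} gives dask-backed variables with the on-disk chunking;
# inline_array=True asks dask.array.from_array to embed the backend array
# object in each task instead of storing it as a separate graph dependency.
ds = xr.open_dataset("store.zarr", engine="zarr", chunks={}, inline_array=True)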

mrocklin (MEMBER) · 2018-03-09T13:35:38Z
https://github.com/pydata/xarray/issues/1895#issuecomment-371813468

If things are operational then we're fine. It may be that a lot of this cost was due to other serialization overhead in gcsfs, zarr, or elsewhere.

On Fri, Mar 9, 2018 at 12:33 AM, Joe Hamman wrote:

> Where did we land here? Is there an action item that came from this discussion?
>
> In my view, the benefit of having consistent getitem behavior for all of our backends is worth working through potential hiccups in the way dask interacts with xarray.

Reactions: none

jhamman (MEMBER) · 2018-03-09T05:32:58Z
https://github.com/pydata/xarray/issues/1895#issuecomment-371718136

Where did we land here? Is there an action item that came from this discussion?

In my view, the benefit of having consistent getitem behavior for all of our backends is worth working through potential hiccups in the way dask interacts with xarray.

Reactions: none

shoyer (MEMBER) · 2018-02-07T23:33:01Z
https://github.com/pydata/xarray/issues/1895#issuecomment-363948383

> Ah, this may actually require a non-trivial amount of IO. It currently takes a non-trivial amount of time to read a zarr file. See pangeo-data/pangeo#99 (comment). We're doing this on each deserialization?

We're unpickling the zarr objects. I don't know if that requires IO (probably not).

Reactions: none
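
(One way to test the "does unpickling require IO?" question empirically; a sketch assuming zarr's v2-style MutableMapping store interface, with CountingStore invented here for illustration.)

import pickle

import numpy as np
import zarr

class CountingStore(dict):
    """Dict-backed store that counts key reads across all instances."""
    reads = 0

    def __getitem__(self, key):
        CountingStore.reads += 1
        return super().__getitem__(key)

store = CountingStore()
z = zarr.open(store, mode="w", shape=(100,), chunks=(10,), dtype="f8")
z[:] = np.arange(100.0)

before = CountingStore.reads
z2 = pickle.loads(pickle.dumps(z))  # round-trip the array object only
print("store reads during round trip:", CountingStore.reads - before)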

mrocklin (MEMBER) · 2018-02-07T22:42:40Z
https://github.com/pydata/xarray/issues/1895#issuecomment-363936464

> Well, presumably opening a zarr file requires a small amount of IO to read out the metadata.

Ah, this may actually require a non-trivial amount of IO. It currently takes a non-trivial amount of time to read a zarr file. See https://github.com/pangeo-data/pangeo/issues/99#issuecomment-363782191. We're doing this on each deserialization?

Reactions: none

shoyer (MEMBER) · 2018-02-07T22:40:22Z
https://github.com/pydata/xarray/issues/1895#issuecomment-363935874

> What makes it expensive?

Well, presumably opening a zarr file requires a small amount of IO to read out the metadata.

Reactions: none

mrocklin (MEMBER) · 2018-02-07T22:25:45Z
https://github.com/pydata/xarray/issues/1895#issuecomment-363932105

> No, not particularly, though potentially opening a zarr store could be a little expensive.

What makes it expensive?

> I'm mostly not sure how this would be done. Currently, we open files, create array objects, do some lazy decoding and then create dask arrays with from_array.

Maybe we add an option to from_array to have it inline the array into the task, rather than create an explicit dependency.

This does feel like I'm trying to duct tape over some underlying problem that I can't resolve though.

Reactions: none
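
(A small illustration of the two graph shapes being discussed, assuming a dask version that has grown the inline_array keyword; it did not exist when this comment was written.)

import dask.array as da
import numpy as np

x = np.arange(100.0)

d_dep = da.from_array(x, chunks=25)                     # array as its own graph key
d_inl = da.from_array(x, chunks=25, inline_array=True)  # array embedded in each task

# The default graph carries one extra key holding the source array itself,
# which every chunk task then depends on.
print(len(dict(d_dep.__dask_graph__())))  # 5: four chunk tasks + the array
print(len(dict(d_inl.__dask_graph__())))  # 4: chunk tasks only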

shoyer (MEMBER) · 2018-02-07T22:22:40Z
https://github.com/pydata/xarray/issues/1895#issuecomment-363931288

> Do these objects happen to store any cached results? I'm seeing odd performance issues around these objects and am curious about any ways in which they might be fancy.

I don't think there's any caching here. All of these objects are stateless, though ZarrArrayWrapper does point back to a ZarrStore object and a zarr.Group object.

> Any concerns about recreating these objects for every access?

No, not particularly, though potentially opening a zarr store could be a little expensive. I'm mostly not sure how this would be done. Currently, we open files, create array objects, do some lazy decoding and then create dask arrays with from_array.

Reactions: none
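
(A stripped-down sketch of the stateless-wrapper pattern described here; the class and attribute names are invented for illustration and are not xarray's actual ZarrArrayWrapper.)

import numpy as np

class LazyArrayWrapper:
    """Holds only a group reference and a variable name, so the object is
    tiny and cheap to pickle; data access is deferred to __getitem__."""

    def __init__(self, group, name):
        self.group = group  # e.g. a zarr.Group; no data is cached here
        self.name = name

    def __getitem__(self, key):
        # Every access goes straight back to the underlying store.
        return np.asarray(self.group[self.name][key])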

mrocklin (MEMBER) · 2018-02-07T21:59:56Z
https://github.com/pydata/xarray/issues/1895#issuecomment-363925208

Any concerns about recreating these objects for every access?

Reactions: none

mrocklin (MEMBER) · 2018-02-07T21:59:28Z
https://github.com/pydata/xarray/issues/1895#issuecomment-363925086

Do these objects happen to store any cached results? I'm seeing odd performance issues around these objects and am curious about any ways in which they might be fancy.

Reactions: none

shoyer (MEMBER) · 2018-02-07T21:44:33Z
https://github.com/pydata/xarray/issues/1895#issuecomment-363921064

> In principle this is fine, especially if this object is cheap to serialize, move, and deserialize.

Yes, that should be the case here. Each of these array objects is very lightweight and should be quickly pickled/unpickled.

On the other hand, once evaluated these do correspond to a large chunk of data (entire arrays). If this future needs to be evaluated before being passed around, that would be a problem. Getitem fusing is pretty essential here for performance.

Reactions: none
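
(The getitem-fusing point can be seen by optimizing a sliced graph: culling drops tasks for chunks the slice never touches, so the graph that gets shipped around stays small. A rough sketch:)

import dask
import dask.array as da
import numpy as np

x = np.arange(1_000_000.0)
d = da.from_array(x, chunks=100_000)  # ten chunks

sliced = d[:10]                 # only the first chunk is relevant
(opt,) = dask.optimize(sliced)  # cull unused tasks and fuse the getitem

print(len(dict(d.__dask_graph__())))    # 11 keys: ten chunk tasks + the array
print(len(dict(opt.__dask_graph__())))  # far fewer keys after optimization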

mrocklin (MEMBER) · 2018-02-07T19:52:18Z
https://github.com/pydata/xarray/issues/1895#issuecomment-363889835

https://github.com/pangeo-data/pangeo/issues/99#issuecomment-363852820

also cc @jhamman

Reactions: none

Table schema:

CREATE TABLE [issue_comments] (
   [html_url] TEXT,
   [issue_url] TEXT,
   [id] INTEGER PRIMARY KEY,
   [node_id] TEXT,
   [user] INTEGER REFERENCES [users]([id]),
   [created_at] TEXT,
   [updated_at] TEXT,
   [author_association] TEXT,
   [body] TEXT,
   [reactions] TEXT,
   [performed_via_github_app] TEXT,
   [issue] INTEGER REFERENCES [issues]([id])
);
CREATE INDEX [idx_issue_comments_issue]
    ON [issue_comments] ([issue]);
CREATE INDEX [idx_issue_comments_user]
    ON [issue_comments] ([user]);