html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1895#issuecomment-1115288678,https://api.github.com/repos/pydata/xarray/issues/1895,1115288678,IC_kwDOAMm_X85CefRm,35968931,2022-05-02T19:41:01Z,2022-05-02T19:41:01Z,MEMBER,"> Maybe we add an option to from_array to have it inline the array into the task, rather than create an explicit dependency.
`dask.array.from_array` does now have an `inline_array` option, which I've just exposed in `open_dataset` in #6566. I think that would be a reasonable way to close this issue?","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-743831363,https://api.github.com/repos/pydata/xarray/issues/1895,743831363,MDEyOklzc3VlQ29tbWVudDc0MzgzMTM2Mw==,26384082,2020-12-12T20:29:09Z,2020-12-12T20:29:09Z,NONE,"In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.
If this issue remains relevant, please comment here or remove the `stale` label; otherwise it will be marked as closed automatically.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-371813468,https://api.github.com/repos/pydata/xarray/issues/1895,371813468,MDEyOklzc3VlQ29tbWVudDM3MTgxMzQ2OA==,306380,2018-03-09T13:35:38Z,2018-03-09T13:35:38Z,MEMBER,"If things are operational then we're fine. It may be that much of this
cost was due to other serialization overhead in gcsfs, zarr, or elsewhere.
On Fri, Mar 9, 2018 at 12:33 AM, Joe Hamman
wrote:
> Where did we land here? Is there an action item that came from this
> discussion?
>
> In my view, the benefit of having consistent getitem behavior for all of
> our backends is worth working through potential hiccups in the way dask
> interacts with xarray.
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-371718136,https://api.github.com/repos/pydata/xarray/issues/1895,371718136,MDEyOklzc3VlQ29tbWVudDM3MTcxODEzNg==,2443309,2018-03-09T05:32:58Z,2018-03-09T05:32:58Z,MEMBER,"Where did we land here? Is there an action item that came from this discussion?
In my view, the benefit of having consistent `getitem` behavior for all of our backends is worth working through potential hiccups in the way dask interacts with xarray. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-363948383,https://api.github.com/repos/pydata/xarray/issues/1895,363948383,MDEyOklzc3VlQ29tbWVudDM2Mzk0ODM4Mw==,1217238,2018-02-07T23:33:01Z,2018-02-07T23:33:01Z,MEMBER,"> Ah, this may actually require a non-trivial amount of IO. It currently takes a non-trivial amount of time to read a zarr file. See pangeo-data/pangeo#99 (comment) . We're doing this on each deserialization?
We're unpickling the zarr objects. I don't know if that requires IO (probably not).","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-363936464,https://api.github.com/repos/pydata/xarray/issues/1895,363936464,MDEyOklzc3VlQ29tbWVudDM2MzkzNjQ2NA==,306380,2018-02-07T22:42:40Z,2018-02-07T22:42:40Z,MEMBER,"> Well, presumably opening a zarr file requires a small amount of IO to read out the metadata.
Ah, this may actually require a non-trivial amount of IO. It currently takes a non-trivial amount of time to read a zarr file. See https://github.com/pangeo-data/pangeo/issues/99#issuecomment-363782191 . We're doing this on each deserialization?","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-363935874,https://api.github.com/repos/pydata/xarray/issues/1895,363935874,MDEyOklzc3VlQ29tbWVudDM2MzkzNTg3NA==,1217238,2018-02-07T22:40:22Z,2018-02-07T22:40:22Z,MEMBER,"> What makes it expensive?
Well, presumably opening a zarr file requires a small amount of IO to read out the metadata.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-363932105,https://api.github.com/repos/pydata/xarray/issues/1895,363932105,MDEyOklzc3VlQ29tbWVudDM2MzkzMjEwNQ==,306380,2018-02-07T22:25:45Z,2018-02-07T22:25:45Z,MEMBER,"> No, not particularly, though potentially opening a zarr store could be a little expensive
What makes it expensive?
> I'm mostly not sure how this would be done. Currently, we open files, create array objects, do some lazy decoding and then create dask arrays with from_array.
Maybe we add an option to from_array to have it inline the array into the task, rather than create an explicit dependency.
This does feel like I'm trying to duct-tape over some underlying problem that I can't resolve, though.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-363931288,https://api.github.com/repos/pydata/xarray/issues/1895,363931288,MDEyOklzc3VlQ29tbWVudDM2MzkzMTI4OA==,1217238,2018-02-07T22:22:40Z,2018-02-07T22:22:40Z,MEMBER,"> Do these objects happen to store any cached results? I'm seeing odd performance issues around these objects and am curious about any ways in which they might be fancy.
I don't think there's any caching here. All of these objects are stateless, though `ZarrArrayWrapper` does point back to a `ZarrStore` object and a `zarr.Group` object.
> Any concerns about recreating these objects for every access?
No, not particularly, though potentially opening a zarr store could be a little expensive. I'm mostly not sure how this would be done. Currently, we open files, create array objects, do some lazy decoding and then create dask arrays with `from_array`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-363925208,https://api.github.com/repos/pydata/xarray/issues/1895,363925208,MDEyOklzc3VlQ29tbWVudDM2MzkyNTIwOA==,306380,2018-02-07T21:59:56Z,2018-02-07T21:59:56Z,MEMBER,Any concerns about recreating these objects for every access? ,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-363925086,https://api.github.com/repos/pydata/xarray/issues/1895,363925086,MDEyOklzc3VlQ29tbWVudDM2MzkyNTA4Ng==,306380,2018-02-07T21:59:28Z,2018-02-07T21:59:28Z,MEMBER,Do these objects happen to store any cached results? I'm seeing odd performance issues around these objects and am curious about any ways in which they might be fancy.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-363921064,https://api.github.com/repos/pydata/xarray/issues/1895,363921064,MDEyOklzc3VlQ29tbWVudDM2MzkyMTA2NA==,1217238,2018-02-07T21:44:33Z,2018-02-07T21:44:33Z,MEMBER,"> In principle this is fine, especially if this object is cheap to serialize, move, and deserialize.
Yes, that should be the case here. Each of these array objects is very lightweight and should be quickly pickled/unpickled.
On the other hand, once evaluated, these do correspond to a large chunk of data (entire arrays). If this future needs to be evaluated before being passed around, that would be a problem. Getitem fusing is pretty essential here for performance.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362
https://github.com/pydata/xarray/issues/1895#issuecomment-363889835,https://api.github.com/repos/pydata/xarray/issues/1895,363889835,MDEyOklzc3VlQ29tbWVudDM2Mzg4OTgzNQ==,306380,2018-02-07T19:52:18Z,2018-02-07T19:52:18Z,MEMBER,"https://github.com/pangeo-data/pangeo/issues/99#issuecomment-363852820
also cc @jhamman ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,295270362