html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/2538#issuecomment-435738466,https://api.github.com/repos/pydata/xarray/issues/2538,435738466,MDEyOklzc3VlQ29tbWVudDQzNTczODQ2Ng==,2443309,2018-11-05T02:39:50Z,2018-11-05T02:39:50Z,MEMBER,@shoyer - I think I was tracking with you. I've gone ahead and deprecated the current `load_dataset` in favor of the `open_dataset` name. The switch is accompanied by a change in behavior as well.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,377075253
https://github.com/pydata/xarray/pull/2538#issuecomment-435732988,https://api.github.com/repos/pydata/xarray/issues/2538,435732988,MDEyOklzc3VlQ29tbWVudDQzNTczMjk4OA==,1217238,2018-11-05T01:59:34Z,2018-11-05T01:59:34Z,MEMBER,"> The default behavior should cache the arrays loaded with NumPy anyways.
Sorry, to be clear, what I meant here is that by default arrays loaded with NumPy get cached after the first access/operation, not that we need to preserve the existing behavior of `load_dataset()`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,377075253
https://github.com/pydata/xarray/pull/2538#issuecomment-435688958,https://api.github.com/repos/pydata/xarray/issues/2538,435688958,MDEyOklzc3VlQ29tbWVudDQzNTY4ODk1OA==,1217238,2018-11-04T17:29:11Z,2018-11-04T17:29:11Z,MEMBER,"OK, that seems reasonable. The default behavior should cache the arrays
loaded with NumPy anyways. I would not be opposed to renaming this to
open_dataset, either.
On Sun, Nov 4, 2018 at 9:19 AM Joe Hamman wrote:
> @shoyer - absolutely we'll get better
> performance with numpy arrays in this case. So I'm trying to use our
> tutorial datasets for some examples with dask (dask/dask-examples#51).
> The docstring for the
> load_dataset function states that we can pass kwargs on to the
> open_dataset function but if we pass chunks to the load_dataset call
> currently, we still get data back as numpy arrays. We have some other
> options here:
>
> 1. if chunks is a kwargs, return a dataset with data as persisted dask
> arrays
> 2. provide a second function to handle returning datasets using the
> same logic as open_dataset (caching, dask arrays, lazy loading, etc.)
> 3. tell people (like me) to rechunk the dataset after the fact
>
> (3) won't require any changes but makes it a little harder to connect the
> typical use pattern of open_dataset with tutorial.load_dataset.
","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,377075253
https://github.com/pydata/xarray/pull/2538#issuecomment-435688104,https://api.github.com/repos/pydata/xarray/issues/2538,435688104,MDEyOklzc3VlQ29tbWVudDQzNTY4ODEwNA==,2443309,2018-11-04T17:19:15Z,2018-11-04T17:19:15Z,MEMBER,"@shoyer - absolutely we'll get better performance with numpy arrays in this case. So I'm trying to use our tutorial datasets for some examples with dask (dask/dask-examples#51). The docstring for the `load_dataset` function states that we can pass kwargs on to the `open_dataset` function but if we pass `chunks` to the `load_dataset` call currently, we still get data back as numpy arrays. We have some other options here:
1. if `chunks` is passed as a kwarg, return a dataset with data as persisted dask arrays
2. provide a second function to handle returning datasets using the same logic as `open_dataset` (caching, dask arrays, lazy loading, etc.)
3. tell people (like me) to rechunk the dataset after the fact
(3) won't require any changes but makes it a little harder to connect the typical use pattern of `open_dataset` with `tutorial.load_dataset`. ","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,377075253
https://github.com/pydata/xarray/pull/2538#issuecomment-435621566,https://api.github.com/repos/pydata/xarray/issues/2538,435621566,MDEyOklzc3VlQ29tbWVudDQzNTYyMTU2Ng==,1217238,2018-11-03T21:17:02Z,2018-11-03T21:17:02Z,MEMBER,"Our current tutorial datasets are 8MB and 17MB, which is pretty small. You'll definitely get better performance loading datasets of this size into NumPy arrays.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,377075253