html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/pull/2538#issuecomment-435738466,https://api.github.com/repos/pydata/xarray/issues/2538,435738466,MDEyOklzc3VlQ29tbWVudDQzNTczODQ2Ng==,2443309,2018-11-05T02:39:50Z,2018-11-05T02:39:50Z,MEMBER,@shoyer - I think I was tracking with you. I've gone ahead and deprecated the current `load_dataset` in favor of the `open_dataset` name. The switch is accompanied by a change in behavior as well.,"{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,377075253
https://github.com/pydata/xarray/pull/2538#issuecomment-435732988,https://api.github.com/repos/pydata/xarray/issues/2538,435732988,MDEyOklzc3VlQ29tbWVudDQzNTczMjk4OA==,1217238,2018-11-05T01:59:34Z,2018-11-05T01:59:34Z,MEMBER,"> The default behavior should cache the arrays loaded with NumPy anyways.

Sorry, to be clear what I meant here is that by default arrays loaded with NumPy get cached after the first access/operation. Not that we need to preserve the existing behavior of `load_dataset()`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,377075253
https://github.com/pydata/xarray/pull/2538#issuecomment-435688958,https://api.github.com/repos/pydata/xarray/issues/2538,435688958,MDEyOklzc3VlQ29tbWVudDQzNTY4ODk1OA==,1217238,2018-11-04T17:29:11Z,2018-11-04T17:29:11Z,MEMBER,"OK, that seems reasonable. The default behavior should cache the arrays loaded with NumPy anyways. I would not be opposed to renaming this to open_dataset, either.

On Sun, Nov 4, 2018 at 9:19 AM Joe Hamman wrote:

> @shoyer - absolutely we'll get better
> performance with numpy arrays in this case. So I'm trying to use our
> tutorial datasets for some examples with dask (dask/dask-examples#51).
> The docstring for the
> load_dataset function states that we can pass kwargs on to the
> open_dataset function but if we pass chunks to the load_dataset call
> currently, we still get data back as numpy arrays. We have some other
> options here:
>
> 1. if chunks is passed as a kwarg, return a dataset with data as persisted dask
> arrays
> 2. provide a second function to handle returning datasets using the
> same logic as open_dataset (caching, dask arrays, lazy loading, etc.)
> 3. tell people (like me) to rechunk the dataset after the fact
>
> (3) won't require any changes but makes it a little harder to connect the
> typical use pattern of open_dataset with tutorial.load_dataset.","{""total_count"": 1, ""+1"": 1, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,377075253
https://github.com/pydata/xarray/pull/2538#issuecomment-435688104,https://api.github.com/repos/pydata/xarray/issues/2538,435688104,MDEyOklzc3VlQ29tbWVudDQzNTY4ODEwNA==,2443309,2018-11-04T17:19:15Z,2018-11-04T17:19:15Z,MEMBER,"@shoyer - absolutely we'll get better performance with numpy arrays in this case. So I'm trying to use our tutorial datasets for some examples with dask (dask/dask-examples#51). The docstring for the `load_dataset` function states that we can pass kwargs on to the `open_dataset` function but if we pass `chunks` to the `load_dataset` call currently, we still get data back as numpy arrays. We have some other options here:

1. if `chunks` is passed as a kwarg, return a dataset with data as persisted dask arrays
2. provide a second function to handle returning datasets using the same logic as `open_dataset` (caching, dask arrays, lazy loading, etc.)
3. tell people (like me) to rechunk the dataset after the fact

(3) won't require any changes but makes it a little harder to connect the typical use pattern of `open_dataset` with `tutorial.load_dataset`.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,377075253
https://github.com/pydata/xarray/pull/2538#issuecomment-435621566,https://api.github.com/repos/pydata/xarray/issues/2538,435621566,MDEyOklzc3VlQ29tbWVudDQzNTYyMTU2Ng==,1217238,2018-11-03T21:17:02Z,2018-11-03T21:17:02Z,MEMBER,"Our current tutorial datasets are 8MB and 17MB, which is pretty small. You'll definitely get better performance loading datasets of this size into NumPy arrays.","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,377075253