id,node_id,number,title,user,state,locked,assignee,milestone,comments,created_at,updated_at,closed_at,author_association,active_lock_reason,draft,pull_request,body,reactions,performed_via_github_app,state_reason,repo,type
504497403,MDU6SXNzdWU1MDQ0OTc0MDM=,3386,add option to open_mfdataset for not using dask,42270910,closed,0,,,6,2019-10-09T08:33:53Z,2022-04-09T01:16:21Z,2022-04-09T01:16:21Z,NONE,,,,"open_mfdataset only works with dask, whereas with open_dataset one can choose to use dask or not. It would be nice to have an option (e.g. use_dask=False) to not use dask.

My specific use case is the following:
I use netCDF data as input for a TensorFlow/Keras application, with parallel preprocessing threads in Keras. Using dask arrays makes this complicated, because both dask and TensorFlow work with threads. I do not need any of the processing capabilities of dask/xarray; I only need a lazily loaded array that I can slice, where each slice is loaded at the moment it is accessed. My application works nicely with open_dataset (without defining chunks, and thus without dask): the data is accessed slice by slice, so it is never loaded into memory as a whole. It would be nice to have the same behaviour with open_mfdataset. Right now my workaround is to use netCDF4.MFDataset. (Obviously, another workaround would be to concatenate my files into one and use open_dataset.)
Opening each file separately with open_dataset, and then concatenating them with xr.concat does not work, as this loads the data into memory.
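
To illustrate the behaviour described above, here is a minimal sketch of a lazily concatenated array. `LazyConcat` is a hypothetical helper, not part of xarray or netCDF4; it assumes each source supports `len()` and slicing along its first axis (as a netCDF4 variable or a plain list does), and it only handles integer indices and contiguous slices.

```python
from bisect import bisect_right


class LazyConcat:
    '''Hypothetical helper: expose several sliceable sources as one sequence
    along the first axis, reading a source only when a request touches it.'''

    def __init__(self, sources):
        self.sources = list(sources)
        # offsets[i] is the global index at which sources[i] starts;
        # offsets[-1] is the total length.
        self.offsets = [0]
        for src in self.sources:
            self.offsets.append(self.offsets[-1] + len(src))

    def __len__(self):
        return self.offsets[-1]

    def __getitem__(self, key):
        if isinstance(key, int):
            if key < 0:
                key += len(self)
            if not 0 <= key < len(self):
                raise IndexError(key)
            # Find the source whose index range contains key.
            i = bisect_right(self.offsets, key) - 1
            return self.sources[i][key - self.offsets[i]]

        start, stop, step = key.indices(len(self))
        if step != 1:
            raise NotImplementedError('only contiguous slices are sketched here')

        parts = []
        for i, src in enumerate(self.sources):
            # Clip the requested range to this source, in local coordinates.
            lo = max(start, self.offsets[i]) - self.offsets[i]
            hi = min(stop, self.offsets[i + 1]) - self.offsets[i]
            if lo < hi:
                parts.extend(src[lo:hi])  # only now is this source read
        return parts


# Plain lists stand in for per-file variables here.
combined = LazyConcat([[0, 1, 2], [3, 4]])
print(combined[2:4])
```

A slice that spans a file boundary reads only the overlapping part of each source and returns the rows as a plain list; sources that the slice does not touch are never accessed.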

","{""url"": ""https://api.github.com/repos/pydata/xarray/issues/3386/reactions"", ""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,completed,13221727,issue