html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1385#issuecomment-461554066,https://api.github.com/repos/pydata/xarray/issues/1385,461554066,MDEyOklzc3VlQ29tbWVudDQ2MTU1NDA2Ng==,35968931,2019-02-07T19:00:57Z,2019-02-07T19:00:57Z,MEMBER,"Looks like you're using xarray v0.11.0, but the most recent one is v0.11.3. There have been several changes since then which might affect this, so try that first. On Thu, 7 Feb 2019, 18:53 sbiner, wrote: > I have the same problem. open_mfdataset is 10X slower than nc.MFDataset. > I used the following code to get some timing on opening 456 local netcdf > files located in a nc_local directory (of total size of 532MB) > > clef = 'nc_local/*.nc' > t00 = time.time() > l_fichiers_nc = sorted(glob.glob(clef)) > print ('timing glob: {:6.2f}s'.format(time.time()-t00)) > > # netcdf4 > t00 = time.time() > ds1 = nc.MFDataset(l_fichiers_nc) > #dates1 = ouralib.netcdf.calcule_dates(ds1) > print ('timing netcdf4: {:6.2f}s'.format(time.time()-t00)) > > # xarray > t00 = time.time() > ds2 = xr.open_mfdataset(l_fichiers_nc) > print ('timing xarray: {:6.2f}s'.format(time.time()-t00)) > > # xarray tune > t00 = time.time() > ds3 = xr.open_mfdataset(l_fichiers_nc, decode_cf=False, concat_dim='time') > ds3 = xr.decode_cf(ds3) > print ('timing xarray tune: {:6.2f}s'.format(time.time()-t00)) > > The output I get is : > > timing glob: 0.00s > timing netcdf4: 3.80s > timing xarray: 44.60s > timing xarray tune: 15.61s > > I made tests on a centOS server using python2.7 and 3.6, and on mac OS as > well with python3.6. The timing changes but the ratios are similar between > netCDF4 and xarray. > > Is there any way of making open_mfdataset go faster? > > In case it helps, here are output from xr.show_versions and %prun > xr.open_mfdataset(l_fichiers_nc). 
I do not know anything about the output > of %prun but I have noticed that the first two lines of the output are > different depending on whether I'm using python 2.7 or python 3.6. I made those tests on > centOS and macOS with anaconda environments. > > for python 2.7: > > 13996351 function calls (13773659 primitive calls) in 42.133 seconds > > Ordered by: internal time > > ncalls tottime percall cumtime percall filename:lineno(function) > 2664 16.290 0.006 16.290 0.006 {time.sleep} > 912 6.330 0.007 6.623 0.007 netCDF4_.py:244(_open_netcdf4_group) > > for python 3.6: > > 9663408 function calls (9499759 primitive calls) in 31.934 seconds > > Ordered by: internal time > > ncalls tottime percall cumtime percall filename:lineno(function) > 5472 15.140 0.003 15.140 0.003 {method 'acquire' of '_thread.lock' objects} > 912 5.661 0.006 5.718 0.006 netCDF4_.py:244(_open_netcdf4_group) > > longer output of %prun with python3.6: > > 9663408 function calls (9499759 primitive calls) in 31.934 seconds > > Ordered by: internal time > > ncalls tottime percall cumtime percall filename:lineno(function) > 5472 15.140 0.003 15.140 0.003 {method 'acquire' of '_thread.lock' objects} > 912 5.661 0.006 5.718 0.006 netCDF4_.py:244(_open_netcdf4_group) > 4104 0.564 0.000 0.757 0.000 {built-in method _operator.getitem} > 133152/129960 0.477 0.000 0.660 0.000 indexing.py:496(shape) > 1554550/1554153 0.414 0.000 0.711 0.000 {built-in method builtins.isinstance} > 912 0.260 0.000 0.260 0.000 {method 'close' of 'netCDF4._netCDF4.Dataset' objects} > 6384 0.244 0.000 0.953 0.000 netCDF4_.py:361(open_store_variable) > 910 0.241 0.000 0.595 0.001 duck_array_ops.py:141(array_equiv) > 20990 0.235 0.000 0.343 0.000 {pandas._libs.lib.is_scalar} > 37483/36567 0.228 0.000 0.230 0.000 {built-in method builtins.iter} > 93986 0.219 0.000 1.607 0.000 variable.py:239(__init__) > 93982 0.194 0.000 0.194 0.000 variable.py:706(attrs) > 33744 0.189 0.000 0.189 0.000 {method 'getncattr' of 'netCDF4._netCDF4.Variable' objects} > 
15511 0.175 0.000 0.638 0.000 core.py:1776(normalize_chunks) > 5930 0.162 0.000 0.350 0.000 missing.py:183(_isna_ndarraylike) > 297391/296926 0.159 0.000 0.380 0.000 {built-in method builtins.getattr} > 134230 0.155 0.000 0.269 0.000 abc.py:180(__instancecheck__) > 6384 0.142 0.000 0.199 0.000 netCDF4_.py:34(__init__) > 93986 0.126 0.000 0.671 0.000 variable.py:414(_parse_dimensions) > 156545 0.119 0.000 0.811 0.000 utils.py:450(ndim) > 12768 0.119 0.000 0.203 0.000 core.py:747(blockdims_from_blockshape) > 6384 0.117 0.000 2.526 0.000 conventions.py:245(decode_cf_variable) > 741183/696380 0.116 0.000 0.134 0.000 {built-in method builtins.len} > 41957/23717 0.110 0.000 4.395 0.000 {built-in method numpy.core.multiarray.array} > 93978 0.110 0.000 0.110 0.000 variable.py:718(encoding) > 219940 0.109 0.000 0.109 0.000 _weakrefset.py:70(__contains__) > 99458 0.100 0.000 0.440 0.000 variable.py:137(as_compatible_data) > 53882 0.085 0.000 0.095 0.000 core.py:891(shape) > 140604 0.084 0.000 0.628 0.000 variable.py:272(shape) > 3192 0.084 0.000 0.170 0.000 utils.py:88(_StartCountStride) > 10494 0.081 0.000 0.081 0.000 {method 'reduce' of 'numpy.ufunc' objects} > 44688 0.077 0.000 0.157 0.000 variables.py:102(unpack_for_decoding) > > output of xr.show_versions() > > xr.show_versions() > > INSTALLED VERSIONS > ------------------ > commit: None > python: 3.6.8.final.0 > python-bits: 64 > OS: Linux > OS-release: 3.10.0-514.2.2.el7.x86_64 > machine: x86_64 > processor: x86_64 > byteorder: little > LC_ALL: None > LANG: en_CA.UTF-8 > LOCALE: en_CA.UTF-8 > > xarray: 0.11.0 > pandas: 0.24.1 > numpy: 1.15.4 > scipy: None > netCDF4: 1.4.2 > h5netcdf: None > h5py: None > Nio: None > zarr: None > cftime: 1.0.3.4 > PseudonetCDF: None > rasterio: None > iris: None > bottleneck: None > cyordereddict: None > dask: 1.1.1 > distributed: 1.25.3 > matplotlib: 3.0.2 > cartopy: None > seaborn: None > setuptools: 40.7.3 > pip: 19.0.1 > conda: None > pytest: None > IPython: 7.2.0 > sphinx: None > > 
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,224553135