html_url,issue_url,id,node_id,user,created_at,updated_at,author_association,body,reactions,performed_via_github_app,issue
https://github.com/pydata/xarray/issues/1385#issuecomment-461554066,https://api.github.com/repos/pydata/xarray/issues/1385,461554066,MDEyOklzc3VlQ29tbWVudDQ2MTU1NDA2Ng==,35968931,2019-02-07T19:00:57Z,2019-02-07T19:00:57Z,MEMBER,"Looks like you're using xarray v0.11.0, but the most recent one is v0.11.3.
There have been several changes since then which might affect this, so try
upgrading first.
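
A minimal sketch of how one might confirm that an installed release is older than v0.11.3 before rerunning the timings. The helper names `version_tuple` and `is_older` are hypothetical, not part of xarray, and the comparison only looks at the leading numeric components of the version string.

```python
import re


def version_tuple(version):
    # Keep only the leading numeric release components,
    # e.g. '0.11.3' -> (0, 11, 3) and '1.0rc1' -> (1, 0).
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_older(installed, reference='0.11.3'):
    # True when the installed release sorts before the reference release.
    return version_tuple(installed) < version_tuple(reference)


# Version reported by xr.show_versions() in the quoted message below.
print(is_older('0.11.0'))
```

With xarray imported as `xr`, the installed version string is available as `xr.__version__` and can be passed to `is_older` directly.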
On Thu, 7 Feb 2019, 18:53 sbiner wrote:
> I have the same problem. open_mfdataset is 10X slower than nc.MFDataset.
> I used the following code to get some timing on opening 456 local netcdf
> files located in a nc_local directory (of total size of 532MB)
>
> import glob
> import time
>
> import netCDF4 as nc
> import xarray as xr
>
> clef = 'nc_local/*.nc'
> t00 = time.time()
> l_fichiers_nc = sorted(glob.glob(clef))
> print ('timing glob: {:6.2f}s'.format(time.time()-t00))
>
> # netcdf4
> t00 = time.time()
> ds1 = nc.MFDataset(l_fichiers_nc)
> #dates1 = ouralib.netcdf.calcule_dates(ds1)
> print ('timing netcdf4: {:6.2f}s'.format(time.time()-t00))
>
> # xarray
> t00 = time.time()
> ds2 = xr.open_mfdataset(l_fichiers_nc)
> print ('timing xarray: {:6.2f}s'.format(time.time()-t00))
>
> # xarray tune
> t00 = time.time()
> ds3 = xr.open_mfdataset(l_fichiers_nc, decode_cf=False, concat_dim='time')
> ds3 = xr.decode_cf(ds3)
> print ('timing xarray tune: {:6.2f}s'.format(time.time()-t00))
>
> The output I get is :
>
> timing glob: 0.00s
> timing netcdf4: 3.80s
> timing xarray: 44.60s
> timing xarray tune: 15.61s
>
> I ran the tests on a CentOS server using python2.7 and 3.6, and on macOS as
> well with python3.6. The timings change, but the ratios between
> netCDF4 and xarray stay similar.
>
> Is there any way of making open_mfdataset go faster?
>
> In case it helps, here is the output from xr.show_versions and %prun
> xr.open_mfdataset(l_fichiers_nc). I do not know anything about the output
> of %prun, but I have noticed that the first two lines of the output
> differ depending on whether I'm using python 2.7 or python 3.6. I ran those
> tests on CentOS and macOS with anaconda environments.
>
> for python 2.7:
>
> 13996351 function calls (13773659 primitive calls) in 42.133 seconds
>
> Ordered by: internal time
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 2664 16.290 0.006 16.290 0.006 {time.sleep}
> 912 6.330 0.007 6.623 0.007 netCDF4_.py:244(_open_netcdf4_group)
>
> for python 3.6:
>
> 9663408 function calls (9499759 primitive calls) in 31.934 seconds
>
> Ordered by: internal time
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 5472 15.140 0.003 15.140 0.003 {method 'acquire' of '_thread.lock' objects}
> 912 5.661 0.006 5.718 0.006 netCDF4_.py:244(_open_netcdf4_group)
>
> longer output of %prun with python3.6:
>
> 9663408 function calls (9499759 primitive calls) in 31.934 seconds
>
> Ordered by: internal time
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 5472 15.140 0.003 15.140 0.003 {method 'acquire' of '_thread.lock' objects}
> 912 5.661 0.006 5.718 0.006 netCDF4_.py:244(_open_netcdf4_group)
> 4104 0.564 0.000 0.757 0.000 {built-in method _operator.getitem}
> 133152/129960 0.477 0.000 0.660 0.000 indexing.py:496(shape)
> 1554550/1554153 0.414 0.000 0.711 0.000 {built-in method builtins.isinstance}
> 912 0.260 0.000 0.260 0.000 {method 'close' of 'netCDF4._netCDF4.Dataset' objects}
> 6384 0.244 0.000 0.953 0.000 netCDF4_.py:361(open_store_variable)
> 910 0.241 0.000 0.595 0.001 duck_array_ops.py:141(array_equiv)
> 20990 0.235 0.000 0.343 0.000 {pandas._libs.lib.is_scalar}
> 37483/36567 0.228 0.000 0.230 0.000 {built-in method builtins.iter}
> 93986 0.219 0.000 1.607 0.000 variable.py:239(__init__)
> 93982 0.194 0.000 0.194 0.000 variable.py:706(attrs)
> 33744 0.189 0.000 0.189 0.000 {method 'getncattr' of 'netCDF4._netCDF4.Variable' objects}
> 15511 0.175 0.000 0.638 0.000 core.py:1776(normalize_chunks)
> 5930 0.162 0.000 0.350 0.000 missing.py:183(_isna_ndarraylike)
> 297391/296926 0.159 0.000 0.380 0.000 {built-in method builtins.getattr}
> 134230 0.155 0.000 0.269 0.000 abc.py:180(__instancecheck__)
> 6384 0.142 0.000 0.199 0.000 netCDF4_.py:34(__init__)
> 93986 0.126 0.000 0.671 0.000 variable.py:414(_parse_dimensions)
> 156545 0.119 0.000 0.811 0.000 utils.py:450(ndim)
> 12768 0.119 0.000 0.203 0.000 core.py:747(blockdims_from_blockshape)
> 6384 0.117 0.000 2.526 0.000 conventions.py:245(decode_cf_variable)
> 741183/696380 0.116 0.000 0.134 0.000 {built-in method builtins.len}
> 41957/23717 0.110 0.000 4.395 0.000 {built-in method numpy.core.multiarray.array}
> 93978 0.110 0.000 0.110 0.000 variable.py:718(encoding)
> 219940 0.109 0.000 0.109 0.000 _weakrefset.py:70(__contains__)
> 99458 0.100 0.000 0.440 0.000 variable.py:137(as_compatible_data)
> 53882 0.085 0.000 0.095 0.000 core.py:891(shape)
> 140604 0.084 0.000 0.628 0.000 variable.py:272(shape)
> 3192 0.084 0.000 0.170 0.000 utils.py:88(_StartCountStride)
> 10494 0.081 0.000 0.081 0.000 {method 'reduce' of 'numpy.ufunc' objects}
> 44688 0.077 0.000 0.157 0.000 variables.py:102(unpack_for_decoding)
>
> output of xr.show_versions()
>
> xr.show_versions()
>
> INSTALLED VERSIONS
> ------------------
> commit: None
> python: 3.6.8.final.0
> python-bits: 64
> OS: Linux
> OS-release: 3.10.0-514.2.2.el7.x86_64
> machine: x86_64
> processor: x86_64
> byteorder: little
> LC_ALL: None
> LANG: en_CA.UTF-8
> LOCALE: en_CA.UTF-8
>
> xarray: 0.11.0
> pandas: 0.24.1
> numpy: 1.15.4
> scipy: None
> netCDF4: 1.4.2
> h5netcdf: None
> h5py: None
> Nio: None
> zarr: None
> cftime: 1.0.3.4
> PseudonetCDF: None
> rasterio: None
> iris: None
> bottleneck: None
> cyordereddict: None
> dask: 1.1.1
> distributed: 1.25.3
> matplotlib: 3.0.2
> cartopy: None
> seaborn: None
> setuptools: 40.7.3
> pip: 19.0.1
> conda: None
> pytest: None
> IPython: 7.2.0
> sphinx: None
>
","{""total_count"": 0, ""+1"": 0, ""-1"": 0, ""laugh"": 0, ""hooray"": 0, ""confused"": 0, ""heart"": 0, ""rocket"": 0, ""eyes"": 0}",,224553135