issue_comments: 416443277

html_url: https://github.com/pydata/xarray/pull/2261#issuecomment-416443277
issue_url: https://api.github.com/repos/pydata/xarray/issues/2261
id: 416443277
node_id: MDEyOklzc3VlQ29tbWVudDQxNjQ0MzI3Nw==
user: 1217238
created_at: 2018-08-28T03:57:54Z
updated_at: 2018-08-28T03:57:54Z
author_association: MEMBER
body:

I just ran the benchmark suite again and now see improvement across the board:

      before       after  ratio
  [0b9ab2d1]  [6350ca6f]
-      1.49s       1.35s  0.91  dataset_io.IOReadMultipleNetCDF4Dask.time_load_dataset_netcdf4_with_block_chunks_multiprocessing
-    79.96ms     72.36ms  0.90  dataset_io.IOReadSingleNetCDF3.time_load_dataset_netcdf4
-    29.61ms     26.17ms  0.88  dataset_io.IOReadSingleNetCDF3.time_orthogonal_indexing
-   238.97ms    210.33ms  0.88  dataset_io.IOReadMultipleNetCDF3Dask.time_load_dataset_netcdf4_with_time_chunks
-   154.84ms    133.97ms  0.87  dataset_io.IOReadSingleNetCDF4Dask.time_load_dataset_netcdf4_with_time_chunks
-      3.03s       2.56s  0.85  dataset_io.IOReadSingleNetCDF3Dask.time_load_dataset_scipy_with_block_chunks_oindexing
-   458.85ms    377.81ms  0.82  dataset_io.IOReadSingleNetCDF3Dask.time_load_dataset_scipy_with_block_chunks
-    21.95ms     17.83ms  0.81  dataset_io.IOReadSingleNetCDF3.time_vectorized_indexing
-    63.52ms     51.54ms  0.81  dataset_io.IOReadMultipleNetCDF3Dask.time_open_dataset_netcdf4_with_time_chunks
-    79.17ms     63.31ms  0.80  dataset_io.IOReadMultipleNetCDF4.time_open_dataset_netcdf4
-    75.62ms     59.49ms  0.79  dataset_io.IOReadMultipleNetCDF4Dask.time_open_dataset_netcdf4_with_block_chunks_multiprocessing
-   650.58ms    502.08ms  0.77  dataset_io.IOReadMultipleNetCDF4.time_load_dataset_netcdf4
-    75.90ms     58.50ms  0.77  dataset_io.IOReadMultipleNetCDF4Dask.time_open_dataset_netcdf4_with_time_chunks_multiprocessing
-   687.07ms    527.76ms  0.77  dataset_io.IOReadMultipleNetCDF4Dask.time_load_dataset_netcdf4_with_block_chunks
-    65.15ms     49.77ms  0.76  dataset_io.IOReadMultipleNetCDF3Dask.time_open_dataset_netcdf4_with_block_chunks
-    86.80ms     65.68ms  0.76  dataset_io.IOReadMultipleNetCDF4Dask.time_open_dataset_netcdf4_with_block_chunks
-    58.60ms     43.81ms  0.75  dataset_io.IOReadMultipleNetCDF3.time_open_dataset_netcdf4
-      1.43s       1.07s  0.75  dataset_io.IOReadMultipleNetCDF3Dask.time_load_dataset_netcdf4_with_block_chunks_multiprocessing
-    80.01ms     57.88ms  0.72  dataset_io.IOReadMultipleNetCDF4Dask.time_open_dataset_netcdf4_with_time_chunks
-      1.16s    834.07ms  0.72  dataset_io.IOReadMultipleNetCDF3Dask.time_load_dataset_netcdf4_with_time_chunks_multiprocessing
-   177.43ms    126.31ms  0.71  dataset_io.IOReadSingleNetCDF3Dask.time_load_dataset_scipy_with_block_chunks_vindexing
-   135.28ms     93.70ms  0.69  dataset_io.IOReadSingleNetCDF3.time_load_dataset_scipy
-    62.89ms     43.38ms  0.69  dataset_io.IOReadMultipleNetCDF3Dask.time_open_dataset_netcdf4_with_time_chunks_multiprocessing
-    77.04ms     52.70ms  0.68  dataset_io.IOReadMultipleNetCDF3Dask.time_open_dataset_netcdf4_with_block_chunks_multiprocessing
-   324.10ms    221.52ms  0.68  dataset_io.IOReadMultipleNetCDF4Dask.time_load_dataset_netcdf4_with_time_chunks
-      1.28s    812.88ms  0.63  dataset_io.IOReadMultipleNetCDF3Dask.time_load_dataset_scipy_with_time_chunks
-   797.18ms    503.38ms  0.63  dataset_io.IOReadMultipleNetCDF3Dask.time_load_dataset_netcdf4_with_block_chunks
-      1.66s       1.04s  0.63  dataset_io.IOReadMultipleNetCDF3Dask.time_load_dataset_scipy_with_block_chunks
-    98.57ms     56.60ms  0.57  dataset_io.IOReadMultipleNetCDF3Dask.time_open_dataset_scipy_with_block_chunks
-    98.12ms     54.05ms  0.55  dataset_io.IOReadMultipleNetCDF3Dask.time_open_dataset_scipy_with_time_chunks
-   810.75ms    436.98ms  0.54  dataset_io.IOReadMultipleNetCDF3.time_load_dataset_scipy
-   105.06ms     50.71ms  0.48  dataset_io.IOReadMultipleNetCDF3.time_open_dataset_scipy
-   608.23ms    231.53ms  0.38  dataset_io.IOReadMultipleNetCDF3.time_load_dataset_netcdf4

There's pretty clearly high variance in these benchmarks.
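
For anyone who wants to put a number on the spread, here is a minimal sketch of how rows like the ones above could be summarized. It is not part of the benchmark suite: `parse_rows` and `summarize` are hypothetical helper names, the regular expression assumes the `- before after ratio name` row layout shown in this comment, and only the `s` and `ms` units that appear here are handled. The three sample rows are copied from the results above.

```python
import re
import statistics

# Matches one row such as "- 1.49s 1.35s 0.91 dataset_io.Some.benchmark".
# Assumption: only the "s" and "ms" units seen in this comment are handled.
ROW = re.compile(
    r"-\s+(\d+(?:\.\d+)?)(ms|s)\s+(\d+(?:\.\d+)?)(ms|s)\s+(\d+\.\d+)\s+(\S+)"
)
UNIT_SECONDS = {"s": 1.0, "ms": 1e-3}


def parse_rows(text):
    """Return (name, before_seconds, after_seconds) for each row found in text."""
    rows = []
    for before, b_unit, after, a_unit, _ratio, name in ROW.findall(text):
        rows.append(
            (
                name,
                float(before) * UNIT_SECONDS[b_unit],
                float(after) * UNIT_SECONDS[a_unit],
            )
        )
    return rows


def summarize(rows):
    """Recompute after/before ratios and report their range and geometric mean."""
    ratios = [after / before for _, before, after in rows]
    return {
        "count": len(ratios),
        "geometric_mean": statistics.geometric_mean(ratios),
        "best": min(ratios),
        "worst": max(ratios),
    }


# Three rows copied from the results above.
sample = """
- 1.49s 1.35s 0.91 dataset_io.IOReadMultipleNetCDF4Dask.time_load_dataset_netcdf4_with_block_chunks_multiprocessing
- 1.16s 834.07ms 0.72 dataset_io.IOReadMultipleNetCDF3Dask.time_load_dataset_netcdf4_with_time_chunks_multiprocessing
- 608.23ms 231.53ms 0.38 dataset_io.IOReadMultipleNetCDF3.time_load_dataset_netcdf4
"""

rows = parse_rows(sample)
summary = summarize(rows)
print(summary)
```

The geometric mean is used because the quantities being averaged are ratios. A single run summarized this way says nothing about run-to-run noise, so the same summary would need to be compared across repeated runs to judge the variance.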

I considered adding another benchmark with dask-distributed, but the numbers look very similar to those for multi-processing or threads. It doesn't seem to provide a useful additional signal, and adding the distributed tests makes the whole IO benchmarking suite run about 30% slower.

reactions:
{
    "total_count": 0,
    "+1": 0,
    "-1": 0,
    "laugh": 0,
    "hooray": 0,
    "confused": 0,
    "heart": 0,
    "rocket": 0,
    "eyes": 0
}
performed_via_github_app: (empty)
issue: 337267315