    block: only check previous entry for plug merge attempt · d38a9c04
    Jens Axboe authored
    Currently we scan the entire plug list, which is potentially very
    expensive. In an IOPS bound workload, we can drive about 5.6M IOPS with
    merging enabled, and profiling shows that the plug merge check is by far
    the most expensive thing we're doing:
    
      Overhead  Command   Shared Object     Symbol
      +   20.89%  io_uring  [kernel.vmlinux]  [k] blk_attempt_plug_merge
      +    4.98%  io_uring  [kernel.vmlinux]  [k] io_submit_sqes
      +    4.78%  io_uring  [kernel.vmlinux]  [k] blkdev_direct_IO
      +    4.61%  io_uring  [kernel.vmlinux]  [k] blk_mq_submit_bio
    
    Instead of scanning the whole list, just check the previously inserted
    entry. That is enough for a naive merge check and will catch most cases,
    and for devices that need full merging, the IO scheduler attached to
    such devices will do that anyway. The plug merge is meant to be an
    inexpensive check to avoid getting a request, but if we repeatedly
    scan the list for every single insert, it is very much not a cheap
    check.
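
To make the idea concrete, here is a minimal user-space sketch of a "check only the last entry" merge. It is not the kernel implementation: `struct req`, `struct plug`, `plug_add()` and `plug_merge_last()` are hypothetical names invented for illustration, and a request is reduced to a bare sector range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a block request: a sector range. */
struct req {
    unsigned long long sector;  /* first sector of the request */
    unsigned int nr_sectors;    /* length in sectors */
    struct req *prev;           /* request inserted before this one */
};

/* Hypothetical plug: tracks the most recently inserted request. */
struct plug {
    struct req *last;
    size_t count;
};

/* Back merge: the new I/O starts exactly where rq ends. */
static bool try_back_merge(struct req *rq, unsigned long long sector,
                           unsigned int nr)
{
    if (rq->sector + rq->nr_sectors != sector)
        return false;
    rq->nr_sectors += nr;
    return true;
}

/* Front merge: the new I/O ends exactly where rq starts. */
static bool try_front_merge(struct req *rq, unsigned long long sector,
                            unsigned int nr)
{
    if (sector + nr != rq->sector)
        return false;
    rq->sector = sector;
    rq->nr_sectors += nr;
    return true;
}

/* Insert a new request at the tail of the plug. */
static void plug_add(struct plug *plug, struct req *rq)
{
    assert(rq != NULL);
    rq->prev = plug->last;
    plug->last = rq;
    plug->count++;
}

/*
 * O(1) merge attempt: look only at the previously inserted request
 * instead of walking every entry in the plug. Returns true if the
 * I/O was absorbed, in which case no new request is needed.
 */
static bool plug_merge_last(struct plug *plug, unsigned long long sector,
                            unsigned int nr)
{
    struct req *rq = plug->last;

    if (rq == NULL)
        return false;
    return try_back_merge(rq, sector, nr) ||
           try_front_merge(rq, sector, nr);
}
```

In this sketch, a caller would try `plug_merge_last()` first and fall back to `plug_add()` with a fresh request when it returns false. The trade-off is visible here too: I/O that is contiguous with an older entry, but not with the last one, is no longer merged at this stage and is left to whatever merging happens later.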
    
    With this patch, the workload instead runs at ~7.0M IOPS, providing
    a 25% improvement. Disabling merging entirely yields another 5%
    improvement.

    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk-merge.c 31.6 KB