    mm/filemap: return early if failed to allocate memory for split · de60fd8d
    Kairui Song authored
    Patch series "mm/filemap: optimize folio adding and splitting", v4.
    
    Currently, at least 3 tree walks are needed to add a folio to the page
    cache if the folio was previously evicted: one to get the order of the
    current slot, one for the ranged conflict check, and one to retrieve
    the order again.  If a split is needed, even more walks are required.
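    
    For context, the pre-series flow in __filemap_add_folio looks roughly
    like this (my abridged sketch, not the verbatim kernel code; memcg
    charging and error paths are omitted), with the walks marked:
    
      do {
              /* walk 1: get the order of the existing slot */
              unsigned int order = xa_get_order(xas.xa, xas.xa_index);
              void *entry, *old = NULL;
    
              if (order > folio_order(folio))
                      /* the xa_load here costs yet another walk */
                      xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index),
                                      order, gfp);
              xas_lock_irq(&xas);
              /* walk 2: ranged conflict check */
              xas_for_each_conflict(&xas, entry) {
                      old = entry;
                      if (!xa_is_value(entry)) {
                              xas_set_err(&xas, -EEXIST);
                              goto unlock;
                      }
              }
              if (old) {
                      /* walk 3: the entry may have been split while
                       * the lock was not held, so get the order again */
                      order = xa_get_order(xas.xa, xas.xa_index);
                      if (order > folio_order(folio)) {
                              xas_split(&xas, old, order);
                              xas_reset(&xas);
                      }
              }
              xas_store(&xas, folio);
      unlock:
              xas_unlock_irq(&xas);
      } while (xas_nomem(&xas, gfp));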
    
    This series merges these walks and speeds up filemap_add_folio; I see
    a 7.5% - 12.5% performance gain in a fio stress test.
    
    So instead of doing multiple tree walks, do one optimistic range check
    with the lock held, and exit early if it raced with another insertion.
    If a shadow entry exists, check it with the new xas_get_order helper
    before releasing the lock, to avoid a redundant tree walk just to
    retrieve its order.
    
    Drop the lock and do the allocation only if a split is needed.
    
    In the best case, it only needs to walk the tree once.  If it needs to
    allocate and split, 3 walks are issued (one for the first ranged
    conflict check and order retrieval, one for the second check after the
    allocation, and one for the insert after the split).
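    
    A rough sketch of the merged walk described above (hedged: condensed
    and simplified, not the exact final code; the split_order variable
    name is illustrative, and charging/shadow-reporting logic is trimmed):
    
      int split_order = 0;    /* carries "allocation done" across retries */
    
      for (;;) {
              void *entry, *old = NULL;
              int order = -1;
    
              xas_lock_irq(&xas);
              /* one locked pass: conflict check plus order lookup */
              xas_for_each_conflict(&xas, entry) {
                      old = entry;
                      if (!xa_is_value(entry)) {
                              /* raced with another insertion: exit */
                              xas_set_err(&xas, -EEXIST);
                              goto unlock;
                      }
                      /* shadow entry: read its order while still locked,
                       * using the new xas_get_order helper */
                      if (order == -1)
                              order = xas_get_order(&xas);
              }
    
              if (old && order > folio_order(folio)) {
                      if (split_order < order) {
                              /* split needed: drop the lock, allocate,
                               * then retry the whole check */
                              split_order = order;
                              xas_unlock_irq(&xas);
                              xas_split_alloc(&xas, old, order, gfp);
                              if (xas_error(&xas))
                                      break;
                              continue;
                      }
                      xas_split(&xas, old, order);
                      xas_reset(&xas);
              }
    
              xas_store(&xas, folio);
      unlock:
              xas_unlock_irq(&xas);
              break;
      }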
    
    Testing with 4K pages, in an 8G cgroup, with a 16G brd as the block
    device:
    
      echo 3 > /proc/sys/vm/drop_caches
    
      fio -name=cached --numjobs=16 --filename=/mnt/test.img \
        --buffered=1 --ioengine=mmap --rw=randread --time_based \
        --ramp_time=30s --runtime=5m --group_reporting
    
    Before:
    bw (  MiB/s): min= 1027, max= 3520, per=100.00%, avg=2445.02, stdev=18.90, samples=8691
    iops        : min=263001, max=901288, avg=625924.36, stdev=4837.28, samples=8691
    
    After (+7.3%):
    bw (  MiB/s): min=  493, max= 3947, per=100.00%, avg=2625.56, stdev=25.74, samples=8651
    iops        : min=126454, max=1010681, avg=672142.61, stdev=6590.48, samples=8651
    
    Test result with THP (do a THP randread first, then switch to 4K
    pages, in the hope that it issues a lot of splitting):
    
      echo 3 > /proc/sys/vm/drop_caches
    
      fio -name=cached --numjobs=16 --filename=/mnt/test.img \
          --buffered=1 --ioengine=mmap -thp=1 --readonly \
          --rw=randread --time_based --ramp_time=30s --runtime=10m \
          --group_reporting
    
      fio -name=cached --numjobs=16 --filename=/mnt/test.img \
          --buffered=1 --ioengine=mmap \
          --rw=randread --time_based --runtime=5s --group_reporting
    
    Before:
    bw (  KiB/s): min= 4141, max=14202, per=100.00%, avg=7935.51, stdev=96.85, samples=18976
    iops        : min= 1029, max= 3548, avg=1979.52, stdev=24.23, samples=18976
    
    READ: bw=4545B/s (4545B/s), 4545B/s-4545B/s (4545B/s-4545B/s), io=64.0KiB (65.5kB), run=14419-14419msec
    
    After (+10.4%):
    bw (  KiB/s): min= 4611, max=15370, per=100.00%, avg=8928.74, stdev=105.17, samples=19146
    iops        : min= 1151, max= 3842, avg=2231.27, stdev=26.29, samples=19146
    
    READ: bw=4635B/s (4635B/s), 4635B/s-4635B/s (4635B/s-4635B/s), io=64.0KiB (65.5kB), run=14137-14137msec
    
    The performance is better for both 4K (+7.5%) and THP (+12.5%) cached
    reads.
    
    
    This patch (of 4):
    
    xas_split_alloc could fail with -ENOMEM; in that case, it should abort
    early instead of continuing and failing at the xas_split below.
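    
    Concretely, the fix amounts to checking the xas error state right
    after the allocation attempt and bailing out; roughly (abridged sketch
    of the call site in __filemap_add_folio):
    
      if (order > folio_order(folio)) {
              xas_split_alloc(&xas, xa_load(xas.xa, xas.xa_index),
                              order, gfp);
              /* abort early on -ENOMEM instead of letting the
               * later xas_split fail */
              if (xas_error(&xas))
                      goto error;
      }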
    
    Link: https://lkml.kernel.org/r/20240416071722.45997-1-ryncsn@gmail.com
    Link: https://lkml.kernel.org/r/20240415171857.19244-1-ryncsn@gmail.com
    Link: https://lkml.kernel.org/r/20240415171857.19244-2-ryncsn@gmail.com
    Signed-off-by: Kairui Song <kasong@tencent.com>
    Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>