    fs: only do a memory barrier for the first set_buffer_uptodate() · 2f79cdfe
    Linus Torvalds authored
    Commit d4252071 ("add barriers to buffer_uptodate and
    set_buffer_uptodate") added proper memory barriers to the buffer head
    BH_Uptodate bit, so that anybody who tests a buffer for being up-to-date
    will be guaranteed to actually see initialized state.
    
    However, that commit didn't _just_ add the memory barrier, it also ended
    up dropping the "was it already set" logic that the BUFFER_FNS() macro
    had.
    
    That's conceptually the right thing for a generic "this is a memory
    barrier" operation, but in the case of the buffer contents, we really
    only care about the memory barrier for the _first_ time we set the bit,
    in that the only memory ordering protection we need is to avoid anybody
    seeing uninitialized memory contents.
    
    Any other access ordering wouldn't be about the BH_Uptodate bit anyway,
    and would require some other proper lock (typically BH_Lock or the folio
    lock).  A reader that races with somebody invalidating the buffer head
    isn't an issue wrt the memory ordering, it's a serialization issue.
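
    As a rough userspace illustration of the "only order the first set"
    idea described above, here is a minimal C11 sketch.  It is not the
    kernel code: struct demo_buffer, DEMO_UPTODATE and the demo_*()
    helpers are hypothetical names made up for this example, and C11
    release/acquire atomics stand in for the kernel's own barrier
    primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_UPTODATE 0x1u	/* hypothetical flag bit, for this sketch only */

/* Hypothetical stand-in for a buffer head; not the kernel structure. */
struct demo_buffer {
	atomic_uint state;
	char data[16];
};

static void demo_init(struct demo_buffer *b)
{
	assert(b != NULL);
	atomic_init(&b->state, 0);
	memset(b->data, 0, sizeof(b->data));
}

/* Writer side: mark the contents valid after filling in b->data. */
static void demo_set_uptodate(struct demo_buffer *b)
{
	/*
	 * Already set: whoever set the bit first has already ordered the
	 * data stores, so skip the (comparatively expensive) atomic RMW.
	 */
	if (atomic_load_explicit(&b->state, memory_order_relaxed) & DEMO_UPTODATE)
		return;

	/*
	 * First setter: release ordering makes the earlier data stores
	 * visible to any reader that sees the bit via an acquire load.
	 */
	atomic_fetch_or_explicit(&b->state, DEMO_UPTODATE, memory_order_release);
}

/* Reader side: an acquire load pairs with the release in the setter. */
static bool demo_uptodate(struct demo_buffer *b)
{
	return (atomic_load_explicit(&b->state, memory_order_acquire) &
		DEMO_UPTODATE) != 0;
}
```

    In this sketch a second demo_set_uptodate() call on the same buffer
    only does a plain load and returns, which is the cheap path that
    matters when many CPUs keep marking already-valid buffers.  Clearing
    the bit is deliberately left out: as noted above, that needs separate
    serialization and is not a memory ordering question.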
    
    Now, you'd think that the buffer head operations don't matter in this
    day and age (and I certainly thought so), but apparently some loads
    still end up being heavy users of buffer heads.  In particular, the
    kernel test robot reported that not having this bit access optimization
    in place caused a noticeable direct IO performance regression on ext4:
    
      fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec -26.5% regression
    
    although you presumably need a fast disk and a lot of cores to actually
    notice.
    
    Link: https://lore.kernel.org/all/Yw8L7HTZ%2FdE2%2Fo9C@xsang-OptiPlex-9020/
    
    
    Reported-by: kernel test robot <oliver.sang@intel.com>
    Tested-by: Fengwei Yin <fengwei.yin@intel.com>
    Cc: Mikulas Patocka <mpatocka@redhat.com>
    Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>