• Waiman Long's avatar
    locking/rwsem: Optimize rwsem structure for uncontended lock acquisition · 364f784f
    Waiman Long authored
    For an uncontended rwsem, count and owner are the only fields a task
    needs to touch when acquiring the rwsem. So they are put next to each
    other to increase the chance that they will share the same cacheline.
    
    On a ThunderX2 99xx (arm64) system with 32K L1 cache and 256K L2
    cache, a rwsem locking microbenchmark with one locking thread was
    run to write-lock and write-unlock an array of rwsems separated 2
    cachelines apart in a 1M byte memory block. The locking rates (kops/s)
    of the microbenchmark when the rwsems are at various "long" (8-byte)
    offsets from beginning of the cacheline before and after the patch were
    as follows:
    
      Cacheline Offset   Pre-patch    Post-patch
      ----------------   ---------    ----------
            0             17,449        16,588
            1             17,450        16,465
    	2             17,450        16,460
    	3             17,453        16,462
    	4             14,867        16,471
    	5             14,867        16,470
    	6             14,853        16,464
    	7             14,867        13,172
    
    Before the patch, the count and owner are 4 "long"s apart. After the
    patch, they are only 1 "long" apart.
    
    The rwsem data have to be loaded from the L3 cache for each access. It
    can be seen that the locking rates are more consistent after the patch
    than before. Note that for this particular system, the performance
    drop happens whenever the count and owner are at an odd multiples of
    "long"s apart. No performance drop was observed when only a single rwsem
    was used (hot cache). So the drop is likely just an idiosyncrasy of the
    cache architecture of this chip than an inherent problem with the patch.
    Suggested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: default avatarWaiman Long <longman@redhat.com>
    Acked-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Arnd Bergmann <arnd@arndb.de>
    Cc: Borislav Petkov <bp@alien8.de>
    Cc: Davidlohr Bueso <dave@stgolabs.net>
    Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: Tim Chen <tim.c.chen@linux.intel.com>
    Cc: Will Deacon <will.deacon@arm.com>
    Link: http://lkml.kernel.org/r/20190404174320.22416-12-longman@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
    364f784f
rwsem.h 5.84 KB