    x86/asm: Optimize memcpy_flushcache() · 02101c45
    Mikulas Patocka authored
    I use memcpy_flushcache() in my persistent memory driver for metadata
    updates. There are many 8-byte and 16-byte updates, and it turns out that
    the overhead of memcpy_flushcache() causes a 2% performance degradation
    compared to the "movnti" instruction explicitly coded using inline assembler.
    
    The tests were done on a Skylake processor with persistent memory emulated
    using the "memmap" kernel parameter. dd was used to copy data to the
    dm-writecache target.
    
    This patch recognizes memcpy_flushcache calls with constant short length
    and turns them into inline assembler - so that I don't have to use inline
    assembler in the driver.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Mike Snitzer <snitzer@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Cc: device-mapper development <dm-devel@redhat.com>
    Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1808081720460.24747@file01.intranet.prod.int.rdu2.redhat.com
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
string_64.h 4.54 KB
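
The approach described in the commit message (recognizing calls with a constant short length and turning them into inline assembler) can be illustrated with a small user-space sketch. This is not the code from the patch: the names `sketch_memcpy_flushcache`, `sketch_store_nt64` and `sketch_flushcache_slow` are hypothetical, the choice of handling exactly 8 and 16 bytes is an assumption based on the sizes mentioned in the message, and the slow path is a plain `memcpy()` stand-in for the real out-of-line routine. It assumes GCC or Clang; on non-x86-64 targets it falls back to an ordinary store.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the general-purpose, out-of-line routine. */
static void sketch_flushcache_slow(void *dst, const void *src, size_t cnt)
{
	memcpy(dst, src, cnt);
}

/* Store one 8-byte value: non-temporal (movnti) on x86-64, plain elsewhere. */
static inline void sketch_store_nt64(void *dst, const void *src)
{
	uint64_t v;

	memcpy(&v, src, sizeof(v));	/* load without alignment assumptions */
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ volatile("movntiq %1, %0"
			 : "=m"(*(uint64_t *)dst)
			 : "r"(v)
			 : "memory");
#else
	memcpy(dst, &v, sizeof(v));
#endif
}

/*
 * Assumed dispatch: when the compiler can prove cnt is a constant 8 or 16,
 * the call collapses to one or two movnti stores; everything else goes to
 * the slow path. Without optimization, __builtin_constant_p() may evaluate
 * to 0 and the slow path is taken, which is still correct.
 */
static inline __attribute__((always_inline))
void sketch_memcpy_flushcache(void *dst, const void *src, size_t cnt)
{
	if (__builtin_constant_p(cnt)) {
		switch (cnt) {
		case 8:
			sketch_store_nt64(dst, src);
			return;
		case 16:
			sketch_store_nt64(dst, src);
			sketch_store_nt64((char *)dst + 8,
					  (const char *)src + 8);
			return;
		}
	}
	sketch_flushcache_slow(dst, src, cnt);
}
```

A caller would write `sketch_memcpy_flushcache(&entry, &new_entry, 16);` with a literal or `sizeof` length so the constant-length branch can be selected at compile time. Non-temporal stores are weakly ordered, so code that depends on the data having reached memory would still need a store fence afterwards; that part is omitted here.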