    x86, pat: Update the page flags for memtype atomically instead of using memtype_lock · 1f9cc3cb
    Robin Holt authored
    While testing an application using the xpmem (out of kernel) driver, we
    noticed a significantly lower page fault rate on x86_64 than on
    ia64.  For one test running with 32 cpus, one thread per cpu, it
    took 01:08 for each of the threads to vm_insert_pfn 2 GB worth of pages.
    For the same test running on 256 cpus, one thread per cpu, it took 14:48
    to vm_insert_pfn 2 GB worth of pages.
    
    The slowdown was tracked to lookup_memtype which acquires the
    spinlock memtype_lock.  This heavily contended lock was slowing down
    vm_insert_pfn().
    
    With the memtype updated by cmpxchg on page->flags, both the 32 cpu and
    256 cpu cases take approximately 00:01.3 to complete.
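
    The lock-free update described above can be sketched in userspace C11
    as follows.  This is only an illustration, not the code from the patch:
    struct fake_page, MT_SHIFT, MT_MASK, get_memtype() and set_memtype() are
    hypothetical names, the bit positions are an assumption, and C11
    atomic_compare_exchange_weak() stands in for the kernel's cmpxchg().

```c
#include <stdatomic.h>

/* Assumed layout: two bits of the flags word hold the memory type. */
#define MT_SHIFT 4UL
#define MT_MASK  (3UL << MT_SHIFT)

/* Hypothetical stand-in for struct page; only the flags word matters here. */
struct fake_page {
    _Atomic unsigned long flags;
};

/* Readers need no lock: one atomic load, then mask out the memtype bits. */
static unsigned long get_memtype(struct fake_page *pg)
{
    return (atomic_load(&pg->flags) & MT_MASK) >> MT_SHIFT;
}

/*
 * Writers retry a compare-and-exchange until the flags word was not
 * changed by another thread between the read and the update.  Bits
 * outside MT_MASK are preserved.
 */
static void set_memtype(struct fake_page *pg, unsigned long memtype)
{
    unsigned long old_flags = atomic_load(&pg->flags);
    unsigned long new_flags;

    do {
        new_flags = (old_flags & ~MT_MASK) |
                    ((memtype << MT_SHIFT) & MT_MASK);
        /* On failure, old_flags is reloaded with the current value. */
    } while (!atomic_compare_exchange_weak(&pg->flags, &old_flags, new_flags));
}
```

    Because each page carries its own flags word, threads touching
    different pages never contend with one another, unlike with a single
    global spinlock.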
    
    Signed-off-by: Robin Holt <holt@sgi.com>
    LKML-Reference: <20100423153627.751194346@gulag1.americas.sgi.com>
    Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com>
    Cc: Rafael Wysocki <rjw@novell.com>
    Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
    Signed-off-by: H. Peter Anvin <hpa@zytor.com>