    percpu: improve generic percpu modify-return implementation · 1b5ca121
    Nicholas Piggin authored
    Some architectures require an additional load to find the address of
    percpu pointers. In some implementations, the C aliasing rules do not
    allow the result of that load to be kept across the store that modifies
    the percpu variable, so the address has to be loaded again.
    
    Work around this by finding the pointer first, then operating on that.
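    
    A minimal userspace sketch of the idea, not the kernel macros
    themselves. All fake_* names and both helper functions are
    hypothetical; fake_cpu_offset stands in for the per-CPU offset that
    the listings below load from 48(13). Because the store to the counter
    may alias the offset as far as the compiler can tell, the "old" form
    may have to look the address up twice, while the "new" form resolves
    it once:

```c
#include <assert.h>
#include <stddef.h>

#define NR_FAKE_CPUS 2

/* Hypothetical per-CPU storage: one copy of the counter per CPU. */
static long fake_counter[NR_FAKE_CPUS];

/* Hypothetical stand-in for the per-CPU offset of the current CPU. */
static size_t fake_cpu_offset;

/* Resolve the current CPU's copy; this reads fake_cpu_offset on every call. */
static long *fake_this_cpu_ptr(long *base)
{
    assert(fake_cpu_offset < NR_FAKE_CPUS);
    return base + fake_cpu_offset;
}

/*
 * Old shape: modify, then read back through a second lookup.
 * The store in the first statement might alias fake_cpu_offset,
 * so the compiler may need to reload the offset for the second one.
 */
static long inc_return_old(long *var)
{
    *fake_this_cpu_ptr(var) += 1;
    return *fake_this_cpu_ptr(var);
}

/*
 * New shape: find the pointer first, then operate on it.
 * The offset is read once and the result stays in a local.
 */
static long inc_return_new(long *var)
{
    long *p = fake_this_cpu_ptr(var);

    *p += 1;
    return *p;
}
```

    Both helpers return the same value for the same state; the only
    difference is how many times the address lookup is performed.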
    
    It's also possible to mark things as restrict and play that kind of
    game, but that can require larger, arch-specific changes.
    
    On powerpc, __this_cpu_inc_return compiles to:
    
            ld 10,48(13)
            ldx 9,3,10
            addi 9,9,1
            stdx 9,3,10
            ld 9,48(13)
            ldx 3,9,3
    
    With this patch it compiles to:
    
            ld 10,48(13)
            ldx 9,3,10
            addi 9,9,1
            stdx 9,3,10
    
    Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
    To: Tejun Heo <tj@kernel.org>
    To: Christoph Lameter <cl@linux.com>
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Signed-off-by: Tejun Heo <tj@kernel.org>
percpu.h 12.2 KB