    x86, hweight: Use a 32-bit popcnt for __arch_hweight32() · c59bd568
    H. Peter Anvin authored
    Use a 32-bit popcnt instruction for __arch_hweight32(), even on
    x86-64.  Even though the input register will *usually* be
    zero-extended due to the standard operation of the hardware, it isn't
    necessarily so if the input value was the result of truncating a
    64-bit operation.
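
The truncation case can be illustrated with a minimal user-space sketch. It is not the kernel code: the helper names (`popcount64_sketch`, `popcount32_sketch`, `truncation_demo`) are hypothetical, and a `uint64_t` stands in for a 64-bit register whose upper half still holds bits left over from an earlier 64-bit operation.

```c
#include <assert.h>
#include <stdint.h>

/* Portable bit count over all 64 bits; stands in for a 64-bit popcnt. */
static unsigned int popcount64_sketch(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Bit count over the low 32 bits only; stands in for a 32-bit popcnt. */
static unsigned int popcount32_sketch(uint64_t reg)
{
	return popcount64_sketch(reg & 0xffffffffu);
}

/* Hypothetical demo: the register was never zero-extended after truncation. */
static void truncation_demo(void)
{
	uint64_t reg = 0xffffffff00000001ull;	/* upper half is stale */

	assert(popcount64_sketch(reg) == 33);	/* also counts the stale bits */
	assert(popcount32_sketch(reg) == 1);	/* weight of the 32-bit value */
}
```

With the 64-bit count, a 32-bit argument of 1 would be reported as having 33 set bits; restricting the count to the low 32 bits gives the intended result regardless of what the upper half contains.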
    
    Note: the POPCNT32 variant used on x86-64 has a technically
    unnecessary REX prefix to make it five bytes long, the same as a CALL
    instruction, therefore avoiding an unnecessary NOP.
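
The length argument can be checked with a small sketch. The byte sequences below are written out by hand from the standard x86 encoding of POPCNT and are an assumption for illustration, not a quotation from arch_hweight.h; the array names and the `nop_padding` helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* call rel32: opcode 0xe8 plus a 4-byte displacement (placeholder zeros). */
static const unsigned char call_rel32[] = { 0xe8, 0x00, 0x00, 0x00, 0x00 };

/* popcnt %edi, %eax without a REX prefix (assumed encoding). */
static const unsigned char popcnt32_plain[] = { 0xf3, 0x0f, 0xb8, 0xc7 };

/* Same instruction with an empty REX prefix (0x40) after the 0xf3 prefix. */
static const unsigned char popcnt32_rex[] = { 0xf3, 0x40, 0x0f, 0xb8, 0xc7 };

/* Number of NOP bytes needed to fill the slot of a CALL instruction. */
static size_t nop_padding(size_t insn_len)
{
	return sizeof(call_rel32) - insn_len;
}

static_assert(sizeof(popcnt32_plain) == 4, "plain form is four bytes");
static_assert(sizeof(popcnt32_rex) == sizeof(call_rel32),
	      "REX-prefixed form matches the CALL length");
```

Under these assumptions, `nop_padding(sizeof(popcnt32_plain))` is 1 and `nop_padding(sizeof(popcnt32_rex))` is 0, which is the padding byte the redundant prefix avoids.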
    Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    Cc: Borislav Petkov <borislav.petkov@amd.com>
    LKML-Reference: <alpine.LFD.2.00.1005171443060.4195@i5.linux-foundation.org>
arch_hweight.h 1.38 KB