    locking/atomic/x86: Introduce arch_try_cmpxchg64 · c2df0a6a
    Uros Bizjak authored
    Introduce arch_try_cmpxchg64 for 64-bit and 32-bit targets to improve
    code using cmpxchg64.  On 64-bit targets, the generated assembly improves
    from:
    
      ab:	89 c8                	mov    %ecx,%eax
      ad:	48 89 4c 24 60       	mov    %rcx,0x60(%rsp)
      b2:	83 e0 fd             	and    $0xfffffffd,%eax
      b5:	89 54 24 64          	mov    %edx,0x64(%rsp)
      b9:	88 44 24 60          	mov    %al,0x60(%rsp)
      bd:	48 89 c8             	mov    %rcx,%rax
      c0:	c6 44 24 62 f2       	movb   $0xf2,0x62(%rsp)
      c5:	48 8b 74 24 60       	mov    0x60(%rsp),%rsi
      ca:	f0 49 0f b1 34 24    	lock cmpxchg %rsi,(%r12)
      d0:	48 39 c1             	cmp    %rax,%rcx
      d3:	75 cf                	jne    a4 <t+0xa4>
    
    to:
    
      b3:	89 c2                	mov    %eax,%edx
      b5:	48 89 44 24 60       	mov    %rax,0x60(%rsp)
      ba:	83 e2 fd             	and    $0xfffffffd,%edx
      bd:	89 4c 24 64          	mov    %ecx,0x64(%rsp)
      c1:	88 54 24 60          	mov    %dl,0x60(%rsp)
      c5:	c6 44 24 62 f2       	movb   $0xf2,0x62(%rsp)
      ca:	48 8b 54 24 60       	mov    0x60(%rsp),%rdx
      cf:	f0 48 0f b1 13       	lock cmpxchg %rdx,(%rbx)
      d4:	75 d5                	jne    ab <t+0xab>
    
    where a move and a compare after cmpxchg are saved.  The improvements
    for 32-bit targets are even more noticeable, because the dual-word
    compare after cmpxchg8b is eliminated.
    
    Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20220515184205.103089-3-ubizjak@gmail.com
cmpxchg_32.h 3.67 KB