• Peter Zijlstra's avatar
    locking/rtmutex: Fix task->pi_waiters integrity · f7853c34
    Peter Zijlstra authored
    Henry reported that rt_mutex_adjust_prio_check() has an ordering
    problem and puts the lie to the comment in [7]. Sharing the sort key
    between lock->waiters and owner->pi_waiters *does* create problems,
    since unlike what the comment claims, holding [L] is insufficient.
    
    Notably, consider:
    
    	A
          /   \
         M1   M2
         |     |
         B     C
    
    That is, task A owns both M1 and M2, B and C block on them. In this
    case a concurrent chain walk (B & C) will modify their resp. sort keys
    in [7] while holding M1->wait_lock and M2->wait_lock. So holding [L]
    is meaningless, they're different Ls.
    
    This then gives rise to a race condition between [7] and [11], where
    the requeue of pi_waiters will observe an inconsistent tree order.
    
    	B				C
    
      (holds M1->wait_lock,		(holds M2->wait_lock,
       holds B->pi_lock)		 holds A->pi_lock)
    
      [7]
      waiter_update_prio();
      ...
      [8]
      raw_spin_unlock(B->pi_lock);
      ...
      [10]
      raw_spin_lock(A->pi_lock);
    
    				[11]
    				rt_mutex_enqueue_pi();
    				// observes inconsistent A->pi_waiters
    				// tree order
    
    Fixing this means either extending the range of the owner lock from
    [10-13] to [6-13], with the immediate problem that this means [6-8]
    hold both blocked and owner locks, or duplicating the sort key.
    
    Since the locking in chain walk is horrible enough without having to
    consider pi_lock nesting rules, duplicate the sort key instead.
    
    By giving each tree their own sort key, the above race becomes
    harmless, if C sees B at the old location, then B will correct things
    (if they need correcting) when it walks up the chain and reaches A.
    
    Fixes: fb00aca4 ("rtmutex: Turn the plist into an rb-tree")
    Reported-by: default avatarHenry Wu <triangletrap12@gmail.com>
    Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
    Acked-by: default avatarThomas Gleixner <tglx@linutronix.de>
    Tested-by: default avatarHenry Wu <triangletrap12@gmail.com>
    Link: https://lkml.kernel.org/r/20230707161052.GF2883469%40hirez.programming.kicks-ass.net
    f7853c34
ww_mutex.h 13.7 KB