    rtmutex: Ensure that the top waiter is always woken up · db370a8b
    Wander Lairson Costa authored
    Let L1 and L2 be two spinlocks.
    
    Let T1 be a task holding L1 and blocked on L2. T1, currently, is the top
    waiter of L2.
    
    Let T2 be the task holding L2.
    
    Let T3 be a task trying to acquire L1.
    
    The following events will lead to a state in which the wait queue of L2
    isn't empty, but no task actually holds the lock.
    
    T1                T2                                  T3
    ==                ==                                  ==
    
                                                          spin_lock(L1)
                                                          | raw_spin_lock(L1->wait_lock)
                                                          | rtlock_slowlock_locked(L1)
                                                          | | task_blocks_on_rt_mutex(L1, T3)
                                                          | | | orig_waiter->lock = L1
                                                          | | | orig_waiter->task = T3
                                                          | | | raw_spin_unlock(L1->wait_lock)
                                                          | | | rt_mutex_adjust_prio_chain(T1, L1, L2, orig_waiter, T3)
                      spin_unlock(L2)                     | | | |
                      | rt_mutex_slowunlock(L2)           | | | |
                      | | raw_spin_lock(L2->wait_lock)    | | | |
                      | | wakeup(T1)                      | | | |
                      | | raw_spin_unlock(L2->wait_lock)  | | | |
                                                          | | | | waiter = T1->pi_blocked_on
                                                          | | | | waiter == rt_mutex_top_waiter(L2)
                                                          | | | | waiter->task == T1
                                                          | | | | raw_spin_lock(L2->wait_lock)
                                                          | | | | dequeue(L2, waiter)
                                                          | | | | update_prio(waiter, T1)
                                                          | | | | enqueue(L2, waiter)
                                                          | | | | waiter != rt_mutex_top_waiter(L2)
                                                          | | | | L2->owner == NULL
                                                          | | | | wakeup(T1)
                                                          | | | | raw_spin_unlock(L2->wait_lock)
    T1 wakes up
    T1 != top_waiter(L2)
    schedule_rtlock()
    
    If the deadline of T1 is updated before the call to update_prio(), and the
    new deadline is greater than the deadline of the next waiter in the queue,
    then T1 is no longer the top waiter after the requeue. T1 is woken up
    anyway, which is the wrong task: it goes back to sleep because it is not
    the top waiter, and the real top waiter is never woken.
    
    This can be reproduced in PREEMPT_RT with stress-ng:
    
    while true; do
        stress-ng --sched deadline --sched-period 1000000000 \
        	    --sched-runtime 800000000 --sched-deadline \
        	    1000000000 --mmapfork 23 -t 20
    done
    
    Thomas pointed out a similar issue for the cases where the top waiter
    drops out early due to a signal or timeout, which is a general issue for
    all regular rtmutex use cases, e.g. futex.
    
    The problematic code is in rt_mutex_adjust_prio_chain():
    
    	// Save the top waiter before dequeue/enqueue
    	prerequeue_top_waiter = rt_mutex_top_waiter(lock);
    
    	rt_mutex_dequeue(lock, waiter);
    	waiter_update_prio(waiter, task);
    	rt_mutex_enqueue(lock, waiter);
    
    	// Lock has no owner?
    	if (!rt_mutex_owner(lock)) {
    		// Top waiter changed
      ---->		if (prerequeue_top_waiter != rt_mutex_top_waiter(lock))
      ---->			wake_up_state(waiter->task, waiter->wake_state);
    
    This only takes the case into account where @waiter is the new top waiter
    due to the requeue operation.
    
    But it fails to handle the case where @waiter is no longer the top
    waiter due to the requeue operation.
    
    Ensure that the new top waiter is woken up in all cases so that it can
    take over the ownerless lock.
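
As a toy illustration of the difference (not the kernel code, and not the
actual change to rtmutex.c), the sketch below models a wait queue in which
the waiter with the earliest deadline is the top waiter. All names
(toy_waiter, toy_lock, toy_requeue_old, toy_requeue_fixed) and the extra
waiter T4 are hypothetical and exist only for this example. Each requeue
function returns the id of the task it would wake, or 0 for no wakeup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types; they only mimic the shape of the problem. */
struct toy_waiter {
	int task_id;                 /* id of the blocked task */
	unsigned long long deadline; /* earlier deadline = higher priority */
};

struct toy_lock {
	struct toy_waiter *waiters[8]; /* unordered; top is found by scanning */
	size_t nr_waiters;
	int owner;                     /* 0 means the lock has no owner */
};

/* Return the waiter with the earliest deadline, or NULL if none. */
static struct toy_waiter *toy_top_waiter(const struct toy_lock *lock)
{
	struct toy_waiter *top = NULL;

	for (size_t i = 0; i < lock->nr_waiters; i++)
		if (!top || lock->waiters[i]->deadline < top->deadline)
			top = lock->waiters[i];
	return top;
}

/* Old behavior: on a top waiter change, wake the requeued waiter itself. */
static int toy_requeue_old(struct toy_lock *lock, struct toy_waiter *waiter,
			   unsigned long long new_deadline)
{
	struct toy_waiter *prerequeue_top;

	assert(waiter != NULL && lock->nr_waiters > 0);
	prerequeue_top = toy_top_waiter(lock);
	waiter->deadline = new_deadline; /* stands in for dequeue/update/enqueue */

	if (lock->owner == 0 && prerequeue_top != toy_top_waiter(lock))
		return waiter->task_id; /* may no longer be the top waiter */
	return 0;
}

/* Intended behavior: on a top waiter change, wake the new top waiter. */
static int toy_requeue_fixed(struct toy_lock *lock, struct toy_waiter *waiter,
			     unsigned long long new_deadline)
{
	struct toy_waiter *prerequeue_top, *top;

	assert(waiter != NULL && lock->nr_waiters > 0);
	prerequeue_top = toy_top_waiter(lock);
	waiter->deadline = new_deadline; /* stands in for dequeue/update/enqueue */

	top = toy_top_waiter(lock);
	if (lock->owner == 0 && prerequeue_top != top)
		return top->task_id; /* whoever is the top waiter now */
	return 0;
}
```

With T1 (deadline 100) and a hypothetical T4 (deadline 200) queued on an
ownerless lock, moving T1's deadline to 300 makes toy_requeue_old() pick T1,
which is no longer the top waiter, while toy_requeue_fixed() picks T4.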
    
    [ tglx: Amend changelog, add Fixes tag ]
    
    Fixes: c014ef69 ("locking/rtmutex: Add wake_state to rt_mutex_waiter")
    Signed-off-by: Wander Lairson Costa <wander@redhat.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20230117172649.52465-1-wander@redhat.com
    Link: https://lore.kernel.org/r/20230202123020.14844-1-wander@redhat.com