    MDEV-21910 : Killing thread on Galera could cause mutex deadlock · a8d75cd0
    Jan Lindström authored
    Whenever a Galera BF (brute force) transaction decides to abort a conflicting
    transaction, it kills that thread using thd::awake().
    
    A user KILL [QUERY|CONNECTION] ... statement for a thread also calls thd::awake().
    
    Whenever one of these actions is executed, we hold a number of InnoDB
    internal mutexes and thd mutexes. Sometimes these mutexes are taken in
    a different order, causing a mutex deadlock.
    
    Let's call the BF kill bf_thread and the user KILL query kill_thread.
    
    bf_thread takes mutexes in order:
    (1) lock_sys->mutex (lock0lock.cc lock_rec_other_has_conflicting)
    (2) victim_trx->mutex (lock0lock.cc lock_rec_other_has_conflicting)
    (3) victim_thread->LOCK_thd_data (handler.cc wsrep_innobase_kill_one_trx)
    
    kill_thread takes mutexes in order:
    (1) victim_thread->LOCK_thd_data (sql_parse.cc find_thread_by_id)
    (2) lock_sys->mutex (ha_innodb.cc innobase_kill_query)
    (3) victim_trx->mutex (ha_innodb.cc innobase_kill_query)
    
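    The mutex deadlock results from the two threads taking
    victim_thread->LOCK_thd_data at different points in the locking order:
    bf_thread takes it last, while kill_thread takes it first.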
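
    The inversion between the two orders listed above can be checked
    mechanically. The following is a minimal sketch, not code from the patch:
    has_order_inversion and the LockOrder alias are hypothetical names, and the
    mutexes are represented only by their names as strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper type: a lock acquisition order, first to last.
using LockOrder = std::vector<std::string>;

// Returns true if some pair of mutexes is acquired in opposite order by the
// two threads, which is the precondition for a lock-order deadlock.
bool has_order_inversion(const LockOrder& a, const LockOrder& b) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = i + 1; j < a.size(); ++j) {
            // a takes a[i] before a[j]; find where b takes the same two.
            auto first = std::find(b.begin(), b.end(), a[i]);
            auto second = std::find(b.begin(), b.end(), a[j]);
            if (first != b.end() && second != b.end() && second < first)
                return true;  // b takes a[j] before a[i]
        }
    }
    return false;
}

// The two orders described in this commit message.
const LockOrder bf_thread = {"lock_sys->mutex", "victim_trx->mutex",
                             "victim_thread->LOCK_thd_data"};
const LockOrder kill_thread = {"victim_thread->LOCK_thd_data",
                               "lock_sys->mutex", "victim_trx->mutex"};
```

    Calling has_order_inversion(bf_thread, kill_thread) reports an inversion,
    because LOCK_thd_data comes after lock_sys->mutex in one order and before
    it in the other.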
    
    This patch fixes the Galera BF victim thread kill so that
    it does not try to lock the victim_thread->LOCK_thd_data mutex
    while holding InnoDB mutexes. Instead, the victim is inserted into a list
    for later kill processing.
    
    A new background thread picks a victim thread from this new list and calls
    thd::awake() with no InnoDB mutexes held. The idea is similar to the
    replication background kill. This fix enforces that we always take mutexes
    in the same order:
    (1) victim_thread->LOCK_thd_data
    (2) lock_sys->mutex
    (3) victim_trx->mutex
    
    wsrep_mysqld.cc
    	Here we introduce a list where victim threads are stored,
    	a condition variable used to wake up the background thread,
    	and a mutex to protect the list.
    
    wsrep_thd.cc
    	Create a new background thread to handle victim thread
    	abort. Here we may take the victim_thread->LOCK_thd_data
    	mutex, but not any InnoDB mutexes.
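
    The list, condition variable, mutex, and background thread described above
    follow a common producer/consumer pattern. The sketch below illustrates that
    pattern with the C++ standard library only; it is not the code from
    wsrep_mysqld.cc or wsrep_thd.cc. KillRequest, BackgroundKiller, and all
    member names are hypothetical, and the place where the server would lock
    LOCK_thd_data and call thd::awake() is marked with a comment.

```cpp
#include <cassert>
#include <condition_variable>
#include <list>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the information enqueued about a victim.
struct KillRequest {
    unsigned long victim_thread_id;
};

class BackgroundKiller {
public:
    BackgroundKiller() : worker_([this] { run(); }) {}
    ~BackgroundKiller() { stop(); }

    // Called by the BF thread while it holds its own (InnoDB) mutexes.
    // Only the short-lived list mutex is taken here.
    void enqueue(KillRequest req) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            victims_.push_back(req);
        }
        cond_.notify_one();
    }

    // Drains the remaining victims, then stops and joins the worker.
    void stop() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            if (stopping_) return;
            stopping_ = true;
        }
        cond_.notify_one();
        if (worker_.joinable()) worker_.join();
    }

    // Ids processed so far, in processing order (for illustration only).
    std::vector<unsigned long> killed() {
        std::lock_guard<std::mutex> guard(mutex_);
        return killed_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cond_.wait(lock, [this] { return stopping_ || !victims_.empty(); });
            if (victims_.empty()) return;  // stopping and nothing left to do
            KillRequest req = victims_.front();
            victims_.pop_front();
            lock.unlock();
            // Assumption: the real server would lock the victim's
            // LOCK_thd_data and call thd::awake() here, holding no
            // InnoDB mutexes and not holding the list mutex.
            lock.lock();
            killed_.push_back(req.victim_thread_id);
        }
    }

    std::mutex mutex_;                   // protects the members below
    std::condition_variable cond_;       // wakes the background thread
    std::list<KillRequest> victims_;     // pending victims
    std::vector<unsigned long> killed_;  // record of processed victims
    bool stopping_ = false;
    std::thread worker_;                 // declared last: started after the rest
};
```

    The worker releases the list mutex before acting on a victim, so an
    enqueuing thread never waits on the kill itself.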
    
    wsrep_innobase_kill_one_trx
    	Remove all the wsrep code that was moved to wsrep_thd.cc.
    	We just enqueue the required information to the background kill
    	list and cancel the victim trx lock wait, if there is one.
    	Here we hold the InnoDB lock_sys->mutex and victim_trx->mutex,
    	so we can't take the victim_thread->LOCK_thd_data mutex.
    
    wsrep_abort_transaction
    	Cleanup only.