Bug #48541 Deadlock between LOCK_open and LOCK_mdl
The reason for the deadlock was an improper exit from MDL_context::wait_for_locks() which caused mysys_var->current_mutex to remain LOCK_mdl even though LOCK_mdl was no longer held by that connection. This could for example lead to a deadlock in the following way: 1) INSERT DELAYED tries to open a table but fails, and trying to recover it calls wait_for_locks(). 2) Due to a pending exclusive request, wait_for_locks() fails and exits without resetting mysys_var->current_mutex for the delayed insert handler thread. So it continues to point to LOCK_mdl. 3) The handler thread manages to open a table. 4) A different connection takes LOCK_open and tries to take LOCK_mdl. 5) FLUSH TABLES from a third connection notices that the handler thread has a table open, and tries to kill it. This involves locking mysys_var->current_mutex while having LOCK_open locked. Since current_mutex mistakenly points to LOCK_mdl, we have a deadlock. This patch makes sure MDL_EXIT_COND() is called before exiting wait_for_locks(). This clears mysys->current_mutex which resolves the issue. An assert is added to recover_from_failed_open_table_attempt() after wait_for_locks() is called, to check that current_mutex is indeed reset. With this assert in place, existing tests in (e.g.) mdl_sync.test will fail without this patch.
Showing
Please register or sign in to comment