MDEV-33551: Semi-sync Wait Point AFTER_COMMIT Slow on Workloads with Heavy Concurrency
When using semi-sync replication with rpl_semi_sync_master_wait_point=AFTER_COMMIT, the performance of the primary can significantly reduce compared to AFTER_SYNC's performance for workloads with many concurrent users executing transactions. This is because all connections on the primary share the same cond_wait variable/mutex pair, so any time an ACK is received from a replica, all waiting connections are awoken to check if the ACK was for itself, which is done in mutual exclusion. This patch changes this such that the waiting THD will use its own local condition variable, and the ACK receiver thread only signals connections which have been ACKed for wakeup. That is, the THD::LOCK_wakeup_ready condition variable is re-used for this purpose, and the Active_tranx queue nodes are extended to hold the waiting thread, so it can be signalled once ACKed. Additionally: 1) Removed part of MDEV-11853 additions, which allowed suspended connection threads awaiting their semi-sync ACKs to live until their ACKs had been received. This part, however, wasn't needed. That is, all that was needed was for the Ack_thread to survive. So now the connection threads are killed during phase 1. Thereby THD::is_awaiting_semisync_ack, and all its related code was removed. 2) COND_binlog_send is repurposed to signal on the condition when Active_tranx is emptied during clear_active_tranx_nodes. 3) At master shutdown (when waiting for slaves), instead of the main loop individually waiting for each ACK, await_slave_reply() (renamed await_all_slave_replies()) just waits once for the repurposed COND_binlog_send to signal it is empty. 4) Test rpl_semi_sync_shutdown_await_ack is updates as following: 4.1) Added test case (adapted from Kristian Nielsen) to ensure that if a thread awaiting its ACK is killed while SHUTDOWN WAIT FOR ALL SLAVES is issued, the primary will still wait for the ACK from the killed thread. 4.2) As connections which by-passed phase 1 of thread killing no longer are delayed for kill until phase 2, we can no longer query yes/no tx after receiving an ACK/timeout. The check for these variables is removed. 4.3) Comment descriptions are updated which mention that the connection is alive; and adjusted to be the Ack_thread. Reviewed By: ============ Kristian Nielsen <knielsen@knielsen-hq.org>
Showing
Please register or sign in to comment