    MDEV-29621: Replica stopped by locks on sequence · 55a53949
    Andrei authored
    When using binlog_row_image=FULL with sequence table inserts, a
    replica can deadlock because it treats full-image inserts into a
    sequence as DDL statements and takes an exclusive lock on the sequence table. It
    has been observed that with parallel replication, this exclusive
    lock on the sequence table can lead to a deadlock where one
    transaction has the exclusive lock and is waiting on a prior
    transaction to commit, whereas this prior transaction is waiting on
    the MDL lock.
    
    The fix is on the master side: it raises the FL_DDL flag on the GTID
    of a full binlog_row_image write to a sequence table.
    This forces the slave to execute the statement serially, so the deadlock
    cannot happen.
    
    A test also reproduces the deadlock to prove that it happens on the OLD
    (pre-fix) slave.
    
    Replication from an OLD (buggy) master to a NEW (fixed) slave is also supported.
    Because a pre-fix master's full row image may represent either
    SELECT NEXT VALUE or INSERT, the parallel slave pessimistically
    waits for the prior transaction to commit before starting the
    critical part of the second event's execution (the INSERT in the test).
    The waiting exploits a parallel slave's retry mechanism which is
    controlled by `@@global.slave_transaction_retries`.
    
    Note that to avoid a persistent 'Deadlock found' error (1213) in
    OLD -> NEW replication, `slave_transaction_retries` may need to be set
    higher than its default value.
    START SLAVE is an effective workaround if the error still occurs.
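
    The adjustment described above could be applied on the NEW slave roughly
    as follows. This is only a sketch: the value 100 is an illustrative
    assumption, not a recommendation from this commit, and stopping the slave
    before changing the variable is a cautious choice rather than a stated
    requirement.

    ```sql
    -- Check the current retry limit on the slave.
    SHOW GLOBAL VARIABLES LIKE 'slave_transaction_retries';

    -- Raise the limit (100 is an assumed, illustrative value).
    STOP SLAVE;
    SET GLOBAL slave_transaction_retries = 100;
    START SLAVE;

    -- If the SQL thread still stops with a deadlock error,
    -- restarting replication is the workaround named above.
    START SLAVE;
    SHOW SLAVE STATUS;
    ```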
rpl_parallel.cc 91.2 KB