    MDEV-27512: Assertion !thd->transaction_rollback_request failed in rows_event_stmt_cleanup · 0ad52e4d
    Brandon Nesterenko authored
    When replicating an event in ROW format, if InnoDB detects a deadlock
    while searching for a row, the row event fails, is rolled back in
    InnoDB, and thd->transaction_rollback_request is set to indicate that
    the binlog cache also needs to be cleared. In the normal
    case, this will trigger an error in Rows_log_event::do_apply_event()
    and cause a rollback. During the Rows_log_event::do_apply_event()
    cleanup of a successful event application, there is a DBUG_ASSERT in
    log_event_server.cc::rows_event_stmt_cleanup(), which sets the
    expectation that thd->transaction_rollback_request cannot be set
    because the general rollback (i.e. not the InnoDB rollback) should
    have happened already. However, if the replica is configured to skip
    deadlock errors, the rows event logic will clear the error and
    continue on, as if no error happened. This results in
    thd->transaction_rollback_request being set while in
    rows_event_stmt_cleanup(), thereby triggering the assertion.
    
    This patch fixes this in the following ways:
     1) The assertion is invalid and is therefore removed.
     2) The rollback case is forced in rows_event_stmt_cleanup() if
    transaction_rollback_request is set.
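
The two changes above can be illustrated with a toy model. This is a minimal sketch, not the server code: ToyThd, CleanupAction and toy_stmt_cleanup() are hypothetical names invented for illustration, and only the decision described in this message is modeled.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for the server's THD.
// Only the flag discussed in this commit message is modeled.
struct ToyThd {
  bool transaction_rollback_request = false;
};

enum class CleanupAction { Commit, Rollback };

// Toy version of the statement cleanup step. `error` is the error code of
// the event application AFTER any skip logic has run, so it can be 0 even
// though a deadlock occurred and the rollback request flag is still set.
CleanupAction toy_stmt_cleanup(const ToyThd &thd, int error) {
  // Before the patch, the expectation here was (in effect):
  //   assert(!thd.transaction_rollback_request);
  // which fails when a deadlock error was cleared by the skip logic.
  // After the patch, a pending rollback request forces the rollback path.
  if (error != 0 || thd.transaction_rollback_request) {
    return CleanupAction::Rollback;
  }
  return CleanupAction::Commit;
}
```

With the flag set and the error cleared, the sketch returns CleanupAction::Rollback instead of tripping an assertion.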
    
    Note the differing behavior between transactions which are skipped
    due to deadlock errors and other errors. When a transaction is
    skipped due to an ignored deadlock error, the entire transaction is
    rolled back and skipped (though note MDEV-33930 which allows
    statements in the same transaction after the deadlock-inducing one
    to commit). When a transaction is skipped due to ignoring a
    different error, only the erroring statements are rolled back and
    skipped; the rest of the transaction executes as normal. The
    effect of this can be seen in the test results. The added test case
    in rpl_skip_error.test shows that, within a larger transaction, only
    the statements that hit an ignored non-deadlock error are skipped. A
    diff between rpl_temporary_error2_skip_all.result and
    rpl_temporary_error2.result shows that all statements in the errored
    transaction are rolled back (diff pasted below):
    
    : diff rpl_temporary_error2.result rpl_temporary_error2_skip_all.result
    49c49
    < 2	1
    ---
    > 2	NULL
    51c51
    < 4	1
    ---
    > 4	NULL
    53c53
    < * There will be two rows in t2 due to the retry.
    ---
    > * There will be one row in t2 because the ignored deadlock does not retry.
    57d56
    < 1
    59c58
    < 1
    ---
    > 0
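
The differing skip behavior described above can be sketched as a toy model. It is an assumption-laden illustration, not server code: Outcome and toy_committed_statements() are hypothetical names, and the MDEV-33930 caveat (statements after the deadlock-inducing one may still commit) is deliberately not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-statement outcome on the replica.
enum class Outcome { Ok, IgnoredDeadlock, IgnoredOtherError };

// Returns the indices of the statements whose effects survive.
// Assumption: an ignored deadlock discards the whole transaction, while
// any other ignored error discards only the statement that raised it.
std::vector<std::size_t> toy_committed_statements(
    const std::vector<Outcome> &stmts) {
  std::vector<std::size_t> committed;
  for (std::size_t i = 0; i < stmts.size(); ++i) {
    if (stmts[i] == Outcome::IgnoredDeadlock) {
      return {};  // entire transaction is rolled back and skipped
    }
    if (stmts[i] == Outcome::Ok) {
      committed.push_back(i);  // statement executes as normal
    }
    // IgnoredOtherError: only this statement is skipped
  }
  return committed;
}
```

For a three-statement transaction whose second statement fails, the model keeps statements 0 and 2 for an ignored non-deadlock error and keeps nothing for an ignored deadlock.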
    
    Reviewed By:
    ============
    Andrei Elkin <andrei.elkin@mariadb.com>