MDEV-34696: do_gco_wait() completes too early on InnoDB dict stats updates

Before calling mark_start_commit(), check that there is no pending deadlock kill. If there is a pending kill, we won't commit (we will abort, roll back, and retry). In that case we should not mark the commit as started, since that could make the following GCO start too early, before we have completed the commit after the retry.

This condition could trigger in some corner cases where InnoDB temporarily takes table/row locks that are released again immediately rather than held until the transaction commits. This happens with dict_stats updates and possibly auto-increment locks. Such locks can be passed to thd_rpl_deadlock_check() and cause a deadlock kill to be scheduled in the background. But since the blocking locks are held only temporarily, they can be released before the background kill happens. This way, the kill can be delayed until after mark_start_commit() has been called.

Thus we need to check the synchronous indication rgi->killed_for_retry, not just the asynchronous thd->killed.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>

Commit b4c2e239 authored Aug 03, 2024 by Kristian Nielsen

Showing with 33 additions and 4 deletions

sql/rpl_parallel.cc +16 -4

sql/rpl_rli.cc +17 -0

--- a/sql/rpl_parallel.cc
+++ b/sql/rpl_parallel.cc
@@ -1450,11 +1450,23 @@ handle_rpl_parallel_thread(void *arg)
          after mark_start_commit(), we have to unmark, which has at least a
          theoretical possibility of leaving a window where it looks like all
          transactions in a GCO have started committing, while in fact one
-          will need to rollback and retry. This is not supposed to be possible
-          (since there is a deadlock, at least one transaction should be
-          blocked from reaching commit), but this seems a fragile ensurance,
-          and there were historically a number of subtle bugs in this area.
+          will need to rollback and retry.
+
+          Normally this will not happen, since the kill is there to resolve a
+          deadlock that is preventing at least one transaction from proceeding.
+          One case it can happen is with InnoDB dict stats update, which can
+          temporarily cause transactions to block each other, but locks are
+          released immediately, they don't linger until commit. There could be
+          other similar cases, there were historically a number of subtle bugs
+          in this area.
+
+          But once we start the commit, we can expect that no new lock
+          conflicts will be introduced. So by handling any lingering deadlock
+          kill at this point just before mark_start_commit(), we should be
+          robust even towards spurious deadlock kills.
        */
+        if (rgi->killed_for_retry != rpl_group_info::RETRY_KILL_NONE)
+          wait_for_pending_deadlock_kill(thd, rgi);
        if (!thd->killed)
        {
          DEBUG_SYNC(thd, "rpl_parallel_before_mark_start_commit");

--- a/sql/rpl_rli.cc
+++ b/sql/rpl_rli.cc
@@ -2518,6 +2518,23 @@ rpl_group_info::unmark_start_commit()

  e= this->parallel_entry;
  mysql_mutex_lock(&e->LOCK_parallel_entry);
+  /*
+    Assert that we have not already wrongly completed this GCO and signalled
+    the next one to start, only to now unmark and make the signal invalid.
+    This is to catch problems like MDEV-34696.
+
+    The error inject rpl_parallel_simulate_temp_err_xid is used to test this
+    precise situation, that we handle it gracefully if it somehow occurs in a
+    release build. So disable the assert in this case.
+  */
+#ifndef DBUG_OFF
+  bool allow_unmark_after_complete= false;
+  DBUG_EXECUTE_IF("rpl_parallel_simulate_temp_err_xid",
+                  allow_unmark_after_complete= true;);
+  DBUG_ASSERT(!gco->next_gco ||
+              gco->next_gco->wait_count > e->count_committing_event_groups ||
+              allow_unmark_after_complete);
+#endif
  --e->count_committing_event_groups;
  mysql_mutex_unlock(&e->LOCK_parallel_entry);
 }
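
The condition checked by the new assertion in unmark_start_commit() can be restated as a small standalone predicate. This is a minimal sketch, assuming (as the assertion suggests) that the next GCO is released once count_committing_event_groups reaches its wait_count; ToyGco, ToyEntry and unmark_is_safe() are hypothetical names that only mirror the fields used in the patch.

```cpp
#include <cstdint>

// Toy stand-ins for the fields referenced by the assertion.
struct ToyGco {
  std::uint64_t wait_count;  // commits the GCO waits for before starting
  const ToyGco *next_gco;    // following group commit, or nullptr
};

struct ToyEntry {
  std::uint64_t count_committing_event_groups;
};

// True when decrementing the commit counter cannot invalidate a signal
// already given to the next GCO: either there is no next GCO, or the
// counter has not yet reached the next GCO's wait_count.
bool unmark_is_safe(const ToyEntry &e, const ToyGco &gco) {
  return gco.next_gco == nullptr ||
         gco.next_gco->wait_count > e.count_committing_event_groups;
}
```

In this model, a counter of 2 against a next-GCO wait_count of 3 is safe to decrement, while a counter of 3 means the next GCO may already have been released, which is the situation the assertion is meant to catch.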