MDEV-33363 CI failure: innodb.import_corrupted: Assertion failed: oldest_lsn >...

MDEV-33363 CI failure: innodb.import_corrupted: Assertion failed: oldest_lsn > log_sys.last_checkpoint_lsn

This regression was introduced in MDEV-28708, where MTR_LOG_NO_REDO mtrs are assigned last_checkpoint_lsn as their LSN. This races with a checkpoint that is in the pending state: the concurrent checkpoint writes a larger checkpoint LSN after pages carrying the older checkpoint LSN have been inserted into the flush list. The next checkpoint sees the reversal in the checkpoint sequence and asserts if those pages have not yet been flushed.

There are several ways to solve this issue. Ideally, the unlogged mtr should take the latest LSN rather than going back and using the previous checkpoint LSN. This was the older design and it seems sound. Apart from the critical race, using the old checkpoint LSN also adds the pages to the other end of the flush list, where they take precedence over all existing dirty pages, which is counterintuitive.
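The race described above can be illustrated with a small toy model. This is only a sketch and not InnoDB code: `ToyLog`, `Policy`, `invariant_after_race` and the LSN values 100, 150 and 200 are all hypothetical, and the flush list is reduced to an ordered set of page LSNs. It assumes a checkpoint that computed its LSN (150) before the unlogged mtr ran and completes after the page is inserted.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

using lsn_t = std::uint64_t;

// Toy stand-in for the relevant log and buffer pool state (hypothetical).
struct ToyLog {
  lsn_t current_lsn;                // plays the role of log_sys.get_lsn()
  lsn_t last_checkpoint_lsn;        // plays the role of log_sys.last_checkpoint_lsn
  std::multiset<lsn_t> flush_list;  // oldest-modification LSN of each dirty page

  // Smallest LSN among dirty pages, or the current LSN if nothing is dirty.
  lsn_t oldest_lsn() const {
    return flush_list.empty() ? current_lsn : *flush_list.begin();
  }

  // A pending checkpoint finishes and publishes its LSN.
  void complete_checkpoint(lsn_t lsn) { last_checkpoint_lsn = lsn; }

  // The condition from the failing assertion in the commit title.
  bool checkpoint_invariant_holds() const {
    return oldest_lsn() > last_checkpoint_lsn;
  }
};

// Which LSN the unlogged mtr assigns to the imported page.
enum class Policy { LastCheckpointLsn, CurrentLsn };

// Replays the interleaving: page insert first, then the pending checkpoint
// completes, then the next checkpoint evaluates the invariant.
bool invariant_after_race(Policy policy) {
  ToyLog log{200, 100, {}};
  const lsn_t pending_checkpoint_lsn = 150;  // computed before the insert

  const lsn_t page_lsn = policy == Policy::LastCheckpointLsn
                             ? log.last_checkpoint_lsn  // behaviour before the fix
                             : log.current_lsn;         // behaviour after the fix
  log.flush_list.insert(page_lsn);

  log.complete_checkpoint(pending_checkpoint_lsn);
  return log.checkpoint_invariant_holds();
}
```

In this model, `invariant_after_race(Policy::LastCheckpointLsn)` returns false because the page sits at LSN 100 behind the newly published checkpoint LSN 150, while `invariant_after_race(Policy::CurrentLsn)` returns true because the page LSN 200 stays ahead of it.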

Commit 4039d860 authored Feb 16, 2024 by mariadb-DebarunBanerjee

Showing with 8 additions and 1 deletion

storage/innobase/mtr/mtr0mtr.cc +8 -1
--- a/storage/innobase/mtr/mtr0mtr.cc
+++ b/storage/innobase/mtr/mtr0mtr.cc
@@ -234,7 +234,14 @@ static void insert_imported(buf_block_t *block)
  if (block->page.oldest_modification() <= 1)
  {
    log_sys.latch.rd_lock(SRW_LOCK_CALL);
-    const lsn_t lsn= log_sys.last_checkpoint_lsn;
+    /* For unlogged mtrs (MTR_LOG_NO_REDO), we use the current system LSN. The
+    mtr that generated the LSN is either already committed or in mtr_t::commit.
+    Shared latch and relaxed atomics should be fine here as it is guaranteed
+    that both the current mtr and the mtr that generated the LSN would have
+    added the dirty pages to flush list before we access the minimum LSN during
+    checkpoint. log_checkpoint_low() acquires exclusive log_sys.latch before
+    commencing. */
+    const lsn_t lsn= log_sys.get_lsn();
    mysql_mutex_lock(&buf_pool.flush_list_mutex);
    buf_pool.insert_into_flush_list
      (buf_pool.prepare_insert_into_flush_list(lsn), block, lsn);