    MDEV-34830: LSN in the future is not being treated as serious corruption · 1e25202a
    Marko Mäkelä authored
    The invariant of write-ahead logging is that before any change to a
    page is written to the data file, the corresponding log record
    must first have been durably written.
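
    As a minimal sketch of this invariant (a toy model, not InnoDB code;
    wal_model, flush_log_up_to(), may_write_page() and write_page_safely()
    are hypothetical names chosen for illustration), a page write is
    permitted only once the log is durable up to the page's LSN:

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

// Hypothetical, simplified model; none of these names exist in InnoDB.
struct wal_model {
  lsn_t durable_lsn = 0;  // the log is durably written up to this LSN

  // Pretend to make the log durable up to (at least) lsn.
  void flush_log_up_to(lsn_t lsn) {
    if (lsn > durable_lsn)
      durable_lsn = lsn;
  }

  // A page whose latest change carries page_lsn may be written to the
  // data file only after the log covering that change is durable.
  bool may_write_page(lsn_t page_lsn) const {
    return page_lsn <= durable_lsn;
  }
};

// Convenience helper: write the log first, then allow the page write.
inline bool write_page_safely(wal_model &wal, lsn_t page_lsn) {
  wal.flush_log_up_to(page_lsn);
  return wal.may_write_page(page_lsn);
}
```

    A page found in a data file with an LSN beyond the end of the durable
    log corresponds to may_write_page() having been violated at some point.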
    
    On crash recovery, there were some sloppy checks for this. Let us
    implement accurate checks and flag an inconsistency as a hard error,
    so that we can avoid further corruption of a corrupted database.
    For data extraction from the corrupted database, innodb_force_recovery=6
    can be used.
    
    A section of the test mariabackup.innodb_redo_overwrite
    that parses some mariadb-backup --backup output has
    been removed, because as a result of these changes
    the message "redo log block is overwritten" would often be
    missing from that output in a Microsoft Windows environment.
    
    recv_sys_t::max_page_lsn: Replaces recv_max_page_lsn.
    
    recv_sys_t::early_batch: Whether apply(false) is executing.
    Before the final recovery batch, we will not have read the
    log records until the end and therefore will not know the final LSN.
    
    recv_lsn_checks_on: Remove.
    
    recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
    condition at the end of the recovery. This includes validating
    max_page_lsn in case a multi-batch recovery was executed.
    
    recv_dblwr_t::validate_page(): Keep track of the maximum LSN
    (if we are checking a non-doublewrite copy of a page) but
    do not complain about the LSN being in the future. The doublewrite buffer
    is a special case, because it will be read early during recovery.
    Besides, starting with commit 762bcb81
    the dblwr=true copies of pages may legitimately be "too new".
    
    recv_sys_t::check_page_lsn(): Validate FIL_PAGE_LSN during recovery.
    Update max_page_lsn if needed. Do not flag an error if early_batch.
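
    The interplay of early_batch, max_page_lsn and the final validation
    described above might be sketched as follows. This is an illustration
    under assumptions, not the actual recv_sys_t code: recovery_sketch,
    scanned_lsn and validate_at_end() are hypothetical names, and the real
    functions take different parameters and report errors differently.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;

// Hypothetical stand-in for the recovery state; not the real recv_sys_t.
struct recovery_sketch {
  lsn_t scanned_lsn = 0;     // end of the log that has been parsed so far
  lsn_t max_page_lsn = 0;    // largest "too new" page LSN seen so far
  bool early_batch = false;  // true before the final recovery batch

  // Returns true if the page must be flagged as corrupted right away.
  bool check_page_lsn(lsn_t page_lsn) {
    if (page_lsn <= scanned_lsn)
      return false;  // the log read so far covers this page
    if (page_lsn > max_page_lsn)
      max_page_lsn = page_lsn;
    // In an early batch the final LSN is not known yet,
    // so the verdict is deferred to the end of recovery.
    return !early_batch;
  }

  // Final check, once the log has been read up to its end at end_lsn.
  bool validate_at_end(lsn_t end_lsn) const {
    return max_page_lsn <= end_lsn;
  }
};
```

    In this sketch, a page LSN of 1500 seen while only 1000 has been
    scanned in an early batch is merely remembered; it becomes an error
    only if the log turns out to end before 1500.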
    
    recv_dblwr_t::find_page(): Find a valid page with the smallest
    FIL_PAGE_LSN that is large enough for recovery. Invoke
    recv_sys_t::check_page_lsn() on the chosen LSN so that
    "LSN in the future" can be flagged.
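
    The selection rule can be illustrated with a small sketch. The types
    and names here (page_copy, pick_copy(), min_lsn) are hypothetical, and
    "large enough" is modelled simply as lsn >= min_lsn; the real
    recv_dblwr_t::find_page() works on raw page frames and performs more
    validation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using lsn_t = std::uint64_t;

// Hypothetical description of one doublewrite copy of a page.
struct page_copy {
  lsn_t lsn;   // FIL_PAGE_LSN stored in the copy
  bool valid;  // outcome of checksum and sanity validation
};

// Returns the valid copy with the smallest LSN that is >= min_lsn,
// or nullptr if no copy qualifies.
inline const page_copy *pick_copy(const std::vector<page_copy> &copies,
                                  lsn_t min_lsn) {
  const page_copy *best = nullptr;
  for (const page_copy &c : copies) {
    if (!c.valid || c.lsn < min_lsn)
      continue;  // unusable: corrupted, or too old for recovery
    if (!best || c.lsn < best->lsn)
      best = &c;
  }
  return best;
}
```

    The LSN of the returned copy is what would then be passed on to the
    "LSN in the future" check.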
    
    recv_dblwr_t::restore_first_page(): Require the recv_sys.mutex
    to be held by the caller, and return an error code.
    
    buf_dblwr_t::recover(): Simplify the message output. Do attempt
    doublewrite recovery on user page read error. Ignore doublewrite
    pages whose FIL_PAGE_LSN is outside the usable bounds.
    
    buf_page_is_corrupted(): Distinguish the return values
    CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.
    
    buf_page_check_corrupt(): Return the error code DB_CORRUPTION
    in case the LSN is in the future.
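
    A rough sketch of how a three-way result can be mapped to an error
    code is given below. Only CORRUPTED_FUTURE_LSN, CORRUPTED_OTHER and
    the DB_CORRUPTION outcome for a future LSN come from this commit
    message; the enum types, the NOT_CORRUPTED and PAGE_CORRUPTED names
    and the mapping of other corruption are assumptions made for
    illustration.

```cpp
#include <cassert>

// CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER follow the commit message;
// the enum type itself and NOT_CORRUPTED are hypothetical.
enum class page_status { NOT_CORRUPTED, CORRUPTED_FUTURE_LSN, CORRUPTED_OTHER };

// Hypothetical error codes; CORRUPTION stands in for DB_CORRUPTION.
enum class err_sketch { SUCCESS, CORRUPTION, PAGE_CORRUPTED };

// Maps the check result to an error code. The mapping of
// CORRUPTED_OTHER to PAGE_CORRUPTED is an assumption.
inline err_sketch to_error(page_status s) {
  switch (s) {
  case page_status::NOT_CORRUPTED:
    return err_sketch::SUCCESS;
  case page_status::CORRUPTED_FUTURE_LSN:
    return err_sketch::CORRUPTION;
  case page_status::CORRUPTED_OTHER:
    return err_sketch::PAGE_CORRUPTED;
  }
  return err_sketch::PAGE_CORRUPTED;  // unreachable for valid enum values
}
```

    Keeping the two kinds of corruption distinct lets a caller treat
    "LSN in the future" as a hard error without changing how other
    corruption is handled.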
    
    Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff
    in the same way on both 32-bit and 64-bit architectures.
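
    One way to make such a check independent of the word size is to
    compare the 4-byte field as a fixed-width value, since an "all ones"
    constant of pointer width equals 0xffffffff only on 32-bit targets.
    The sketch below shows the general idea only; read_be32() and
    flags_all_ones() are hypothetical helpers, and the actual change in
    Datafile::read_first_page() may look different.

```cpp
#include <cassert>
#include <cstdint>

// Reads a 4-byte big-endian field (InnoDB stores integers big-endian).
inline std::uint32_t read_be32(const unsigned char *p) {
  return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
         (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// True if the field is 0xffffffff, regardless of whether the target
// has 32-bit or 64-bit machine words.
inline bool flags_all_ones(const unsigned char *field) {
  return read_be32(field) == UINT32_C(0xffffffff);
}
```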