MDEV-33819 The purge of committed history is mis-parsing some log
In commit aa719b50 (part of MDEV-32050) a bug was introduced in the function purge_sys_t::choose_next_log(), which reimplements some logic that previously was part of trx_purge_read_undo_rec(). We must invoke trx_undo_get_first_rec() with the page number and offset of the undo log header, but we were incorrectly invoking it on the current undo page number, which caused us to parse undo records starting at an incorrect offset.

purge_sys_t::choose_next_log(): Pass the correct parameter to trx_undo_page_get_first_rec().

trx_undo_page_get_next_rec(), trx_undo_page_get_first_rec(), trx_undo_page_get_last_rec(): Add debug assertions and make the code more robust by returning nullptr on corruption. Should we detect any corrupted undo logs during the purge of committed transaction history, the sanest thing to do is to pretend that the end of an undo log was reached. If any garbage is left in the tables, it will be ignored by anything else than CHECK TABLE ... EXTENDED, and it can be removed by OPTIMIZE TABLE.

Thanks to Matthias Leich for providing an "rr replay" trace where this bug could be found.

Reviewed by: Vladislav Lesin
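
The "return nullptr on corruption" approach can be illustrated with a minimal sketch. This is not the InnoDB code: the page layout, the constants kPageSize, kHdrEnd and kFreeOff, and the helpers read_2(), write_2() and first_rec() are all hypothetical, chosen only to show how a bounds-checked lookup treats an inconsistent offset the same way as the end of the log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified page layout (NOT the real InnoDB undo page format):
//   bytes 0..1       offset of the first free byte on the page
//   at a log header  2-byte offset of the first record of that log (0 = none)
constexpr std::size_t kPageSize = 256;
constexpr std::size_t kHdrEnd = 8;   // no log header may start before this
constexpr std::size_t kFreeOff = 0;  // where the "first free byte" field lives

// Read a big-endian 16-bit value.
inline std::uint16_t read_2(const std::uint8_t *p)
{
  return static_cast<std::uint16_t>(p[0] << 8 | p[1]);
}

// Write a big-endian 16-bit value.
inline void write_2(std::uint8_t *p, std::uint16_t v)
{
  p[0] = static_cast<std::uint8_t>(v >> 8);
  p[1] = static_cast<std::uint8_t>(v & 0xff);
}

// Return the first record of the log whose header starts at hdr_offset.
// Return nullptr if the log is empty OR if any offset is inconsistent,
// so that the caller treats corruption like the end of the log.
const std::uint8_t *first_rec(const std::uint8_t *page, std::size_t hdr_offset)
{
  assert(page != nullptr);  // caller contract, not a corruption check

  const std::size_t free_off = read_2(page + kFreeOff);
  if (free_off > kPageSize || hdr_offset < kHdrEnd || hdr_offset + 2 > free_off)
    return nullptr;  // header lies outside the used part of the page

  const std::size_t rec = read_2(page + hdr_offset);
  if (rec == 0)
    return nullptr;  // empty log
  if (rec < hdr_offset + 2 || rec >= free_off)
    return nullptr;  // record offset points outside the valid range

  return page + rec;
}
```

In this sketch, passing an offset that is not the log header (the analogue of the bug described above) makes the function read an unrelated field; the range checks then reject the result instead of letting the caller parse records from the wrong place.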