MDEV-31767 InnoDB tables are being flagged as corrupted on an I/O bound server
The main problem is that at ever since commit aaef2e1d removed the function buf_wait_for_read(), it is not safe to invoke buf_page_get_low() with RW_NO_LATCH, that is, only buffer-fixing the page. If a page read (or decryption or decompression) is in progress, there would be a race condition when executing consistency checks, and a page would wrongly be flagged as corrupted. Furthermore, if the page is actually corrupted and the initial access to it was with RW_NO_LATCH (only buffer-fixing), the page read handler would likely end up in an infinite loop in buf_pool_t::corrupted_evict(). It is not safe to invoke mtr_t::upgrade_buffer_fix() on a block on which a page latch was not initially acquired in buf_page_get_low(). btr_block_reget(): Remove the constant parameter rw_latch=RW_X_LATCH. btr_block_get(): Assert that RW_NO_LATCH is not being used, and change the parameter type of rw_latch. btr_pcur_move_to_next_page(), innobase_table_is_empty(): Adjust for the parameter type change of btr_block_get(). btr_root_block_get(): If mode==RW_NO_LATCH, do not check the integrity of the page, because it is not safe to do so. btr_page_alloc_low(), btr_page_free(): If the root page latch is not previously held by the mini-transaction, invoke btr_root_block_get() again with the proper latching mode. btr_latch_prev(): Helper function to safely acquire a latch on a preceding sibling page while holding a latch on a B-tree page. To avoid deadlocks, we must not wait for the latch while holding a latch on the current page, because another thread may be waiting for our page latch when moving to the next page from our preceding sibling page. If s_lock_try() or x_lock_try() on the preceding page fails, we must release the current page latch, and wait for the latch on the preceding page as well as the current page, in that order. Page splits or merges will be prevented by the parent page latch that we are holding. btr_cur_t::search_leaf(): Make use of btr_latch_prev(). btr_cur_t::open_leaf(): Make use of btr_latch_prev(). Do not invoke mtr_t::upgrade_buffer_fix() (when latch_mode == BTR_MODIFY_TREE), because we will already have acquired all page latches upfront. btr_cur_t::pessimistic_search_leaf(): Do acquire an exclusive index latch before accessing the page. Make use of btr_latch_prev().
Showing
Please register or sign in to comment