    MDEV-31767 InnoDB tables are being flagged as corrupted on an I/O bound server · b102872a
    Marko Mäkelä authored
    The main problem is that ever since
    commit aaef2e1d removed the
    function buf_wait_for_read(), it is not safe to invoke
    buf_page_get_low() with RW_NO_LATCH, that is, only buffer-fixing
    the page. If a page read (or decryption or decompression) is in
    progress, there would be a race condition when executing consistency
    checks, and a page would wrongly be flagged as corrupted.
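
    As a rough illustration of why buffer-fixing alone is not enough, the
    sketch below models a frame whose latch is held exclusively while it is
    being filled. Frame, read_page_sketch() and page_is_consistent_sketch()
    are hypothetical names for this example and are not InnoDB code; the
    "checksum" is a toy stand-in for the real consistency checks.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <shared_mutex>

// Hypothetical stand-in for a buffer pool frame; not InnoDB's buf_page_t.
struct Frame {
    std::shared_mutex latch;             // assumed: held exclusively during the read
    std::atomic<uint32_t> fix_count{0};  // buffer-fix: pins the frame, nothing more
    uint32_t payload = 0;
    uint32_t checksum = 0;               // toy check: equals payload once complete
};

// I/O side: fill the frame while holding the exclusive latch.
void read_page_sketch(Frame& f, uint32_t value)
{
    std::unique_lock<std::shared_mutex> x(f.latch);
    f.payload = value;   // between these two stores the frame is inconsistent
    f.checksum = value;
}

// Reader side: buffer-fix AND take a shared latch before checking.
// With only the buffer-fix, the check could run between the two stores
// above and report a perfectly good page as corrupted.
bool page_is_consistent_sketch(Frame& f)
{
    f.fix_count.fetch_add(1);
    bool ok;
    {
        std::shared_lock<std::shared_mutex> s(f.latch);
        ok = f.payload == f.checksum;
    }
    f.fix_count.fetch_sub(1);
    return ok;
}
```

    The shared latch is what makes the reader wait for the in-progress
    read; the fix count only keeps the frame from being evicted.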
    
    Furthermore, if the page is actually corrupted and the initial
    access to it was with RW_NO_LATCH (only buffer-fixing), the
    page read handler would likely end up in an infinite loop in
    buf_pool_t::corrupted_evict(). It is not safe to invoke
    mtr_t::upgrade_buffer_fix() on a block on which a page latch
    was not initially acquired in buf_page_get_low().
    
    btr_block_reget(): Remove the constant parameter rw_latch=RW_X_LATCH.
    
    btr_block_get(): Assert that RW_NO_LATCH is not being used,
    and change the parameter type of rw_latch.
    
    btr_pcur_move_to_next_page(), innobase_table_is_empty(): Adjust for the
    parameter type change of btr_block_get().
    
    btr_root_block_get(): If mode==RW_NO_LATCH, do not check the integrity of
    the page, because it is not safe to do so.
    
    btr_page_alloc_low(), btr_page_free(): If the root page latch is not
    previously held by the mini-transaction, invoke btr_root_block_get()
    again with the proper latching mode.
    
    btr_latch_prev(): Helper function to safely acquire a latch on a
    preceding sibling page while holding a latch on a B-tree page.
    To avoid deadlocks, we must not wait for the latch while holding
    a latch on the current page, because another thread may be waiting
    for our page latch when moving to the next page from our preceding
    sibling page. If s_lock_try() or x_lock_try() on the preceding page fails,
    we must release the current page latch, and wait for the latch on the
    preceding page as well as the current page, in that order.
    Page splits or merges will be prevented by the parent page latch
    that we are holding.
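
    The try-then-reorder protocol described above can be sketched as
    follows. This is a minimal illustration using std::shared_mutex in
    exclusive mode only; Page and latch_prev_sketch() are hypothetical
    names, and the real btr_latch_prev() operates on InnoDB buffer blocks
    and mini-transactions rather than on plain mutexes.

```cpp
#include <shared_mutex>

// Hypothetical stand-in for a B-tree page; not InnoDB's buf_block_t.
struct Page {
    std::shared_mutex latch;
};

// Precondition: the caller holds the exclusive latch on `cur` (and, in the
// scenario of the commit message, a latch on the parent page that keeps
// `prev` and `cur` from being split or merged meanwhile).
// Postcondition: the caller holds exclusive latches on both pages.
// Returns true if `cur` was released and re-acquired, in which case the
// caller must not trust anything it read from `cur` before the call.
bool latch_prev_sketch(Page& prev, Page& cur)
{
    // Fast path: never block on `prev` while holding `cur`.
    if (prev.latch.try_lock()) {
        return false;
    }
    // Slow path: give up `cur`, then wait in left-to-right order so that a
    // thread moving from `prev` to `cur` cannot deadlock with us.
    cur.latch.unlock();
    prev.latch.lock();
    cur.latch.lock();
    return true;
}
```

    Waiting for the left sibling only after releasing the current page
    keeps every blocking acquisition in the same left-to-right order that
    forward scans use.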
    
    btr_cur_t::search_leaf(): Make use of btr_latch_prev().
    
    btr_cur_t::open_leaf(): Make use of btr_latch_prev(). Do not invoke
    mtr_t::upgrade_buffer_fix() (when latch_mode == BTR_MODIFY_TREE),
    because we will already have acquired all page latches upfront.
    
    btr_cur_t::pessimistic_search_leaf(): Do acquire an exclusive index latch
    before accessing the page. Make use of btr_latch_prev().