    MDEV-34458 wait_for_read in buf_page_get_low hurts performance · 73ad436e
    mariadb-DebarunBanerjee authored
    The performance regression seen while loading the buffer pool (BP) is
    caused by the deadlock fix in MDEV-33543. The area of impact is wider,
    but it is most visible when the BP is initially being loaded via DMLs.
    Specifically, response time can suffer when a DML performs a
    pessimistic index operation (split/merge) and the leaf pages are not
    found in the buffer pool. This is more likely with a small BP size.
    
    The origin of the issue dates back to MDEV-30400, which introduced
    btr_cur_t::search_leaf() as a replacement for
    btr_cur_search_to_nth_level() for leaf page searches. In
    btr_latch_prev, we use RW_NO_LATCH to get the previous page fixed in
    the BP without latching it. When the page is not in the BP, we try to
    acquire the S latch and wait for it, which violates the latching order.
    
    This deadlock was analyzed in MDEV-33543 and fixed by using the wait
    logic already present in buf_page_get_gen() instead of waiting for the
    latch. That wait logic is inferior to the usual S latch wait: it is
    simply a repeated sleep of 100 microseconds (the actual sleep time
    could be longer depending on the platform). The bug was seen in the
    "change-buffering" code path, and the assumption was that this path
    would be rarely exercised. That judgement was not correct: the path is
    actually quite frequent, and it does hurt performance when pages are
    not in the BP and are being loaded by DMLs expanding or shrinking
    large data.
    
    Fix: When trying to get a page with RW_NO_LATCH while attempting an
    "out of order" latch, return from buf_page_get_gen immediately instead
    of waiting, and follow the ordered latching path.
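
The control flow of the fix can be illustrated with a small standalone
model. This is only a sketch and not the InnoDB code: Page, Stats, Path,
try_fix_no_latch(), wait_with_latch() and latch_prev() are hypothetical
names invented for the illustration, and the "read in progress" state is
reduced to a simple countdown.

```cpp
#include <cassert>

// Toy model only: none of these names are real InnoDB identifiers.
struct Page {
  int id;
  int polls_until_ready;  // > 0 models "page is still being read into the BP"
};

struct Stats {
  int sleeps = 0;       // iterations of the sleep-and-retry loop
  int latch_waits = 0;  // blocking waits taken in the correct latch order
};

enum class Path { Fast, OrderedRetry };

// Models a RW_NO_LATCH request. When the request is "out of order" and the
// page is not ready, return nullptr at once instead of sleeping (the fix).
Page *try_fix_no_latch(Page &page, bool out_of_order, Stats &stats)
{
  while (page.polls_until_ready > 0) {
    if (out_of_order)
      return nullptr;          // do not wait; the caller retries in order
    ++stats.sleeps;            // stands in for the repeated 100 us sleep
    --page.polls_until_ready;
  }
  return &page;
}

// Models the ordered path: no later latch is held, so blocking is safe.
Page *wait_with_latch(Page &page, Stats &stats)
{
  ++stats.latch_waits;         // a single ordinary latch wait
  page.polls_until_ready = 0;  // the read has completed when the wait returns
  return &page;
}

// Models the caller that wants the previous page while holding the current one.
Path latch_prev(Page &prev, Stats &stats)
{
  if (try_fix_no_latch(prev, /*out_of_order=*/true, stats))
    return Path::Fast;
  // Page not ready: fall back to acquiring the latches in the proper order.
  Page *p = wait_with_latch(prev, stats);
  assert(p == &prev);
  (void) p;                    // avoids an unused warning when NDEBUG is set
  return Path::OrderedRetry;
}
```

In this model, a page that is already in the pool takes the fast path, while
a page still being read triggers one ordered retry and never enters the
sleep loop.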
buf0buf.h 71.7 KB