    MDEV-22871: Reduce InnoDB buf_pool.page_hash contention · 5155a300
    Marko Mäkelä authored
    The rw_lock_s_lock() calls for the buf_pool.page_hash became a
    clear bottleneck after MDEV-15053 reduced the contention on
    buf_pool.mutex. We will replace that use of rw_lock_t with a
    special implementation that is optimized for memory bus traffic.
    
    The hash_table_locks instrumentation will be removed.
    
    buf_pool_t::page_hash: Use a special implementation whose API is
    compatible with hash_table_t, and store the custom rw-locks
    directly in buf_pool.page_hash.array, intentionally sharing
    cache lines with the hash table pointers.
    
    rw_lock: A low-level rw-lock implementation based on std::atomic<uint32_t>
    where read_trylock() becomes a simple fetch_add(1).
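
A minimal sketch of the idea, not the actual MariaDB code: one 32-bit word holds the reader count in the low bits and a writer flag in the top bit. The class name `sketch_rw_lock`, the `WRITER` constant, the bit layout, and the memory orderings are assumptions made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical illustration; names and bit layout are assumptions.
class sketch_rw_lock
{
  std::atomic<uint32_t> word{0};
  static constexpr uint32_t WRITER = 1U << 31; // set while a writer holds the lock

public:
  // Optimistically register as a reader with a single fetch_add(1).
  bool read_trylock()
  {
    if (!(word.fetch_add(1, std::memory_order_acquire) & WRITER))
      return true;
    // A writer holds the lock: undo our increment and report failure.
    word.fetch_sub(1, std::memory_order_relaxed);
    return false;
  }

  void read_unlock() { word.fetch_sub(1, std::memory_order_release); }

  // Succeeds only if there are no readers and no writer.
  bool write_trylock()
  {
    uint32_t expected = 0;
    return word.compare_exchange_strong(expected, WRITER,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed);
  }

  // Clear only the WRITER bit, so that transient increments made by
  // failing read_trylock() calls are not lost.
  void write_unlock() { word.fetch_and(~WRITER, std::memory_order_release); }

  uint32_t readers() const
  { return word.load(std::memory_order_relaxed) & ~WRITER; }
};
```

In the uncontended case a reader performs exactly one atomic read-modify-write on the lock word, which is the property the description above relies on.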
    
    buf_pool_t::page_hash_latch: The specialization of rw_lock for the page_hash.
    
    buf_pool_t::page_hash_latch::read_lock(): Assert that buf_pool.mutex
    is not being held by the caller.
    
    buf_pool_t::page_hash_latch::write_lock() may be called while not holding
    buf_pool.mutex. buf_pool_t::watch_set() is such a caller.
    
    buf_pool_t::page_hash_latch::read_lock_wait(),
    page_hash_latch::write_lock_wait(): The spin loops.
    These will obey the global parameters innodb_sync_spin_loops and
    innodb_sync_spin_wait_delay.
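
The shape of such a spin loop could look roughly like the sketch below. It is an assumption-laden illustration, not the committed code: `sketch_latch`, `spin_rounds`, and `spin_wait_delay` are hypothetical stand-ins (the latter two for innodb_sync_spin_loops and innodb_sync_spin_wait_delay), their values are arbitrary, and the delay is a plain busy loop.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-ins for the global parameters; values are arbitrary.
static unsigned spin_rounds = 30;
static unsigned spin_wait_delay = 4;

struct sketch_latch
{
  std::atomic<uint32_t> word{0};
  static constexpr uint32_t WRITER = 1U << 31;

  bool read_trylock()
  {
    if (!(word.fetch_add(1, std::memory_order_acquire) & WRITER))
      return true;
    word.fetch_sub(1, std::memory_order_relaxed);
    return false;
  }

  void read_unlock() { word.fetch_sub(1, std::memory_order_release); }

  // Fast path first; fall back to the spin loop only on contention.
  void read_lock()
  {
    if (!read_trylock())
      read_lock_wait();
  }

  // Retry up to spin_rounds times with a short delay between attempts,
  // then yield the CPU and start over.
  void read_lock_wait()
  {
    for (;;)
    {
      for (unsigned i = 0; i < spin_rounds; i++)
      {
        if (read_trylock())
          return;
        for (unsigned d = 0; d < spin_wait_delay; d++)
          (void) word.load(std::memory_order_relaxed); // crude busy delay
      }
      std::this_thread::yield();
    }
  }
};
```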
    
    buf_pool_t::freed_page_hash: A singly linked list of copies of
    buf_pool.page_hash that ever existed. The fact that we never
    free any buf_pool.page_hash.array guarantees that all
    page_hash_latch that ever existed will remain valid until shutdown.
    
    buf_pool_t::resize_hash(): Replaces buf_pool_resize_hash().
    Prepend a shallow copy of the old page_hash to freed_page_hash.
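
The retire-instead-of-free scheme described above can be sketched as follows. This is a simplified model under stated assumptions: `sketch_hash_table`, `sketch_pool`, the `next` link, and `create()` are hypothetical, and rehashing of the existing elements is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical, simplified stand-in for the hash table descriptor.
struct sketch_hash_table
{
  std::size_t n_cells = 0;
  void **array = nullptr;
  sketch_hash_table *next = nullptr; // link in the list of retired tables
};

struct sketch_pool
{
  sketch_hash_table page_hash;
  sketch_hash_table *freed_page_hash = nullptr;

  void create(std::size_t n)
  {
    page_hash.n_cells = n;
    page_hash.array = static_cast<void**>(std::calloc(n, sizeof(void*)));
  }

  // Allocate a new array, but keep the old one reachable: a shallow copy
  // of the old descriptor is prepended to freed_page_hash, so anything
  // stored in the old array stays valid until the pool is destroyed.
  void resize_hash(std::size_t n)
  {
    sketch_hash_table *old = new sketch_hash_table(page_hash);
    old->next = freed_page_hash;
    freed_page_hash = old;

    page_hash.array = static_cast<void**>(std::calloc(n, sizeof(void*)));
    page_hash.n_cells = n;
    // Rehashing of the elements into the new array is omitted here.
  }

  // Only at shutdown are the retired arrays actually released.
  ~sketch_pool()
  {
    while (sketch_hash_table *t = freed_page_hash)
    {
      freed_page_hash = t->next;
      std::free(t->array);
      delete t;
    }
    std::free(page_hash.array);
  }
};
```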
    
    buf_pool_t::page_hash_table::n_cells: Declare as Atomic_relaxed.
    
    buf_pool_t::page_hash_table::lock(): Explain what prevents a
    race condition with buf_pool_t::resize_hash().