    MDEV-24391 heap-use-after-free in fil_space_t::flush_low() · 8677c14e
    Marko Mäkelä authored
    We observed a race condition that involved two threads
    executing fil_flush_file_spaces() and one thread
    executing fil_delete_tablespace(). After one of the
    fil_flush_file_spaces() threads had observed that
    space.needs_flush_not_stopping() holds and was
    releasing fil_system.mutex, the other fil_flush_file_spaces()
    thread would complete the execution of fil_space_t::flush_low()
    on the same tablespace. Then, fil_delete_tablespace() would
    destroy the object, because the value of fil_space_t::n_pending
    did not prevent that. Finally, the first fil_flush_file_spaces()
    thread would resume execution and invoke
    fil_space_t::flush_low() on the freed object.
    
    This race condition was introduced in
    commit 118e258a of MDEV-23855.
    
    fil_space_t::flush(): Add a template parameter that indicates
    whether the caller is holding a reference to prevent the
    tablespace from being freed.
    
    buf_dblwr_t::flush_buffered_writes_completed(),
    row_quiesce_table_start(): Acquire a reference for the duration
    of the fil_space_t::flush_low() operation. It should be impossible
    for the object to be freed in these code paths, but we want to
    satisfy the debug assertions.
    
    fil_space_t::flush_low(): Do not increment or decrement the
    reference count, but instead assert that the caller is holding
    a reference.
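
The division of labour between flush() and flush_low() described above can be illustrated with a minimal sketch. This is not the InnoDB code: RefCountedSpace, its members, and the flushes counter are hypothetical stand-ins, and the real fil_space_t::n_pending may carry more than a plain count.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for fil_space_t; not the InnoDB definition.
struct RefCountedSpace {
  std::atomic<uint32_t> n_pending{0};  // assumed to be a plain reference count
  int flushes = 0;                     // illustration only: counts flush_low() calls

  void acquire() { n_pending.fetch_add(1); }
  void release() { n_pending.fetch_sub(1); }
  bool referenced() const { return n_pending.load() != 0; }

  // Does not touch the reference count; only asserts that the caller
  // is already holding a reference.
  void flush_low() {
    assert(referenced());
    ++flushes;  // the actual file flush would happen here
  }

  // The template parameter states whether the caller already holds a
  // reference. If not, one is taken around the flush_low() call.
  template <bool have_reference>
  void flush() {
    if (have_reference) {
      flush_low();
    } else {
      acquire();
      flush_low();
      release();
    }
  }
};
```

A caller that already pinned the object would use flush<true>(), and any other caller flush<false>(). In both cases the assertion in flush_low() holds.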
    
    fil_space_extend_must_retry(), fil_flush_file_spaces():
    Acquire a reference before releasing fil_system.mutex.
    This is what fixes the race condition.
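
The ordering that matters, taking the reference while the mutex is still held, can be sketched as follows. All names here (Space, registry_mutex, flush_one, can_delete) are hypothetical and only mirror the roles of fil_space_t, fil_system.mutex, fil_flush_file_spaces() and the check made on deletion. The actual InnoDB logic is more involved.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for fil_space_t; not the InnoDB definition.
struct Space {
  std::atomic<uint32_t> n_pending{0};   // assumed to be a plain reference count
  std::atomic<bool> needs_flush{true};
  int flushes = 0;                      // illustration only

  void acquire() { n_pending.fetch_add(1); }
  void release() { n_pending.fetch_sub(1); }
  bool referenced() const { return n_pending.load() != 0; }

  void flush_low() {
    assert(referenced());  // the caller must have pinned the object
    needs_flush.store(false);
    ++flushes;
  }
};

std::mutex registry_mutex;  // plays the role of fil_system.mutex

// Flush one tablespace if it needs flushing.
void flush_one(Space &space) {
  std::unique_lock<std::mutex> lock(registry_mutex);
  if (!space.needs_flush.load())
    return;
  space.acquire();  // pin the object BEFORE releasing the mutex
  lock.unlock();    // a concurrent deleter now sees the reference
  space.flush_low();
  space.release();
}

// A deleter would check this under the mutex before freeing the object.
bool can_delete(const Space &space) {
  std::lock_guard<std::mutex> guard(registry_mutex);
  return !space.referenced();
}
```

If acquire() were called only after lock.unlock(), there would be a window in which can_delete() returns true while flush_one() is about to call flush_low(). That window is the one described in the race above.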
row0quiesce.cc 18.2 KB