1. 21 Apr, 2021 37 commits
  2. 20 Apr, 2021 1 commit
  3. 19 Apr, 2021 2 commits
    • MDEV-25404: ssux_lock_low: Introduce a separate writer mutex · 8751aa73
      Marko Mäkelä authored
      Having both readers and writers use a single lock word in
      futex system calls caused a performance regression compared to
      SRW_LOCK_DUMMY (a mutex and 2 condition variables).
      A contributing factor is that we did not accurately keep
      track of the number of waiting threads and thus had to invoke
      system calls to wake up any waiting threads.
      
      SUX_LOCK_GENERIC: Renamed from SRW_LOCK_DUMMY. This is the
      original implementation, with rw_lock (std::atomic<uint32_t>),
      a mutex and two condition variables. Using a separate writer
      mutex (as described below) is not possible, because the mutex ownership
      in a buf_block_t::lock must be able to transfer from a write submitter
      thread to an I/O completion thread, and pthread_mutex_lock() may assume
      that the submitter thread is recursively acquiring the mutex that it
      already holds, while in reality the I/O completion thread is the real
      owner. POSIX does not define an interface for requesting a mutex to
      be non-recursive.
      
      On Microsoft Windows, srw_lock_low will remain a simple wrapper of
      SRWLOCK. On 32-bit Microsoft Windows, sizeof(SRWLOCK)=4 while
      sizeof(srw_lock_low)=8.
      
      On other platforms, srw_lock_low is an alias of ssux_lock_low,
      the Simple (non-recursive) Shared/Update/eXclusive lock.
      
      In the futex-based implementation of ssux_lock_low (Linux, OpenBSD,
      Microsoft Windows), we shall use a dedicated mutex for exclusive
      requests (writer), and have a WRITER flag in the 'readers' lock word
      to inform that a writer is holding the lock or waiting for the lock to
      be granted. When the WRITER flag is set, all lock requests must acquire
      the writer mutex. Normally, shared (S) lock requests simply perform a
      compare-and-swap on the 'readers' word.
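
The S and X paths described above can be sketched roughly as follows. This is a hypothetical illustration, not the MariaDB code: the class name sketch_ssux is invented, std::mutex stands in for srw_mutex, and std::this_thread::yield() stands in for the futex wait that a real implementation would use.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical sketch of a 'readers' word with a WRITER flag plus a
// dedicated writer mutex. Not the actual ssux_lock_low implementation.
class sketch_ssux {
  static constexpr uint32_t WRITER = 1U << 31;  // X lock held or requested
  std::atomic<uint32_t> readers{0};             // S lock count + WRITER flag
  std::mutex writer;                            // stand-in for srw_mutex

public:
  // Fast path: compare-and-swap on 'readers' while WRITER is clear.
  bool rd_lock_try() {
    uint32_t r = readers.load(std::memory_order_relaxed);
    while (!(r & WRITER))
      if (readers.compare_exchange_weak(r, r + 1, std::memory_order_acquire,
                                        std::memory_order_relaxed))
        return true;
    return false;
  }

  void rd_lock() {
    while (!rd_lock_try()) {
      // Slow path: WRITER is set, so queue up behind the writer mutex.
      std::lock_guard<std::mutex> g(writer);
      if (rd_lock_try()) return;
    }
  }

  void rd_unlock() { readers.fetch_sub(1, std::memory_order_release); }

  void wr_lock() {
    writer.lock();                                       // exclude other writers
    readers.fetch_or(WRITER, std::memory_order_acquire); // block new readers
    while (readers.load(std::memory_order_acquire) != WRITER)
      std::this_thread::yield();  // assumption: stand-in for a futex wait
  }

  void wr_unlock() {
    readers.store(0, std::memory_order_release);  // clear WRITER first
    writer.unlock();                              // then admit queued requests
  }
};
```

Because WRITER is only ever set by the holder of the writer mutex, a reader that has acquired the mutex on the slow path can expect its retry of the compare-and-swap to succeed.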
      
      Update locks are implemented as a combination of writer mutex
      and a normal counter in the 'readers' lock word. Mutual exclusion
      between U and X locks is enforced by the writer mutex.
      Unlike SUX_LOCK_GENERIC, wr_u_downgrade() will not wake up any pending
      rd_lock() waits. They will wait until u_unlock() releases the writer mutex.
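
A minimal sketch of how U locks could combine the writer mutex with the 'readers' counter is given here. It is an assumption-laden illustration, not the actual code: the class name sketch_sux and the exact arithmetic are invented, std::mutex stands in for srw_mutex, and yield() stands in for a futex wait.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical sketch of U lock handling; not the MariaDB implementation.
class sketch_sux {
  static constexpr uint32_t WRITER = 1U << 31;
  std::atomic<uint32_t> readers{0};  // S/U holder count + WRITER flag
  std::mutex writer;                 // held by the U or X lock holder

public:
  bool rd_lock_try() {
    uint32_t r = readers.load(std::memory_order_relaxed);
    while (!(r & WRITER))
      if (readers.compare_exchange_weak(r, r + 1, std::memory_order_acquire,
                                        std::memory_order_relaxed))
        return true;
    return false;
  }
  void rd_unlock() { readers.fetch_sub(1, std::memory_order_release); }

  // U = writer mutex (conflicts with other U and X) + one ordinary count.
  void u_lock() {
    writer.lock();
    readers.fetch_add(1, std::memory_order_acquire);
  }
  void u_unlock() {
    readers.fetch_sub(1, std::memory_order_release);
    writer.unlock();  // only now can requests queued on the mutex proceed
  }

  // U -> X: trade our count for the WRITER flag, then let S holders drain.
  void u_wr_upgrade() {
    readers.fetch_add(WRITER - 1, std::memory_order_acquire);
    while (readers.load(std::memory_order_acquire) != WRITER)
      std::this_thread::yield();  // assumption: stand-in for a futex wait
  }

  // X -> U: clear WRITER and keep one count. The writer mutex stays held,
  // so requests already blocked on it keep waiting until u_unlock().
  void wr_u_downgrade() { readers.store(1, std::memory_order_release); }
};
```

In this sketch, a new S request arriving after wr_u_downgrade() succeeds on the fast path, while one that was already blocked on the writer mutex stays blocked until u_unlock(), which matches the behaviour described above.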
      
      The ssux_lock_low is always wrapped by sux_lock (with a recursion count
      of U and X locks), used for dict_index_t::lock and buf_block_t::lock.
      Their memory footprint for the futex-based implementation will increase
      by sizeof(srw_mutex), or 4 bytes.
      
      This change addresses a performance regression in read-only benchmarks,
      such as sysbench oltp_read_only. Write performance also improved.
      
      On 32-bit Linux and OpenBSD, lock_sys_t::hash_table will allocate
      two hash table elements for each srw_lock (14 instead of 15 hash
      table cells per 64-byte cache line on IA-32). On Microsoft Windows,
      sizeof(SRWLOCK)==sizeof(void*) and there is no change.
      
      Reviewed by: Vladislav Vaintroub
      Tested by: Axel Schwenke and Vladislav Vaintroub
    • MDEV-25404: Optimize srw_mutex on Linux, OpenBSD, Windows · 040c16ab
      Marko Mäkelä authored
      On Linux, OpenBSD and Microsoft Windows, srw_mutex was an alias for a
      rw-lock while we only need mutex functionality. Let us implement a
      futex-based mutex with one bit for HOLDER and 31 bits for counting
      waiting requests.
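
The lock word layout (one HOLDER bit, 31 bits of waiter count) might look roughly like the sketch below. Everything here is hypothetical: the class name sketch_mutex, the waiters() accessor and the bool return value of unlock() are invented for illustration, and yield() is only a portable stand-in for the futex wait and wake calls.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical sketch of a mutex lock word; not the MariaDB srw_mutex code.
class sketch_mutex {
  static constexpr uint32_t HOLDER = 1U << 31;  // top bit: mutex is held
  std::atomic<uint32_t> word{0};                // low 31 bits: waiting requests

public:
  // Set HOLDER; we own the mutex only if the bit was previously clear.
  bool try_lock() {
    return !(word.fetch_or(HOLDER, std::memory_order_acquire) & HOLDER);
  }

  void lock() {
    if (try_lock()) return;
    word.fetch_add(1, std::memory_order_relaxed);  // register as a waiter
    while (!try_lock())
      std::this_thread::yield();  // assumption: stand-in for a futex wait
    word.fetch_sub(1, std::memory_order_relaxed);  // no longer waiting
  }

  // Clear HOLDER. Returns true if waiters were registered, that is, when a
  // real implementation would have to issue a wake-up system call.
  bool unlock() {
    uint32_t old = word.fetch_and(~HOLDER, std::memory_order_release);
    return (old & ~HOLDER) != 0;
  }

  uint32_t waiters() const {
    return word.load(std::memory_order_relaxed) & ~HOLDER;
  }
};
```

The point of counting waiters is visible in unlock(): when the count is zero, no wake-up is needed at all.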
      
      srw_lock::wr_unlock() can avoid waking up a waiter when no waiting
      requests exist. (Previously, we only had a 1-bit rw_lock::WRITER_WAITING
      flag that could be wrongly cleared if multiple waiting wr_lock() requests
      existed. Now we have no problem with up to 2,147,483,648 conflicting threads.)
      
      On 64-bit Microsoft Windows, the advantage is that
      sizeof(srw_mutex) is 4, while sizeof(SRWLOCK) would be 8.
      
      Reviewed by: Vladislav Vaintroub