  1. 02 Jan, 2021 1 commit
    • MDEV-24514 WITH_MSAN is disabling WOLFSSL_AESNI acceleration · 172ce659
      Marko Mäkelä authored
      WolfSSL is the WITH_SSL=bundled library since MDEV-18351.
      For the AMD64 architecture, the library includes some assembler code.
      That code was disabled in WITH_MSAN builds, because MemorySanitizer
      would consider any data that is computed by uninstrumented code to
      be uninitialized.
  2. 01 Jan, 2021 2 commits
  3. 26 Dec, 2020 1 commit
    • Travis-CI: Optimize rate of false negatives vs true failures · 139c85aa
      Otto Kekäläinen authored
      Move 'encryption' tests to another job, since the 'binlog' and 'rpl'
      tests are so slow and often make the job time out (after 50 minutes).
      
      Allow failure in ppc64el as it frequently fails due to out-of-memory.
      A simple restart often fixes it, but we can't depend on restarts.
      
      Also re-enable arm64 as MDEV-23955 is now fixed.
      
      MERGING: This commit is OK to merge 10.6 and upwards.
  4. 23 Dec, 2020 3 commits
  5. 22 Dec, 2020 1 commit
  6. 21 Dec, 2020 10 commits
  7. 18 Dec, 2020 3 commits
    • MDEV-24445 Using innodb_undo_tablespaces corrupts system tablespace · 0c23e32d
      Marko Mäkelä authored
      In the rewrite of MDEV-8139 (based on MDEV-15528), we introduced a
      wrong assumption that any persistent tablespace that is not an .ibd
      file is the system tablespace. This assumption is broken when
      innodb_undo_tablespaces (files undo001, undo002, ...) are being used.
      By default, we have innodb_undo_tablespaces=0 (the persistent undo
      log is being stored in the system tablespace).
      
      In MDEV-15528 and MDEV-8139 we rewrote the page scrubbing logic
      so that it will follow the tried-and-true write-ahead logging
      protocol, first writing FREE_PAGE records and then in the page
      flushing, zerofilling or hole-punching freed pages.
      
      Unfortunately, the implementation included a wrong assumption that
      anything that is not in an .ibd file must be the system tablespace.
      This wrong assumption would cause overwrites of valid data pages in
      the system tablespace.
      
      mtr_t::m_freed_in_system_tablespace: Remove.
      
      mtr_t::m_freed_space: The tablespace associated with m_freed_pages.
      
      buf_page_free(): Take the tablespace and page number as a parameter,
      instead of taking a page identifier.
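
      The change from a boolean flag to a tablespace pointer can be
      illustrated with a minimal sketch. This is not the InnoDB code:
      Kind, Space, MiniTransaction and assumed_system_tablespace() are
      hypothetical names, and the single-tablespace assertion is an
      assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical classification of persistent tablespaces.
enum class Kind { System, Undo, FilePerTable };

struct Space {
  uint32_t id;
  Kind kind;
};

// The wrong assumption: "not an .ibd file" implies "system tablespace".
// An undo tablespace (undo001, undo002, ...) is misclassified here.
inline bool assumed_system_tablespace(const Space &s) {
  return s.kind != Kind::FilePerTable;
}

// Sketch of the fixed bookkeeping: remember WHICH tablespace the freed
// pages belong to, instead of a yes/no "system tablespace" flag.
struct MiniTransaction {
  const Space *freed_space = nullptr;  // analogue of m_freed_space
  std::set<uint32_t> freed_pages;      // analogue of m_freed_pages

  void free_page(const Space &space, uint32_t page_no) {
    // Assumption of this sketch: one mini-transaction frees pages
    // in at most one tablespace.
    assert(!freed_space || freed_space == &space);
    freed_space = &space;
    freed_pages.insert(page_no);
  }
};
```

      With the pointer recorded, the flushing step can zerofill or
      hole-punch pages in the tablespace that actually owns them.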
    • MDEV-24442 Assertion space->referenced() failed in fil_crypt_space_needs_rotation · cd093d79
      Marko Mäkelä authored
      A race condition between deleting an .ibd file and fil_crypt_thread
      marking pages dirty was introduced in
      commit 118e258a (part of MDEV-23855).
      
      fil_space_t::acquire_if_not_stopped(): Correctly return false
      if the STOPPING flag is set, indicating that any further activity
      on the tablespace must be avoided. Also, remove the constant parameter
      have_mutex=true and move the function declaration to the same
      compilation unit with the only callers.
      
      fil_crypt_flush_space(): Remove an unused variable.
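
      The idea of refusing new references once a STOPPING flag is set
      can be sketched as follows. This is a simplified illustration,
      not the actual fil_space_t: the class name, the bit layout and
      the helper methods are assumptions.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a tablespace object.
class Space {
  // Assumed layout: the top bit marks "being deleted",
  // the remaining bits count active references.
  static constexpr uint32_t STOPPING = 1U << 31;
  std::atomic<uint32_t> state{0};

public:
  // Try to take a reference. Returns false and leaves the count
  // unchanged if the STOPPING flag is already set.
  bool acquire_if_not_stopped() {
    uint32_t old = state.load(std::memory_order_relaxed);
    do {
      if (old & STOPPING) return false;
    } while (!state.compare_exchange_weak(old, old + 1,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed));
    return true;
  }

  // Drop a reference that was previously acquired.
  void release() {
    uint32_t old = state.fetch_sub(1, std::memory_order_release);
    assert((old & ~STOPPING) != 0);
    (void) old;
  }

  // Mark the object as being deleted; existing references stay valid.
  void set_stopping() { state.fetch_or(STOPPING); }

  uint32_t references() const { return state.load() & ~STOPPING; }
};
```

      A background thread that gets false from acquire_if_not_stopped()
      skips the tablespace instead of marking its pages dirty.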
    • MDEV-24426 fixup: Assertion failure on shutdown · a1974d19
      Marko Mäkelä authored
      fil_crypt_find_space_to_rotate(): Always handle the sentinel value
      that indicates that we have run out of work, even if at the same
      time the thread should shut down for other reasons.
      
      Thanks to Matthias Leich for reproducing this bug with RQG.
  8. 17 Dec, 2020 1 commit
    • MDEV-24426 fil_crypt_thread keep spinning even if innodb_encryption_rotate_key_age=0 · 1fe3dd00
      Marko Mäkelä authored
      After MDEV-15528, two modes of operation remain in the
      fil_crypt_thread, depending on whether
      innodb_encryption_rotate_key_age=0 (whether key rotation is
      disabled). If key rotation is disabled, the fil_crypt_thread
      misses the opportunity to sleep, which wastes a lot of CPU time.
      
      fil_crypt_return_iops(): Add a parameter to specify whether other
      fil_crypt_thread should be woken up.
      
      fil_system_t::keyrotate_next(): Return the special value
      fil_system.temp_space to indicate that no work is to be done.
      
      fil_space_t::next(): Propagate the special value fil_system.temp_space
      to the caller.
      
      fil_crypt_find_space_to_rotate(): If no work is to be done,
      do not wake up other threads.
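
      The sentinel technique described above can be sketched like
      this. It is an illustration only: Space, no_work, Action and
      both functions are hypothetical, and the use of nullptr for
      "end of list" is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Space {
  bool needs_rotation;
};

// Hypothetical sentinel object meaning "no work is to be done".
// The commit message uses fil_system.temp_space for this purpose.
static Space no_work{false};

// Return the next space that needs key rotation, the sentinel if
// rotation is disabled, or nullptr at the end of the list.
inline Space *keyrotate_next(std::vector<Space> &spaces, std::size_t &pos,
                             bool rotation_enabled) {
  if (!rotation_enabled) return &no_work;
  while (pos < spaces.size()) {
    Space *s = &spaces[pos++];
    if (s->needs_rotation) return s;
  }
  return nullptr;
}

enum class Action { Rotate, Sleep, Done };

// The caller checks for the sentinel first, so that it goes to sleep
// (without waking other threads) instead of spinning.
inline Action next_action(const Space *s) {
  if (s == &no_work) return Action::Sleep;
  return s ? Action::Rotate : Action::Done;
}
```

      Because the sentinel is an ordinary pointer value, it can pass
      through intermediate iterator functions unchanged.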
  9. 15 Dec, 2020 12 commits
  10. 14 Dec, 2020 5 commits
    • Stepan Patryshev's avatar
      e4c25895
    • MDEV-24313 fixup: GCC 8 -Wconversion · e8217d07
      Marko Mäkelä authored
    • MDEV-24313 fixup: GCC -Wparentheses · 2c226e01
      Marko Mäkelä authored
    • MDEV-24313 (2 of 2): Silently ignored innodb_use_native_aio=1 · f24b7383
      Marko Mäkelä authored
      In commit 5e62b6a5 (MDEV-16264)
      the logic of os_aio_init() was changed so that it will never fail,
      but instead automatically disable innodb_use_native_aio (which is
      enabled by default) if the io_setup() system call would fail due
      to resource limits being exceeded. This is questionable, especially
      because falling back to simulated AIO may lead to significantly
      reduced performance.
      
      srv_n_file_io_threads, srv_n_read_io_threads, srv_n_write_io_threads:
      Change the data type from ulong to uint.
      
      os_aio_init(): Remove the parameters, and actually return an error code.
      
      thread_pool::configure_aio(): Do not silently fall back to simulated AIO.
      
      Reviewed by: Vladislav Vaintroub
    • MDEV-24313 (1 of 2): Hang with innodb_write_io_threads=1 · 17d3f856
      Marko Mäkelä authored
      After commit a5a2ef07 (part of MDEV-23855)
      implemented asynchronous doublewrite, it is possible that the server will
      hang when the following parameters are in effect:
      
          innodb_doublewrite=1 (default)
          innodb_write_io_threads=1
          innodb_use_native_aio=0
      
      Note: In commit 5e62b6a5 (MDEV-16264)
      the logic of os_aio_init() was changed so that it will never fail,
      but instead automatically disable innodb_use_native_aio (which is
      enabled by default) if the io_setup() system call would fail due
      to resource limits being exceeded.
      
      Before commit a5a2ef07, we used
      a synchronous write for the doublewrite buffer batches, always at
      most 64 pages at a time. So, upon completing a doublewrite batch,
      a single thread would submit at most 64 page writes (for the
      individual pages that were first written to the doublewrite buffer).
      With that commit, we may submit up to 128 page writes at a time.
      
      The maximum number of outstanding requests per thread is 256.
      Because the maximum number of asynchronous write submissions per
      thread was roughly doubled, it is now possible that
      buf_dblwr_t::flush_buffered_writes_completed() will hang in
      io_slots::acquire(), called via os_aio() and fil_space_t::io(),
      when submitting writes of the individual blocks.
      
      We will prevent this type of hang by increasing the minimum number
      of innodb_write_io_threads from 1 to 2, so that this type of hang
      would only become possible when 512 outstanding write requests
      are exceeded.
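
      The slot arithmetic behind this change can be written out as a
      small sketch. The constants are the figures quoted above; the
      function names and the "batches that fit" view are a simplified
      model, not the server's actual accounting.

```cpp
#include <cassert>

// Figures quoted in the commit message.
constexpr unsigned SLOTS_PER_THREAD = 256;  // outstanding requests per thread
constexpr unsigned OLD_BATCH = 64;   // page writes per batch, synchronous doublewrite
constexpr unsigned NEW_BATCH = 128;  // page writes per batch, asynchronous doublewrite

// Total write request slots for a given innodb_write_io_threads value.
constexpr unsigned write_slots(unsigned write_io_threads) {
  return write_io_threads * SLOTS_PER_THREAD;
}

// How many full batches of page writes fit into the available slots.
constexpr unsigned batches_that_fit(unsigned write_io_threads, unsigned batch) {
  return write_slots(write_io_threads) / batch;
}

static_assert(write_slots(1) == 256, "old minimum: 256 slots");
static_assert(write_slots(2) == 512, "new minimum: 512 slots");
static_assert(batches_that_fit(1, OLD_BATCH) == 4, "headroom before the change");
static_assert(batches_that_fit(1, NEW_BATCH) == 2, "headroom halved");
static_assert(batches_that_fit(2, NEW_BATCH) == 4, "headroom restored");
```

      In this model, raising the minimum thread count from 1 to 2
      restores the same number of in-flight batches that a single
      thread could hold before the batch size doubled.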
  11. 11 Dec, 2020 1 commit
    • MDEV-24391 heap-use-after-free in fil_space_t::flush_low() · 8677c14e
      Marko Mäkelä authored
      We observed a race condition that involved two threads
      executing fil_flush_file_spaces() and one thread
      executing fil_delete_tablespace(). After the first
      fil_flush_file_spaces() thread observed that
      space.needs_flush_not_stopping() is set and was
      releasing the fil_system.mutex, the second fil_flush_file_spaces()
      thread would complete the execution of fil_space_t::flush_low() on
      the same tablespace. Then, fil_delete_tablespace() would
      destroy the object, because the value of fil_space_t::n_pending
      did not prevent that. Finally, the first fil_flush_file_spaces()
      thread would resume execution and invoke fil_space_t::flush_low()
      on the freed object.
      
      This race condition was introduced in
      commit 118e258a of MDEV-23855.
      
      fil_space_t::flush(): Add a template parameter that indicates
      whether the caller is holding a reference to prevent the
      tablespace from being freed.
      
      buf_dblwr_t::flush_buffered_writes_completed(),
      row_quiesce_table_start(): Acquire a reference for the duration
      of the fil_space_t::flush_low() operation. It should be impossible
      for the object to be freed in these code paths, but we want to
      satisfy the debug assertions.
      
      fil_space_t::flush_low(): Do not increment or decrement the
      reference count, but instead assert that the caller is holding
      a reference.
      
      fil_space_extend_must_retry(), fil_flush_file_spaces():
      Acquire a reference before releasing fil_system.mutex.
      This is what will fix the race condition.
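
      The pattern of acquiring a reference before releasing the mutex
      can be sketched as follows. Space, Registry and their members
      are hypothetical simplifications; the real code uses
      fil_system.mutex and fil_space_t reference counting.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

struct Space {
  std::atomic<int> refs{0};     // references that block deletion
  std::atomic<int> flushes{0};  // counts simulated flush operations
  bool needs_flush = true;      // protected by Registry::mutex
};

// Analogue of flush_low(): does not touch the reference count itself,
// but asserts that the caller is holding a reference.
inline void flush_low(Space &s) {
  assert(s.refs.load() > 0);
  s.flushes++;
}

struct Registry {
  std::mutex mutex;
  std::vector<Space *> spaces;

  // Destroy a space only if nobody holds a reference to it.
  bool try_delete(Space *s) {
    std::lock_guard<std::mutex> guard(mutex);
    if (s->refs.load() != 0) return false;
    spaces.erase(std::remove(spaces.begin(), spaces.end(), s), spaces.end());
    delete s;
    return true;
  }

  void flush_all() {
    std::unique_lock<std::mutex> lock(mutex);
    // Index-based loop: the vector may change while the mutex is released.
    for (std::size_t i = 0; i < spaces.size(); i++) {
      Space *s = spaces[i];
      if (!s->needs_flush) continue;
      s->needs_flush = false;
      s->refs++;      // acquire the reference BEFORE releasing the mutex
      lock.unlock();
      flush_low(*s);  // slow I/O runs without holding the mutex
      lock.lock();
      s->refs--;      // only now may try_delete() succeed
    }
  }
};
```

      If the increment happened after lock.unlock(), a concurrent
      try_delete() could free the object in between, which is the
      kind of use-after-free described above.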