1. 27 Jan, 2022 1 commit
      MDEV-27026 innodb_fts.concurrent_insert failed · 48b974b2
      Marko Mäkelä authored
      Most of this was likely already fixed by MDEV-27017.
      On one implementation of the AMD64 ISA, the test
      innodb_fts.concurrent_insert would still occasionally hang,
      with both dict_sys_t::evict_table_LRU() and
      dict_index_set_merge_threshold() waiting in dict_sys_t::lock(),
      several threads waiting in dict_sys_t::freeze(),
      no thread holding exclusive dict_sys.latch and also no thread
      in the stack traces apparently holding any dict_sys.latch,
      even though dict_sys.latch_readers == 1.
      
      To prevent this scenario, we will remove the dict_sys.latch
      acquisition from dict_index_set_merge_threshold(). It is actually
      not needed, because dict_sys.sys_indexes will not change after
      InnoDB startup. The SYS_INDEXES leaf page will be sufficiently
      protected by the page latch.
      
      There potentially is a bug in the srw_lock implementation,
      which will have to be investigated further.
      48b974b2
  2. 26 Jan, 2022 1 commit
  3. 20 Jan, 2022 2 commits
  4. 19 Jan, 2022 3 commits
      MDEV-27499 fixup: Add a wait to buf_flush_sync() · 764ca7e6
      Marko Mäkelä authored
      The test innodb.log_file_size would occasionally fail with
      an assertion failure !buf_pool.any_io_pending(). Let us wait
      for the page cleaner thread to become idle already in
      srv_prepare_to_delete_redo_log_file(), like we used to.
      764ca7e6
      MDEV-27382: OFFSET is ignored when combined with DISTINCT · 7259b299
      Sergei Petrunia authored
      A query of the form
      
        SELECT DISTINCT expr_that_is_inferred_to_be_const LIMIT 0 OFFSET n
      
      produces one row when it should produce none. The issue was in
      JOIN_TAB::remove_duplicates() in the piece of logic that tried to
      avoid duplicate removal for such cases but didn't account for possible
      "LIMIT 0".
      
      Fixed by making Select_limit_counters::set_limit() change OFFSET to 0
      when LIMIT is 0.
      7259b299
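The normalization described in MDEV-27382 can be sketched as follows. This is a simplified stand-in, not the server's actual class: `ToyLimitCounters`, its member names, and the saturating addition are assumptions made for illustration. The only behavior taken from the commit message is that OFFSET is changed to 0 when LIMIT is 0.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical, simplified stand-in for the server's limit counters.
class ToyLimitCounters {
  uint64_t stop_after_cnt = 0;  // stop after this many rows (offset + limit)
  uint64_t skip_cnt = 0;        // rows to skip before returning any

public:
  void set_limit(uint64_t limit, uint64_t offset) {
    if (limit == 0)
      offset = 0;  // LIMIT 0 returns nothing, so there is nothing to skip
    skip_cnt = offset;
    // Saturate instead of wrapping around on overflow (an assumption).
    const uint64_t max = std::numeric_limits<uint64_t>::max();
    stop_after_cnt = (limit > max - offset) ? max : limit + offset;
  }

  uint64_t stop_after() const { return stop_after_cnt; }
  uint64_t skip() const { return skip_cnt; }
};
```

With this normalization, a caller that only checks whether the stop-after count is zero would also recognize `LIMIT 0 OFFSET n` as a zero-row request.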
      MDEV-27025: Null merge 10.5 into 10.6 · 965c0d22
      Marko Mäkelä authored
      965c0d22
  5. 18 Jan, 2022 4 commits
      MDEV-27025 insert-intention lock conflicts with waiting ORDINARY lock · be811386
      Vlad Lesin authored
      The code was backported from the 10.6 commit bd03c0e5.
      See that commit message for details.
      
      Apart from the above commit, trx_lock_t::wait_trx was also backported from
      MDEV-24738. trx_lock_t::wait_trx is protected with lock_sys.wait_mutex
      in 10.6, but that mutex was implemented only in MDEV-24789. As there is no
      need to backport MDEV-24789 for MDEV-27025,
      trx_lock_t::wait_trx is protected with the same mutexes as
      trx_lock_t::wait_lock.
      
      This fix should not break innodb-lock-schedule-algorithm=VATS. This
      algorithm uses an Eldest-Transaction-First (ETF) heuristic, which prefers
      older transactions over new ones. In this fix we merely insert the granted
      lock just before the last granted lock of the same transaction, which does
      not change the transaction execution order.
      
      The changes in lock_rec_create_low() should not break Galera Cluster:
      there is a big "if" branch for WSREP. This branch is necessary to provide
      the correct transaction execution order, and should not be changed for
      the current bug fix.
      be811386
      MDEV-27025 insert-intention lock conflicts with waiting ORDINARY lock · bd03c0e5
      Vlad Lesin authored
      When a lock is checked for conflicts, ignore other locks on the record if
      they wait for the requesting transaction.
      
      lock_rec_has_to_wait_in_queue() does not iterate over all locks for
      the page, but only over the locks located before the waiting lock in the
      queue. So there is an invariant: any lock in the queue can wait only for
      a lock that is located before it in the queue.
      
      In the case when a conflicting lock waits for the transaction of the
      requesting lock, we need to place the requesting lock before the waiting
      lock in the queue to preserve the invariant. That is why we look for
      the first lock that waits for the requesting transaction, and place the new
      lock just after the last granted lock of the requesting transaction that
      precedes that first waiting lock.
      
      Example:
      
      trx1 waiting lock, trx1 granted lock, ..., trx2 lock - waiting for trx1
      place new lock here -----------------^
      
      There are also implicit locks which are lazily converted to explicit
      ones, and we need to place the newly created explicit lock in the correct
      place in the queue. All explicit locks converted from implicit ones are
      placed just after the last non-waiting lock of the same transaction that
      precedes the first lock waiting for that transaction.
      
      Code review and cleanup were done by Marko Mäkelä.
      bd03c0e5
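The placement rule from MDEV-27025 can be illustrated with a small model of a record lock queue. This is only a sketch under assumptions: `ToyLock`, `insert_position()`, and the fallback of inserting directly before the waiter when the requesting transaction has no earlier granted lock are hypothetical and do not mirror InnoDB's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one lock in a record lock queue.
struct ToyLock {
  int trx;       // owning transaction id
  bool waiting;  // true if the lock has not been granted yet
  int wait_trx;  // transaction this lock waits for, or 0 if none
};

// Returns the index at which a new lock of `trx` should be inserted.
std::size_t insert_position(const std::vector<ToyLock>& queue, int trx) {
  // Find the first lock that waits for the requesting transaction.
  std::size_t first_waiter = queue.size();
  for (std::size_t i = 0; i < queue.size(); i++) {
    if (queue[i].waiting && queue[i].wait_trx == trx) {
      first_waiter = i;
      break;
    }
  }
  if (first_waiter == queue.size())
    return queue.size();  // nobody waits for us: append as usual

  // Place the new lock just after the last granted lock of `trx`
  // that precedes the first waiter (assumed fallback: right before it).
  std::size_t pos = first_waiter;
  for (std::size_t i = 0; i < first_waiter; i++)
    if (queue[i].trx == trx && !queue[i].waiting)
      pos = i + 1;
  return pos;
}
```

For a queue shaped like the example in the commit message (a waiting trx1 lock, a granted trx1 lock, an unrelated lock, then a trx2 lock waiting for trx1), the function returns the slot directly after the granted trx1 lock, so the lock that waits for trx1 still has everything it waits for in front of it.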
      Merge 10.5 into 10.6 · 1abc476f
      Marko Mäkelä authored
      1abc476f
      MDEV-27499 Performance regression in log_checkpoint_margin() · e44439ab
      Marko Mäkelä authored
      In commit 4c3ad244 (MDEV-27416)
      an unnecessarily strict wait condition was introduced in the
      function buf_flush_wait(). Most callers actually only care that
      the pages have been flushed, not that a checkpoint has completed.
      
      Only in the buf_flush_sync() call for log resizing, we might care
      about the log checkpoint. But, in fact,
      srv_prepare_to_delete_redo_log_file() is explicitly disabling
      checkpoints. So, we can simply remove the unnecessary wait loop.
      
      Thanks to Krunal Bauskar for reporting this performance regression
      that we failed to repeat in our testing.
      e44439ab
  6. 17 Jan, 2022 4 commits
  7. 15 Jan, 2022 3 commits
  8. 14 Jan, 2022 3 commits
      Remove FIXME comments that refer to an early MDEV-14425 plan · 8535c260
      Marko Mäkelä authored
      In MDEV-14425, an early plan was to introduce a separate log file
      for file-level records and checkpoint information. The reasoning was
      that fil_system.mutex contention would be reduced by not having to
      maintain fil_system.named_spaces. The mutex contention was actually
      fixed in MDEV-23855 by making some data fields in fil_space_t and
      fil_node_t use std::atomic.
      
      Using a single circular log file simplifies recovery and backup.
      8535c260
      Merge 10.5 into 10.6 · 16b87f98
      Marko Mäkelä authored
      16b87f98
      MDEV-27500 buf_page_free() fails to drop the adaptive hash index · c104a01b
      Marko Mäkelä authored
      The function buf_page_free() that was introduced
      in commit a35b4ae8 (MDEV-15528)
      failed to remove any adaptive hash index entries for the page
      before freeing the page.
      
      This caused an assertion failure on shutdown of the 10.6 server
      in the function buf_pool_t::clear_hash_index() with the expression:
      (s >= buf_page_t::UNFIXED || s == buf_page_t::REMOVE_HASH).
      The assertion would fail for a block that is in the freed state.
      
      The failing assertion was added in
      commit aaef2e1d
      in the 10.6 branch.
      
      Thanks to Matthias Leich for finding the bug and testing the fix.
      c104a01b
  9. 13 Jan, 2022 1 commit
      MDEV-27058 fixup: Bogus assertion !block->page.is_io_fixed() · e6a06113
      Marko Mäkelä authored
      buf_page_get_gen(): After recv_sys_t::recover_low() returned,
      the page must not be read-fixed, but it may be write-fixed,
      because the io-fix state is protected by block->page.lock,
      which we are not holding yet.
      
      Also, let us copy the block descriptor state to a local variable
      for examination, so that in case an assertion would fail again,
      we will have the sampled state in the core dump. In a core dump of
      the assertion failure, we had block->page.fix() == buf_page_t::UNFIXED,
      that is, the assertion expression was holding again.
      e6a06113
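The debugging technique mentioned in the MDEV-27058 fixup, sampling a concurrently changing state once into a local variable before asserting on it, might look like the sketch below. The state constants and the `ToyPage` type are hypothetical and do not reflect InnoDB's real state encoding.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical state encoding, not InnoDB's real values.
constexpr uint32_t TOY_UNFIXED = 1;
constexpr uint32_t TOY_READ_FIX = 2;
constexpr uint32_t TOY_WRITE_FIX = 3;

struct ToyPage {
  std::atomic<uint32_t> state{TOY_UNFIXED};
};

// Samples the state exactly once, asserts on the sample, and returns it.
uint32_t sample_and_check(const ToyPage& page) {
  // The local copy is the value the assertion actually saw. If the
  // assertion fails, a debugger can inspect `state` in the core dump even
  // if another thread has changed page.state in the meantime.
  const uint32_t state = page.state.load(std::memory_order_relaxed);
  assert(state != TOY_READ_FIX);  // write-fixed is allowed here
  return state;
}
```

Asserting directly on `page.state` would instead leave only the current value in the dump, which may already satisfy the assertion again by the time the process aborts.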
  10. 12 Jan, 2022 4 commits
  11. 11 Jan, 2022 1 commit
      MDEV-27022 Buffer pool is being flushed during recovery · f443cd11
      Eugene Kosov authored
      The problem was introduced by the removal of buf_pool.flush_rbt
      in commit 46b1f500 (MDEV-23399).
      
      recv_sys_t::apply(): don't write to disk and fsync() the last batch.
      Instead, sort it by oldest_modification for MariaDB server and some
      mariabackup operations.
      
      log_sort_flush_list(): a thread-safe function which sorts buf_pool::flush_list
      f443cd11
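A thread-safe sort of a flush list by oldest_modification, as described for log_sort_flush_list() in MDEV-27022, could be sketched like this. `ToyPage`, `ToyBufPool`, and the use of `std::list` with a single mutex are assumptions for illustration. The descending order (newest modification first) is also an assumption about the intended ordering.

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <mutex>

// Hypothetical page descriptor: only the fields needed for sorting.
struct ToyPage {
  uint32_t page_no;
  uint64_t oldest_modification;  // LSN of the first unflushed change
};

// Hypothetical buffer pool holding a mutex-protected flush list.
struct ToyBufPool {
  std::mutex flush_list_mutex;
  std::list<ToyPage> flush_list;

  // Sorts the list while holding the mutex, so that concurrent users
  // of the list never observe a partially reordered state.
  void sort_flush_list() {
    std::lock_guard<std::mutex> guard(flush_list_mutex);
    flush_list.sort([](const ToyPage& a, const ToyPage& b) {
      return a.oldest_modification > b.oldest_modification;
    });
  }
};
```

Sorting in memory lets the last recovery batch stay in the buffer pool in a consistent order, instead of being written out and synced just to restore that order.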
  12. 10 Jan, 2022 3 commits
      MDEV-27640 trx_has_lock_x() gives wrong result if the table has pending table lock · 428b057e
      Thirunarayanan Balathandayuthapani authored
      trx_has_lock_x() fails to find whether the trx has X-lock on the table
      when other transactions are waiting for an X or S lock on the table.
      428b057e
      Cleanup: Remove unused log_cmdq_key · fcbd3989
      Marko Mäkelä authored
      There was an intention to add a CommandQueue in
      mysql/mysql-server@eca5b0fc17a5bd6d4833d35a0d08c8549dd3b5ec
      but it never appeared in any release (not even MySQL 5.7.3
      where that commit appeared).
      fcbd3989
      MDEV-23836: Assertion `! is_set() || m_can_overwrite_status' in · 81e00485
      Rucha Deodhar authored
      Diagnostics_area::set_error_status (interrupted ALTER TABLE under LOCK)
      
      Analysis: KILL_QUERY is not ignored when the local memory used exceeds the
      maximum session memory. Hence the query proceeds, OK is sent, and we end up
      reopening tables that are marked for reopen. During this, the kill status is
      eventually checked, and an assertion failure happens while trying to send an
      error message, because OK has already been sent.
      Fix: OK has already been sent, so the statement has already executed. It is
      too late to report an error, so ignore the kill.
      81e00485
  13. 09 Jan, 2022 2 commits
  14. 05 Jan, 2022 5 commits
  15. 04 Jan, 2022 3 commits
      Merge 10.5 into 10.6 · 3f572676
      Marko Mäkelä authored
      3f572676
      MDEV-27416 InnoDB hang in buf_flush_wait_flushed(), on log checkpoint · 4c3ad244
      Marko Mäkelä authored
      InnoDB could sometimes hang when triggering a log checkpoint. This is
      due to commit 7b1252c0 (MDEV-24278),
      which introduced an untimed wait to buf_flush_page_cleaner().
      
      The hang was noticed by occasional failures of IMPORT TABLESPACE tests,
      such as innodb.innodb-wl5522, which would (unnecessarily) invoke
      log_make_checkpoint() from row_import_cleanup().
      
      The reason for the hang was that buf_flush_page_cleaner() would enter
      untimed sleep despite buf_flush_sync_lsn being set. The exact failure
      scenario is unclear, because buf_flush_sync_lsn should actually be
      protected by buf_pool.flush_list_mutex. We prevent the hang by
      invoking buf_pool.page_cleaner_set_idle(false) whenever we are
      setting buf_flush_sync_lsn and signaling buf_pool.do_flush_list.
      
      The bulk of these changes was originally developed as a preparation
      for MDEV-26827, to invoke buf_flush_list() from fewer threads,
      and tested on 10.6 by Matthias Leich.
      
      This fix was tested by running 100 repetitions of 100 concurrent instances
      of the test innodb.innodb-wl5522 on a RelWithDebInfo build, using ext4fs
      and innodb_flush_method=O_DIRECT on a SATA SSD with 4096-byte block size.
      During the test, the call to log_make_checkpoint() in row_import_cleanup()
      was present.
      
      buf_flush_list(): Make static.
      
      buf_flush_wait(): Wait for buf_pool.get_oldest_modification()
      to reach a target, by work done in the buf_flush_page_cleaner.
      If buf_flush_sync_lsn is going to be set, we will invoke
      buf_pool.page_cleaner_set_idle(false).
      
      buf_flush_ahead(): If buf_flush_sync_lsn or buf_flush_async_lsn
      is going to be set and the page cleaner woken up, we will invoke
      buf_pool.page_cleaner_set_idle(false).
      
      buf_flush_wait_flushed(): Invoke buf_flush_wait().
      
      buf_flush_sync(): Invoke recv_sys.apply() at the start in case
      crash recovery is active. Invoke buf_flush_wait().
      
      buf_flush_sync_batch(): A lower-level variant of buf_flush_sync()
      that is only called by recv_sys_t::apply().
      
      buf_flush_sync_for_checkpoint(): Do not trigger log apply
      or checkpoint during recovery.
      
      buf_dblwr_t::create(): Only initiate a buffer pool flush, not
      a checkpoint.
      
      row_import_cleanup(): Do not unnecessarily invoke log_make_checkpoint().
      Invoking buf_flush_list_space() before starting to generate redo log
      for the imported tablespace should suffice.
      
      srv_prepare_to_delete_redo_log_file():
      Set recv_sys.recovery_on in order to prevent
      buf_flush_sync_for_checkpoint() from initiating a checkpoint
      while the log is inaccessible. Remove a wait loop that is already
      part of buf_flush_sync().
      Do not invoke fil_names_clear() if the log is being upgraded,
      because the FILE_MODIFY record is specific to the latest format.
      
      create_log_file(): Clear recv_sys.recovery_on only after calling
      log_make_checkpoint(), to prevent buf_flush_page_cleaner from
      invoking a checkpoint.
      
      innodb_shutdown(): Simplify the logic in mariadb-backup --prepare.
      
      os_aio_wait_until_no_pending_writes(): Update the function comment.
      Apart from row_quiesce_table_start() during FLUSH TABLES...FOR EXPORT,
      this is being called by buf_flush_list_space(), which is invoked
      by ALTER TABLE...IMPORT TABLESPACE as well as some encryption operations.
      4c3ad244