1. 11 Nov, 2021 3 commits
  2. 09 Nov, 2021 7 commits
  3. 08 Nov, 2021 4 commits
  4. 05 Nov, 2021 2 commits
    • Merge branch '10.2' into 10.3 · a2f147af
      Oleksandr Byelkin authored
      a2f147af
    • MDEV-26833 Missed statement rollback in case transaction drops or create temporary table · 561b6c7e
      Andrei Elkin authored
      When a transaction creates or drops temporary tables and a later statement
      in it fails, the failed statement's cached ROW-format events for
      transactional tables are still written to the binlog and become visible
      after the transaction commits.
      
      Fixed by properly analysing whether the errored-out statement needs
      to be rolled back in the binlog.
      For instance, the mere fact that previous statements have already cached
      a CREATE or DROP for temporary tables does not cause the events of the
      errored-out statement to be retained in the cache.
      Conversely, if the statement itself creates or drops a temporary table,
      it cannot be rolled back; this rule remains.
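
The rule in this commit message can be sketched as a small decision function. This is only an illustration under stated assumptions: `TrxBinlogCache` and `rollback_failed_statement` are hypothetical names, the cache is modelled as a plain vector of strings, and none of this is the server's actual binlog code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-transaction binlog cache (not a MariaDB class).
struct TrxBinlogCache {
    std::vector<std::string> events;
};

// Decide what happens to the events cached by a statement that errored out.
//   stmt_start              cache size recorded when the statement began
//   stmt_touched_temp_table the failed statement itself created or dropped
//                           a temporary table
// Returns true if the statement's events were removed from the cache.
static bool rollback_failed_statement(TrxBinlogCache &cache,
                                      std::size_t stmt_start,
                                      bool stmt_touched_temp_table) {
    assert(stmt_start <= cache.events.size());
    if (stmt_touched_temp_table) {
        // The statement cannot be rolled back, so its events stay cached.
        return false;
    }
    // CREATE/DROP TEMPORARY TABLE events cached by earlier statements lie
    // before stmt_start and are left alone; only this statement's events go.
    cache.events.resize(stmt_start);
    return true;
}
```

The point of the sketch is that the decision depends only on what the failing statement did, not on what earlier statements of the same transaction left in the cache.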
      561b6c7e
  5. 04 Nov, 2021 1 commit
  6. 03 Nov, 2021 2 commits
  7. 02 Nov, 2021 10 commits
  8. 01 Nov, 2021 2 commits
    • MDEV-23328 Server hang due to Galera lock conflict resolution · ea239034
      Jan Lindström authored
      * Fix error handling NULL-pointer reference
      * Add mtr-suppression on galera_ssl_upgrade
      ea239034
    • MDEV-26949 --debug-gdb installs redundant signal handlers · 026984c3
      Marko Mäkelä authored
      There is a server startup option --gdb a.k.a. --debug-gdb that requests
      signals to be set for more convenient debugging. Most notably, SIGINT
      (ctrl-c) will not be ignored, and you will be able to interrupt the
      execution of the server while GDB is attached to it.
      
      When we are debugging, the signal handlers that would normally display
      a terse stack trace are useless.
      
      When we are debugging with rr, the signal handlers may interfere with
      a SIGKILL that could be sent to the process by the environment, and ruin
      the rr replay trace, due to a Linux kernel bug
      https://lkml.org/lkml/2021/10/31/311
      
      To be able to diagnose bugs in kill+restart tests, we may really need
      both a trace before the SIGKILL and a trace of the failure after a
      subsequent server startup. So, we had better avoid hitting the problem
      by simply not installing those signal handlers.
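
The idea of "simply not installing those signal handlers" can be sketched with POSIX `sigaction()`. The function names, the `debug_gdb` parameter, and the chosen list of fatal signals are assumptions made for illustration; this is not the server's actual startup code.

```cpp
#include <cassert>
#include <signal.h>

// Hypothetical crash handler; a real server would print a terse stack trace
// here before letting the signal take its default action.
static void crash_handler_sketch(int sig) {
    ::signal(sig, SIG_DFL);
    ::raise(sig);
}

// Install crash handlers unless the process runs under a debugger.
// Returns the number of handlers installed.
static int install_crash_handlers(bool debug_gdb) {
    if (debug_gdb) {
        return 0;  // keep default dispositions so GDB or rr sees the raw signal
    }
    // Assumed set of fatal signals, chosen for this sketch only.
    const int crash_signals[] = {SIGSEGV, SIGABRT, SIGBUS, SIGILL, SIGFPE};
    struct sigaction sa = {};
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = crash_handler_sketch;
    int installed = 0;
    for (int sig : crash_signals) {
        if (sigaction(sig, &sa, nullptr) == 0) ++installed;
    }
    return installed;
}

// Report whether the sketch handler is currently installed for `sig`.
static bool has_crash_handler(int sig) {
    assert(sig > 0);
    struct sigaction cur = {};
    if (sigaction(sig, nullptr, &cur) != 0) return false;
    return cur.sa_handler == crash_handler_sketch;
}
```

With `debug_gdb` set, the function returns before touching any signal disposition, so nothing is left in place that could interfere with an externally delivered signal.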
      026984c3
  9. 30 Oct, 2021 2 commits
  10. 29 Oct, 2021 6 commits
    • Merge branch '10.2' into 10.3 · 6953af36
      Oleksandr Byelkin authored
      6953af36
    • MDEV-24901 SIGSEGV in fts_get_table_name, SIGSEGV in ib_vector_size, SIGSEGV... · 059797ed
      Alexander Barkov authored
      MDEV-24901 SIGSEGV in fts_get_table_name, SIGSEGV in ib_vector_size, SIGSEGV in row_merge_fts_doc_tokenize, stack smashing
      
      strmake() puts one extra 0x00 byte at the end of the string.
      The code in my_strnxfrm_tis620[_nopad] did not take this into
      account, so in the reported scenario the 0x00 byte was put outside
      of a stack variable, which made ASAN crash.
      
      This problem is already fixed in MySQL:
      
        commit 19bd66fe43c41f0bde5f36bc6b455a46693069fb
        Author: bin.x.su@oracle.com <>
        Date:   Fri Apr 4 11:35:27 2014 +0800
      
      But that fix does not seem to be correct, as it stops when it finds a zero byte
      in the source string.
      
      This patch uses memcpy() instead of strmake().
      
      - Unlike strmake(), memcpy() does not write beyond the destination
        size passed.
      - Unlike the MySQL fix, memcpy() does not break on the first 0x00 byte found
        in the source string.
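
The difference between the two copy routines can be made visible with a guard byte placed just past the destination area. `strmake_like` below is a hypothetical re-implementation of the behaviour described in this message (copy at most `length` bytes, stop at the first 0x00 in the source, always append a terminating 0x00); it is not the library function itself, and `copy_with_guard` is likewise an illustrative helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical strmake()-style copy: may write a terminator to dst[length].
static char *strmake_like(char *dst, const char *src, std::size_t length) {
    while (length-- > 0) {
        if ((*dst++ = *src++) == '\0') return dst - 1;  // source ended early
    }
    *dst = '\0';  // one byte past the `length` bytes requested
    return dst;
}

struct CopyResult {
    unsigned char guard;      // byte just past the len-byte destination area
    unsigned char last_byte;  // last byte inside the destination area
};

// Copy `len` bytes into an area of logical size `len`. The buffer carries one
// extra guard byte so that the overrun is observable without undefined behaviour.
static CopyResult copy_with_guard(const char *src, std::size_t len,
                                  bool use_memcpy) {
    char buf[16 + 1];
    assert(len > 0 && len <= 16);
    std::memset(buf, 0xAA, sizeof buf);  // fill with a recognisable pattern
    if (use_memcpy) {
        std::memcpy(buf, src, len);      // writes exactly len bytes
    } else {
        strmake_like(buf, src, len);     // may also write buf[len]
    }
    return {static_cast<unsigned char>(buf[len]),
            static_cast<unsigned char>(buf[len - 1])};
}
```

If the guard byte is still 0xAA after the copy, nothing was written outside the destination area. A source containing an embedded 0x00 shows the second property: `memcpy()` copies the bytes after it, while the string-style copy stops there.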
      059797ed
    • MDEV-23328 Server hang due to Galera lock conflict resolution · 157b3a63
      sjaakola authored
      A mutex order violation occurs when the wsrep BF thread kills a conflicting trx;
      the stack is
      
                wsrep_thd_LOCK()
                wsrep_kill_victim()
                lock_rec_other_has_conflicting()
                lock_clust_rec_read_check_and_lock()
                row_search_mvcc()
                ha_innobase::index_read()
                ha_innobase::rnd_pos()
                handler::ha_rnd_pos()
                handler::rnd_pos_by_record()
                handler::ha_rnd_pos_by_record()
                Rows_log_event::find_row()
                Update_rows_log_event::do_exec_row()
                Rows_log_event::do_apply_event()
                Log_event::apply_event()
                wsrep_apply_events()
      
      and mutexes are taken in the order
      
                lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data
      
      When a normal KILL statement is executed, the stack is
      
                innobase_kill_query()
                kill_handlerton()
                plugin_foreach_with_mask()
                ha_kill_query()
                THD::awake()
                kill_one_thread()
      
      and mutexes are taken in the order
      
                victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
      
      This patch is the plan D variant for fixing the potential mutex locking
      order violation exercised by BF aborting and KILL command execution.
      
      In this approach, KILL command is replicated as TOI operation.
      This guarantees total isolation for the KILL command execution
      in the first node: there is no concurrent replication applying
      and no concurrent DDL executing. Therefore there is no risk of
      BF aborting to happen in parallel with KILL command execution
      either. Potential mutex deadlocks between the different mutex
      access paths with KILL command execution and BF aborting cannot
      therefore happen.
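
The two lock orders listed above, and the effect of running the two paths in isolation from each other, can be sketched with standard C++ mutexes. This is only an analogy: the patch achieves isolation through TOI replication, whereas the sketch uses a local `isolation_gate` mutex, and `LockSketch`, `bf_abort_path`, `kill_path` and `run_both` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

struct LockSketch {
    std::mutex lock_sys_mutex;   // stands in for lock_sys->mutex
    std::mutex trx_mutex;        // stands in for victim_trx->mutex
    std::mutex thd_data_mutex;   // stands in for victim_thread->LOCK_thd_data
    std::mutex isolation_gate;   // hypothetical stand-in for TOI isolation
    int bf_aborts = 0;           // updated with all three mutexes held
    int kills = 0;               // updated with all three mutexes held
};

// BF abort order: lock_sys -> trx -> LOCK_thd_data.
static void bf_abort_path(LockSketch &s) {
    std::lock_guard<std::mutex> gate(s.isolation_gate);
    std::lock_guard<std::mutex> a(s.lock_sys_mutex);
    std::lock_guard<std::mutex> b(s.trx_mutex);
    std::lock_guard<std::mutex> c(s.thd_data_mutex);
    ++s.bf_aborts;
}

// KILL order: LOCK_thd_data -> lock_sys -> trx (the opposite direction).
static void kill_path(LockSketch &s) {
    std::lock_guard<std::mutex> gate(s.isolation_gate);
    std::lock_guard<std::mutex> a(s.thd_data_mutex);
    std::lock_guard<std::mutex> b(s.lock_sys_mutex);
    std::lock_guard<std::mutex> c(s.trx_mutex);
    ++s.kills;
}

// Run both paths concurrently. Because each path holds the gate for its whole
// critical section, the conflicting lock orders never overlap in time.
static void run_both(LockSketch &s, int rounds) {
    assert(rounds >= 0);
    std::thread applier([&s, rounds] {
        for (int i = 0; i < rounds; ++i) bf_abort_path(s);
    });
    std::thread killer([&s, rounds] {
        for (int i = 0; i < rounds; ++i) kill_path(s);
    });
    applier.join();
    killer.join();
}
```

Without the gate, one thread could hold `lock_sys_mutex` while waiting for `thd_data_mutex` and the other could hold `thd_data_mutex` while waiting for `lock_sys_mutex`, which is the deadlock pattern the commit describes.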
      
      In this approach, TOI replication is used purely as a means
      to provide isolated KILL command execution in the first node.
      The KILL command should not (and must not) be applied in secondary
      nodes. This patch ensures that by skipping KILL
      execution in secondary nodes during the applying phase, where we
      bail out if an applier thread is trying to execute a KILL command.
      This is effective, but the applying of the KILL command
      could be skipped much earlier as well.
      
      This also fixes unprotected calls to wsrep_thd_abort,
      which uses wsrep_abort_transaction, by holding
      THD::LOCK_thd_data while the transaction is aborted.
      Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
      157b3a63
    • MDEV-25114: Crash: WSREP: invalid state ROLLED_BACK (FATAL) · 30337add
      Jan Lindström authored
      Revert "MDEV-23328 Server hang due to Galera lock conflict resolution"
      
      This reverts commit 29bbcac0.
      30337add
    • MDEV-23328 Server hang due to Galera lock conflict resolution · db50ea3a
      sjaakola authored
      A mutex order violation occurs when the wsrep BF thread kills a conflicting trx;
      the stack is
      
                wsrep_thd_LOCK()
                wsrep_kill_victim()
                lock_rec_other_has_conflicting()
                lock_clust_rec_read_check_and_lock()
                row_search_mvcc()
                ha_innobase::index_read()
                ha_innobase::rnd_pos()
                handler::ha_rnd_pos()
                handler::rnd_pos_by_record()
                handler::ha_rnd_pos_by_record()
                Rows_log_event::find_row()
                Update_rows_log_event::do_exec_row()
                Rows_log_event::do_apply_event()
                Log_event::apply_event()
                wsrep_apply_events()
      
      and mutexes are taken in the order
      
                lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data
      
      When a normal KILL statement is executed, the stack is
      
                innobase_kill_query()
                kill_handlerton()
                plugin_foreach_with_mask()
                ha_kill_query()
                THD::awake()
                kill_one_thread()
      
      and mutexes are taken in the order
      
                victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
      
      This patch is the plan D variant for fixing the potential mutex locking
      order violation exercised by BF aborting and KILL command execution.
      
      In this approach, KILL command is replicated as TOI operation.
      This guarantees total isolation for the KILL command execution
      in the first node: there is no concurrent replication applying
      and no concurrent DDL executing. Therefore there is no risk of
      BF aborting to happen in parallel with KILL command execution
      either. Potential mutex deadlocks between the different mutex
      access paths with KILL command execution and BF aborting cannot
      therefore happen.
      
      In this approach, TOI replication is used purely as a means
      to provide isolated KILL command execution in the first node.
      The KILL command should not (and must not) be applied in secondary
      nodes. This patch ensures that by skipping KILL
      execution in secondary nodes during the applying phase, where we
      bail out if an applier thread is trying to execute a KILL command.
      This is effective, but the applying of the KILL command
      could be skipped much earlier as well.
      
      This also fixes unprotected calls to wsrep_thd_abort,
      which uses wsrep_abort_transaction, by holding
      THD::LOCK_thd_data while the transaction is aborted.
      Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
      db50ea3a
    • MDEV-25114: Crash: WSREP: invalid state ROLLED_BACK (FATAL) · c8b39f7e
      Jan Lindström authored
      Revert "MDEV-23328 Server hang due to Galera lock conflict resolution"
      
      This reverts commit 29bbcac0.
      c8b39f7e
  11. 28 Oct, 2021 1 commit
    • MDEV-26833 Missed statement rollback in case transaction drops or create temporary table · 42ae7659
      Andrei Elkin authored
      When a transaction creates or drops temporary tables and a later statement
      in it fails, the failed statement's cached ROW-format events for
      transactional tables are still written to the binlog and become visible
      after the transaction commits.
      
      Fixed by properly analysing whether the errored-out statement needs
      to be rolled back in the binlog.
      For instance, the mere fact that previous statements have already cached
      a CREATE or DROP for temporary tables does not cause the events of the
      errored-out statement to be retained in the cache.
      Conversely, if the statement itself creates or drops a temporary table,
      it cannot be rolled back; this rule remains.
      42ae7659