- 30 Oct, 2021 2 commits
-
-
Oleksandr Byelkin authored
-
Oleksandr Byelkin authored
-
- 29 Oct, 2021 17 commits
-
-
sjaakola authored
Mutex order violation when wsrep bf thread kills a conflicting trx, the stack is wsrep_thd_LOCK() wsrep_kill_victim() lock_rec_other_has_conflicting() lock_clust_rec_read_check_and_lock() row_search_mvcc() ha_innobase::index_read() ha_innobase::rnd_pos() handler::ha_rnd_pos() handler::rnd_pos_by_record() handler::ha_rnd_pos_by_record() Rows_log_event::find_row() Update_rows_log_event::do_exec_row() Rows_log_event::do_apply_event() Log_event::apply_event() wsrep_apply_events() and mutexes are taken in the order lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data When a normal KILL statement is executed, the stack is innobase_kill_query() kill_handlerton() plugin_foreach_with_mask() ha_kill_query() THD::awake() kill_one_thread() and mutexes are victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex This patch is the plan D variant for fixing potetial mutex locking order exercised by BF aborting and KILL command execution. In this approach, KILL command is replicated as TOI operation. This guarantees total isolation for the KILL command execution in the first node: there is no concurrent replication applying and no concurrent DDL executing. Therefore there is no risk of BF aborting to happen in parallel with KILL command execution either. Potential mutex deadlocks between the different mutex access paths with KILL command execution and BF aborting cannot therefore happen. TOI replication is used, in this approach, purely as means to provide isolated KILL command execution in the first node. KILL command should not (and must not) be applied in secondary nodes. In this patch, we make this sure by skipping KILL execution in secondary nodes, in applying phase, where we bail out if applier thread is trying to execute KILL command. This is effective, but skipping the applying of KILL command could happen much earlier as well. This also fixed unprotected calls to wsrep_thd_abort that will use wsrep_abort_transaction. This is fixed by holding THD::LOCK_thd_data while we abort transaction. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
-
Jan Lindström authored
Revert "MDEV-23328 Server hang due to Galera lock conflict resolution" This reverts commit eac8341d.
-
Oleksandr Byelkin authored
-
Oleksandr Byelkin authored
-
Marko Mäkelä authored
row_undo_mod_clust_low(): If we are in recovery and rolling back a DELETE operation on the SYS_INDEXES table, and the SYS_INDEXES.NAME starts with the magic byte 0xff that identifies uncommitted ADD INDEX stubs, we must not try to evict the table definition because such index stubs would be skipped by dict_load_indexes() anyway.
-
Marko Mäkelä authored
-
Oleksandr Byelkin authored
-
Oleksandr Byelkin authored
-
Oleksandr Byelkin authored
-
Oleksandr Byelkin authored
-
Marko Mäkelä authored
MDEV-25683 Atomic DDL: With innodb_force_recovery=3 InnoDB: Trying to load index but the index tree has been freed The purpose of the parameter innodb_force_recovery is to allow some data to be dumped from a corrupted database. Its values used to be as follows: innodb_force_recovery=0: normal (default) innodb_force_recovery=1: ignore (skip log for) corrupted pages or missing data files when applying the redo log innodb_force_recovery=2: additionally, disable background tasks (such as the purge of committed undo logs) innodb_force_recovery=3: additionally, disable the rollback of recovered incomplete (not committed or XA PREPARE) transactions innodb_force_recovery=4: same as 3 (since MDEV-19514 in MariaDB 10.5) innodb_force_recovery=5: additionally, do not process any undo log, disallow any writes, and force READ UNCOMMITTED isolation level innodb_force_recovery=6: additionally, pretend that ib_logfile0 does not exist (prevent any recovery). Never use this! The bad thing that happens with innodb_force_recovery=3 and innodb_force_recovery=4 is that also the rollback of any recovered DDL transaction will be skipped. This would break the DDL log recovery that was introduced in MDEV-17567. For one data directory sample, the DDL log recovery would hangs due to a conflict on the InnoDB SYS_TABLES table, because the lock holder transaction was not rolled back due to innodb_force_recovery=3. Fix: Make innodb_force_recovery=3 skip the DML transaction rollback only, and make innodb_force_recovery=4 (renamed to SRV_FORCE_NO_DDL_UNDO) behave like innodb_force_recovery=3 used to (skip the rollback of all recovered transactions, both DML and DDL). Startup with innodb_force_recovery=4 will be unaffected by this change. (There may be hangs, possibly preceded by messages about failing to load an index.) Side note: With innodb_force_recovery=5, any DDL log for InnoDB tables will be essentially ignored by InnoDB, but the server will start up.
-
sjaakola authored
Mutex order violation when wsrep bf thread kills a conflicting trx, the stack is wsrep_thd_LOCK() wsrep_kill_victim() lock_rec_other_has_conflicting() lock_clust_rec_read_check_and_lock() row_search_mvcc() ha_innobase::index_read() ha_innobase::rnd_pos() handler::ha_rnd_pos() handler::rnd_pos_by_record() handler::ha_rnd_pos_by_record() Rows_log_event::find_row() Update_rows_log_event::do_exec_row() Rows_log_event::do_apply_event() Log_event::apply_event() wsrep_apply_events() and mutexes are taken in the order lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data When a normal KILL statement is executed, the stack is innobase_kill_query() kill_handlerton() plugin_foreach_with_mask() ha_kill_query() THD::awake() kill_one_thread() and mutexes are victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex This patch is the plan D variant for fixing potetial mutex locking order exercised by BF aborting and KILL command execution. In this approach, KILL command is replicated as TOI operation. This guarantees total isolation for the KILL command execution in the first node: there is no concurrent replication applying and no concurrent DDL executing. Therefore there is no risk of BF aborting to happen in parallel with KILL command execution either. Potential mutex deadlocks between the different mutex access paths with KILL command execution and BF aborting cannot therefore happen. TOI replication is used, in this approach, purely as means to provide isolated KILL command execution in the first node. KILL command should not (and must not) be applied in secondary nodes. In this patch, we make this sure by skipping KILL execution in secondary nodes, in applying phase, where we bail out if applier thread is trying to execute KILL command. This is effective, but skipping the applying of KILL command could happen much earlier as well. This also fixed unprotected calls to wsrep_thd_abort that will use wsrep_abort_transaction. This is fixed by holding THD::LOCK_thd_data while we abort transaction. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
-
Jan Lindström authored
Revert "MDEV-23328 Server hang due to Galera lock conflict resolution" This reverts commit 29bbcac0.
-
sjaakola authored
Mutex order violation when wsrep bf thread kills a conflicting trx, the stack is wsrep_thd_LOCK() wsrep_kill_victim() lock_rec_other_has_conflicting() lock_clust_rec_read_check_and_lock() row_search_mvcc() ha_innobase::index_read() ha_innobase::rnd_pos() handler::ha_rnd_pos() handler::rnd_pos_by_record() handler::ha_rnd_pos_by_record() Rows_log_event::find_row() Update_rows_log_event::do_exec_row() Rows_log_event::do_apply_event() Log_event::apply_event() wsrep_apply_events() and mutexes are taken in the order lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data When a normal KILL statement is executed, the stack is innobase_kill_query() kill_handlerton() plugin_foreach_with_mask() ha_kill_query() THD::awake() kill_one_thread() and mutexes are victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex This patch is the plan D variant for fixing potetial mutex locking order exercised by BF aborting and KILL command execution. In this approach, KILL command is replicated as TOI operation. This guarantees total isolation for the KILL command execution in the first node: there is no concurrent replication applying and no concurrent DDL executing. Therefore there is no risk of BF aborting to happen in parallel with KILL command execution either. Potential mutex deadlocks between the different mutex access paths with KILL command execution and BF aborting cannot therefore happen. TOI replication is used, in this approach, purely as means to provide isolated KILL command execution in the first node. KILL command should not (and must not) be applied in secondary nodes. In this patch, we make this sure by skipping KILL execution in secondary nodes, in applying phase, where we bail out if applier thread is trying to execute KILL command. This is effective, but skipping the applying of KILL command could happen much earlier as well. This also fixed unprotected calls to wsrep_thd_abort that will use wsrep_abort_transaction. This is fixed by holding THD::LOCK_thd_data while we abort transaction. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
-
Jan Lindström authored
Revert "MDEV-23328 Server hang due to Galera lock conflict resolution" This reverts commit eac8341d.
-
sjaakola authored
Mutex order violation when wsrep bf thread kills a conflicting trx, the stack is wsrep_thd_LOCK() wsrep_kill_victim() lock_rec_other_has_conflicting() lock_clust_rec_read_check_and_lock() row_search_mvcc() ha_innobase::index_read() ha_innobase::rnd_pos() handler::ha_rnd_pos() handler::rnd_pos_by_record() handler::ha_rnd_pos_by_record() Rows_log_event::find_row() Update_rows_log_event::do_exec_row() Rows_log_event::do_apply_event() Log_event::apply_event() wsrep_apply_events() and mutexes are taken in the order lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data When a normal KILL statement is executed, the stack is innobase_kill_query() kill_handlerton() plugin_foreach_with_mask() ha_kill_query() THD::awake() kill_one_thread() and mutexes are victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex This patch is the plan D variant for fixing potetial mutex locking order exercised by BF aborting and KILL command execution. In this approach, KILL command is replicated as TOI operation. This guarantees total isolation for the KILL command execution in the first node: there is no concurrent replication applying and no concurrent DDL executing. Therefore there is no risk of BF aborting to happen in parallel with KILL command execution either. Potential mutex deadlocks between the different mutex access paths with KILL command execution and BF aborting cannot therefore happen. TOI replication is used, in this approach, purely as means to provide isolated KILL command execution in the first node. KILL command should not (and must not) be applied in secondary nodes. In this patch, we make this sure by skipping KILL execution in secondary nodes, in applying phase, where we bail out if applier thread is trying to execute KILL command. This is effective, but skipping the applying of KILL command could happen much earlier as well. This also fixed unprotected calls to wsrep_thd_abort that will use wsrep_abort_transaction. This is fixed by holding THD::LOCK_thd_data while we abort transaction. Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
-
Jan Lindström authored
Revert "MDEV-23328 Server hang due to Galera lock conflict resolution" This reverts commit 29bbcac0.
-
- 28 Oct, 2021 15 commits
-
-
Vladislav Vaintroub authored
Fix by removing the trigger. It does not do anything useful anyway.
-
Oleksandr Byelkin authored
-
Oleksandr Byelkin authored
-
Oleksandr Byelkin authored
-
Oleksandr Byelkin authored
-
Sergei Golubchik authored
for example: sql/sql_prepare.cc:5714:63: error: 'static void Ed_result_set::operator delete(void*, MEM_ROOT*)' called on pointer returned from a mismatched allocation function [-Werror=mismatched-new-delete]
-
Oleksandr Byelkin authored
-
Vladislav Vaintroub authored
Those messages don't indicate errors, they should be normal warnings.
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
The InnoDB changes in MySQL 5.7.36 that were applicable to MariaDB were covered by MDEV-26864, MDEV-26865, MDEV-26866.
-
Nikita Malyavin authored
The initial test case for MySQL Bug #33053297 is based on mysql/mysql-server@27130e25078864b010d81266f9613d389d4a229b. innobase_get_field_from_update_vector is not a suitable function to fetch updated row info, as well as parent table's update vector is not always suitable. For instance, in case of DELETE it contains undefined data. castade->update vector seems to be good enough to fetch all base columns update data, and besides faster, and less error-prone.
-
Julius Goryavsky authored
In the replication-related code, in the exec_relay_log_event() (slave.cc) function, where the "data_lock" mutex is captured, this mutex is then not released on one of the early return branches within a specific insert for WSREP, namely under the branch: "if (wsrep_before_statement(thd))". As a result, the mutex remains captured, resulting in errors or hangs. This commit fixes this issue, which is now showing up as intermittent failures in mtr tests for galera and galera_sr suites.
-
- 27 Oct, 2021 6 commits
-
-
Marko Mäkelä authored
Similar to commit f7684f0c (MDEV-26855) we will try to enable the adaptive spinloop for lock_sys.wait_mutex on ARMv8. Enabling any form of spinloop for lock_sys.wait_mutex did not show a significant improvement in our tests on AMD64. Spinning can be argued to be a hack to reduce the impact on mutex contention. It would be better to adjust the code to reduce contention in the first place.
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
-
Sergei Petrunia authored
ha_rocksdb.h:459:15: warning: 'table_type' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
-
Marko Mäkelä authored
-