- 07 May, 2024 1 commit
-
-
Kristian Nielsen authored
-
- 06 May, 2024 1 commit
-
-
Yuchen Pei authored
Lacking insight into the cause of the spider.spider_fixes_part failure described in MDEV-30929, this is a workaround that could help determine whether the slave SQL thread attempts to read from a file that may not yet be on disk. It does not otherwise affect the coverage of the test. I have pushed this commit 4 times and have yet to encounter the failure described in MDEV-30929, so it could also fix the test and stop the CI pollution. Also replaced START SLAVE; with --source include/start_slave.inc inside the slave_test_init.inc files.
-
- 05 May, 2024 2 commits
-
-
Kristian Nielsen authored
MDEV-34042: Deadlock kill of XA PREPARE can break replication / rpl.rpl_parallel_multi_domain_xa sporadic failure Refinement of the original patch. Move the code to reset the kill up into the parent class Xid_apply_log_event, to also fix the similar issue for XA COMMIT. Increase the number of slave retries in the test case rpl.rpl_parallel_multi_domain_xa to fix some sporadic failures. The test generates massive amounts of conflicting transactions in multiple independent domains, which can cause multiple rollback+retry for a transaction as it conflicts with transactions in other domains one-by-one. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
-
Kristian Nielsen authored
Don't deadlock kill event groups in other domains if they are not SPECULATE_OPTIMISTIC. Such event groups may not be able to safely roll back and retry (e.g. DDL). But do deadlock kill a transaction T2 from a blocked transaction U in another domain, even if T2 has a lower sub_id than U. Otherwise, in case of a cycle T2->T1->U->T2, we might not break the cycle if U is not SPECULATE_OPTIMISTIC. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
-
- 02 May, 2024 5 commits
-
-
Sergei Golubchik authored
CURRENT_TEST: binlog_encryption.rpl_parallel_gco_wait_kill
mysqltest: In included file "./suite/rpl/t/rpl_parallel_gco_wait_kill.test":
included from /home/buildbot/amd64-ubuntu-2004-debug/build/mysql-test/suite/binlog_encryption/rpl_parallel_gco_wait_kill.test at line 2:
At line 334: Can't initialize replace from 'replace_result $thd_id THD_ID'

An sql thread can reach the "Slave has read all relay log" state and then start reading relay log again. Let's use a more generic pattern to retrieve the sql thread ID even if it's not in the "read all relay log" state.
-
Kristian Nielsen authored
MDEV-34042: Deadlock kill of XA PREPARE can break replication / rpl.rpl_parallel_multi_domain_xa sporadic failure Clear any pending deadlock kill after completing XA PREPARE, and before updating the mysql.gtid_slave_pos table in a separate transaction. Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
-
Kristian Nielsen authored
One case is conflicting transactions T1 and T2 with different domain ids, in optimistic parallel replication in non-GTID mode. Then T2 will wait_for_prior_commit on T1; and if T1 got a row lock wait on T2 it would hang, as the different domains caused the deadlock kill to be skipped in thd_rpl_deadlock_check(). More generally, if we have transactions T1 and T2 in one domain/master connection, and independent transactions U in another, then we can still deadlock like this:

  T1 row lock wait on U
  U row lock wait on T2
  T2 wait_for_prior_commit on T1

This commit enforces the deadlock kill in these cases. If the waited-for transaction is speculatively applied, then it will be deadlock killed in case of a conflict, even if the two transactions are in different domains or master connections. Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
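The three-way wait described above can be illustrated by following "waits for" edges until a transaction is revisited. This is only a sketch of the cycle itself: `has_wait_cycle` and the map layout are hypothetical and are not how thd_rpl_deadlock_check() is implemented.

```cpp
#include <map>
#include <set>
#include <string>

// waits_for[a] == b means: transaction a is blocked until b makes progress.
// Hypothetical helper for illustration; not part of the server code.
bool has_wait_cycle(const std::map<std::string, std::string>& waits_for,
                    const std::string& start)
{
  std::set<std::string> seen;
  std::string cur = start;
  // Follow the chain of waits; insert() reports false on a revisit.
  while (seen.insert(cur).second)
  {
    auto it = waits_for.find(cur);
    if (it == waits_for.end())
      return false;  // cur is not blocked, so the chain ends here
    cur = it->second;
  }
  return true;  // a transaction was reached twice: the chain loops
}
```

With the edges T1 -> U, U -> T2 and T2 -> T1 the function reports a cycle; removing any one edge (for example by deadlock killing a speculatively applied transaction) makes it report none.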
-
mariadb-DebarunBanerjee authored
Issue: When getting a page (buf_page_get_gen) with no latch option (RW_NO_LATCH), the caller is not expected to follow the B-tree latching order. However, in buf_page_get_low we unconditionally try to acquire a shared page latch to wait for a page that is being loaded concurrently by another thread. In general this can lead to a latch order violation and deadlock. Currently it affects the change buffer insert path: btr_latch_prev() tries to load the previous page out of order with RW_NO_LATCH, and two concurrent inserts into the IBUF tree cause a deadlock. This problem was introduced in 10.6 by commit 9436c778 (MDEV-27058). Fix: While trying to latch a page with RW_NO_LATCH, always use the "*lock_try" interface and retry the operation on failure after unfixing the page.
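A minimal sketch of the "try, and on failure unfix and retry" pattern named in the fix. Everything here is an assumption made for illustration: `page_sketch`, `fix_count` and `get_page_no_latch` are hypothetical names, and the page latch is modeled as a plain std::mutex rather than InnoDB's read-write latch.

```cpp
#include <atomic>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a buffer page; not the InnoDB buf_page_t.
struct page_sketch
{
  std::mutex latch;               // simplified: the real latch is shared/exclusive
  std::atomic<int> fix_count{0};  // number of buffer-fixes held on the page
};

// Returns true with the page buffer-fixed but unlatched, or false after
// max_attempts failed tries. Never blocks on the latch, so it cannot take
// part in a latch-order deadlock.
bool get_page_no_latch(page_sketch& page, int max_attempts)
{
  for (int i = 0; i < max_attempts; ++i)
  {
    page.fix_count.fetch_add(1);   // buffer-fix the page
    if (page.latch.try_lock())     // non-blocking attempt only
    {
      page.latch.unlock();         // no-latch mode: keep just the fix
      return true;
    }
    page.fix_count.fetch_sub(1);   // unfix before retrying
    std::this_thread::yield();
  }
  return false;
}
```

The point of the design is that a failed attempt releases everything it acquired before trying again, so a caller that does not follow the latching order never waits while holding a resource.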
-
Sergei Golubchik authored
wait for all connections to disconnect before the cleanup
-
- 30 Apr, 2024 8 commits
-
-
Sergei Golubchik authored
-
Thirunarayanan Balathandayuthapani authored
Problem:
=======
During an InnoDB non-rebuild online ALTER operation, InnoDB sets a dummy log as the online log of the clustered index. Concurrent DML can use this to identify whether the table is undergoing online DDL. InnoDB fails to reset the dummy log of the clustered index if an error happens during the prepare phase.

Solution:
========
Reset the InnoDB clustered index online log in case of an error during the prepare phase.
-
Sergei Golubchik authored
they aren't robust enough and can easily apply incorrectly (this fixes the failure of innodb.insert_into_empty,4k after the merge)
-
Sergei Golubchik authored
-
Andrei authored
The test's header does not strictly follow the order in which mtr performs its checks at test start, which may lead to an error. E.g. ./mtr --mysqld=--binlog-format=row rpl.rpl_using_gtid_default leads to: At line 175: query 'SET GLOBAL gtid_slave_pos= ""' failed: ER_SLAVE_MUST_STOP (1198): This operation cannot be performed as you have a running slave ''; run STOP SLAVE '' first. Fixed by requiring the binlog format first in the test header.
-
Andrei authored
rpl.rpl_heartbeat turns out to lack the standard include/master-slave header, which made it fail potentially in BB and actually in manual mtr runs, as it may have used a previous slave GTID state. Fixed by installing the standard rpl suite header/footer in the test file.
-
Monty authored
The problem was that the signal thread was not killed when using unireg_abort(). The bug was introduced by: MDEV-30260: Slave crashed:reload_acl_and_cache during shutdown

Other things fixed:
- Don't produce memory leaks with safemalloc if all threads were not ended properly (not useful)
-
Tuukka Pasanen authored
Let dh_systemd handle most of the systemd side and get rid of custom scripts. Rework installation of systemd service and socket files based on Michael Biebl's merge requests: https://salsa.debian.org/mariadb-team/mariadb-server/-/merge_requests/63 https://salsa.debian.org/mariadb-team/mariadb-server/-/merge_requests/75
-
- 29 Apr, 2024 3 commits
-
-
Sergei Golubchik authored
-
Yuchen Pei authored
We have to #undef my_error and find it from udfs when spider is not installed.
-
mariadb-DebarunBanerjee authored
This is a server hang and not an issue with backup. While concurrent DDLs in the server are hung, mariabackup waits for the DDLs to finish, trying to acquire MDL_BACKUP_BLOCK_DDL. The server hang is serious and is caused by the thread pool state being incorrectly set to "thread creation pending" while no creation is actually pending. Once a thread pool reaches this state, no new thread gets created in the pool. While it could possibly affect all thread pools in the server, the InnoDB thread pool is the victim in the current bug: an IO job gets blocked when the pool is stuck with far fewer threads than intended. The available workers are blocked in purge, waiting for a page lock to be released by the IO write (SX lock), causing a complete deadlock. The issue is caused by the state variable m_thread_creation_pending introduced by MDEV-31095: 9e62ab7a. We check and set the variable early while attempting to create a new thread in the pool, but fail to reset it if we exit the flow for other reasons, such as the maximum number of threads being reached or entering the thread creation throttling path. Fix: Make sure that the state is reset in case we don't actually attempt to create the thread.
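The shape of the bug and of the fix can be sketched as follows. This is a simplified, single-threaded illustration and not the tpool code: only the name m_thread_creation_pending comes from the commit message, while `pool_sketch`, `threads`, `max_threads`, `throttled` and `add_thread` are hypothetical, and using std::atomic_flag for the state is an assumption.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical, heavily simplified pool used only to show the flag handling.
struct pool_sketch
{
  std::atomic_flag m_thread_creation_pending = ATOMIC_FLAG_INIT;
  std::size_t threads = 0;      // current number of workers (illustrative)
  std::size_t max_threads = 4;  // illustrative limit
  bool throttled = false;       // stands in for the creation throttling path

  bool add_thread()
  {
    // Set the flag early so that only one creation is in flight at a time.
    if (m_thread_creation_pending.test_and_set())
      return false;
    if (threads >= max_threads || throttled)
    {
      // The fix: no thread will be created, so reset the state.
      // Without this clear(), every later call returns false above.
      m_thread_creation_pending.clear();
      return false;
    }
    ++threads;                          // stands in for real thread creation
    m_thread_creation_pending.clear();  // creation is no longer pending
    return true;
  }
};
```

If the clear() on the early-exit path is removed, a single rejected call leaves the flag set, and the pool can never grow again even after the limit is raised or throttling ends, which matches the stuck state the message describes.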
-
- 28 Apr, 2024 2 commits
-
-
Oleksandr Byelkin authored
-
Oleksandr Byelkin authored
pcre2 - fix CMAKE_C_FLAGS for MSVC for external project by Vladislav Vaintroub <vvaintroub@gmail.com>
-
- 27 Apr, 2024 1 commit
-
-
Alexander Barkov authored
MDEV-33534 UBSAN: Negation of -X cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in my_double_round from sql/item_func.cc

The negation in this line:
  ulonglong abs_dec= dec_negative ? -dec : dec;
did not take into account that 'dec' can be the smallest possible signed negative value -9223372036854775808. Negating it is undefined behavior. Fixing the code to use Longlong_hybrid, which implements a safe method to get an absolute value.
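One common way to take an absolute value without signed overflow is to convert to the unsigned type first and negate there, since unsigned arithmetic wraps modulo 2^64. The helper below is a hypothetical sketch of that idea; it is not the Longlong_hybrid class, whose implementation is not shown in this log.

```cpp
#include <climits>

// Hypothetical helper: returns |v| as an unsigned value and never negates
// a signed operand, so LLONG_MIN is handled without undefined behavior.
unsigned long long safe_abs(long long v)
{
  // The signed-to-unsigned conversion is well defined (modulo 2^64);
  // subtracting from 0 in unsigned arithmetic wraps instead of overflowing.
  return v < 0 ? 0ULL - static_cast<unsigned long long>(v)
               : static_cast<unsigned long long>(v);
}
```

For LLONG_MIN the conversion yields 2^63, and 0 - 2^63 modulo 2^64 is again 2^63, which is the mathematically correct absolute value.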
-
- 26 Apr, 2024 5 commits
-
-
Sergei Golubchik authored
it's a slow test, the slave needs to catch up, reading >1500 transactions. The default MASTER_GTID_WAIT() timeout in sync_with_master_gtid.inc is 120 seconds, which might not be enough for a slow/overloaded slave. Let's wait forever or until ./mtr --testcase-timeout, whichever comes first.
-
Hugo Wen authored
Previously, when running mysqlbinlog without providing a binlog file, it would print the entire help text, which was very verbose and made it difficult to identify the actual issue. Now change the behavior to print a more concise error message instead: "ERROR: Please provide the log file(s). Run with '--help' for usage instructions." This makes the error output more user-friendly and easier to understand, especially when running the tool in scripts or automated processes. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.
-
Daniele Sciascia authored
0ccdf54b removed stack allocated THD objects from the function Wsrep_schema::replay_transaction(). However, it inadvertently brought forward the destruction of the THD, causing assertions and use of the THD after it was destroyed. The fix consists of extracting the original function into a separate function, and leaving the allocation and destruction of the THD object in Wsrep_schema::replay_transaction(), making sure that using the heap allocated THD has no side effects. Same for Wsrep_schema::recover_sr_transactions(). Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
-
Sergei Golubchik authored
followup for 02715174
-
Oleksandr Byelkin authored
-
- 25 Apr, 2024 8 commits
-
-
Jan Lindström authored
Based on logs we might start SST before the donor has reached Primary state. Because this test shuts down all nodes, we need to make sure when we start nodes that the previous nodes have reached Primary state and joined the cluster. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
-
Marko Mäkelä authored
mtr_t::commit_shrink(): Do not assert that some previously clean pages will be flagged as modified by this mini-transaction. It could be the case that there had been no recent write-back of any of the undo tablespace pages that we are modifying when truncating the tablespace. It suffices to assert that some pages were modified again: ut_ad(m_modifications). This fixes up commit f5fddae3
-
Sergei Golubchik authored
the test waits for the event to get stuck on MASTER_DELAY, but on a slow/overloaded slave the event might pass MASTER_DELAY before the test starts waiting. Wait for the event to get stuck on the LOCK TABLES (after MASTER_DELAY) instead, which the event cannot avoid.
-
Marko Mäkelä authored
commit_try_norebuild(): Add the parameter statistics_exist, similar to commit_try_rebuild(). If the InnoDB statistics tables did not exist, we will not attempt to update statistics later on during the transaction. Thanks to Matthias Leich for originally reproducing this scenario.
-
Kristian Nielsen authored
The test could fail with a duplicate key error because switching to non-GTID mode could start at the wrong old-style position. The position could be wrong when the previous GTID connect was stopped before receiving the fake GTID list event which gives the old-style position corresponding to the GTID connected position. Work-around by injecting an extra event and syncing the slave before switching to non-GTID mode. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
-
Marko Mäkelä authored
Starting with GCC 10, let us enable _GLIBCXX_DEBUG as well as _GLIBCXX_ASSERTIONS which have an impact on the GNU libstdc++. On GCC 8, we observed a compilation failure related to some missing type conversion. Even though clang on GNU/Linux would default to using libstdc++ and enabling the debugging seems to work with clang-18, we will not enable this on clang, in case it would lead to compilation errors. For the clang libc++ before clang-15 there was _LIBCPP_DEBUG, but according to llvm/llvm-project@f3966eaf869b7bdd9113ab9d5b78469eb0f5f028 and llvm/llvm-project@13ea1343231fa4ae12fe9fba4c789728465783d7 and llvm/llvm-project@ff573a42cd1f1d05508f165dc3e645a0ec17edb5 it looks like that for proper results, a specially built debug version of libc++ would have to be used in order to enable equivalent checks. This should help catch bugs like the one that commit 455a15fd fixed. Reviewed by: Sergei Golubchik
-
Thirunarayanan Balathandayuthapani authored
Problem:
========
- A partition update operation enables bulk insert for the transaction while moving a row between partitions. This leads to a debug assertion failure while removing the row from one of the partitions.

Solution:
========
- Disallow the bulk insert operation for non-insert operations on a partitioned table.
-
Marko Mäkelä authored
While commit 75b7cd68 was a significant improvement, we occasionally got test failures of debug builds. One of the affected tests is innodb.innodb-64k-crash.
-
- 24 Apr, 2024 4 commits
-
-
Sergei Golubchik authored
and put master-slave.inc *last* in the series of includes
-
Sergei Golubchik authored
An MDL wait consists of short 1 second waits (this is not configurable) repeated until lock_wait_timeout is reached. The stage is changed to Waiting and back every second. To get a predictable result in the test, the query should filter all sequences of X, "Waiting for MDL", X, leaving just X.
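The filtering rule (collapse every X, wait, X run into a single X) can be sketched as a single pass over the recorded stages. The test itself presumably does this in its query; the function below is only a hypothetical illustration, with `collapse_mdl_waits` and the stage strings chosen for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: removes every "X, wait_stage, X" pattern from a
// sequence of stage names, keeping a single X.
std::vector<std::string> collapse_mdl_waits(
    const std::vector<std::string>& stages, const std::string& wait_stage)
{
  std::vector<std::string> out;
  for (const std::string& s : stages)
  {
    // If the output ends in "s, wait_stage", the incoming s closes the
    // pattern: drop the wait stage and do not repeat s.
    if (out.size() >= 2 && out.back() == wait_stage &&
        out[out.size() - 2] == s)
    {
      out.pop_back();
      continue;
    }
    out.push_back(s);
  }
  return out;
}
```

Because the check runs against the already collapsed output, a stage interrupted by any number of one-second waits still comes out once, so the result does not depend on how long the lock wait lasted.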
-
Sergei Golubchik authored
it allocates 1GB of memory, it causes failures in CI
-
Sergei Golubchik authored
like other galera tests do
-