- 02 May, 2024 1 commit
-
-
mariadb-DebarunBanerjee authored
Issue: When getting a page (buf_page_get_gen) with no latch option (RW_NO_LATCH), the caller is not expected to follow the B-tree latching order. However, in buf_page_get_low we unconditionally try to acquire a shared page latch in order to wait for a page that is being loaded concurrently by another thread. In general, this can lead to a latch order violation and a deadlock. Currently it affects the change buffer insert path: btr_latch_prev() tries to load the previous page out of order with RW_NO_LATCH, and two concurrent inserts into the IBUF tree cause a deadlock. This problem was introduced in 10.6 by commit 9436c778 (MDEV-27058).
Fix: While trying to latch a page with RW_NO_LATCH, always use the "*lock_try" interface and, on failure, unfix the page and retry the operation.
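The try-and-retry pattern described in the fix can be sketched roughly as follows. This is a minimal illustration built on `std::shared_mutex`, not InnoDB code: `Page`, `fix_count`, and `get_page_no_latch()` are hypothetical names, and the real buffer pool uses its own latch and buffer-fix primitives.

```cpp
#include <atomic>
#include <cassert>
#include <shared_mutex>
#include <thread>

// Hypothetical stand-in for a buffer pool page (not InnoDB's real type).
struct Page {
  std::shared_mutex latch;        // page latch; held exclusively while the page is read in
  std::atomic<int> fix_count{0};  // buffer-fix count that keeps the page from being evicted
};

// Buffer-fix a page for a caller that wants no latch (the RW_NO_LATCH case).
// The function never blocks on the latch, so it cannot take part in a
// latch-order deadlock: on failure it unfixes the page and retries.
inline void get_page_no_latch(Page &page) {
  for (;;) {
    page.fix_count.fetch_add(1);         // fix the page
    if (page.latch.try_lock_shared()) {  // non-blocking probe: has the read finished?
      page.latch.unlock_shared();        // the caller asked for no latch
      assert(page.fix_count.load() > 0);
      return;                            // the page stays fixed for the caller
    }
    page.fix_count.fetch_sub(1);         // unfix before retrying
    std::this_thread::yield();           // let the loading thread make progress
  }
}
```

The point of the sketch is that the only blocking step of the original code, a shared latch acquisition, is replaced by a probe that can fail, so the thread never sleeps while waiting on a latch taken out of order.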
-
- 30 Apr, 2024 2 commits
-
-
Thirunarayanan Balathandayuthapani authored
Problem:
=======
During an InnoDB non-rebuild online ALTER operation, InnoDB sets the online log of the clustered index to a dummy log. Concurrent DML can use this to identify whether the table is undergoing online DDL. InnoDB fails to reset the dummy log of the clustered index if an error occurs during the prepare phase.
Solution:
========
Reset the online log of the InnoDB clustered index if an error occurs during the prepare phase.
-
Monty authored
The problem was that the signal thread was not killed when using unireg_abort(). The bug was introduced by:
MDEV-30260: Slave crashed:reload_acl_and_cache during shutdown
Other things fixed:
  - Don't produce memory leaks with safemalloc if not all threads were ended properly (not useful)
-
- 29 Apr, 2024 3 commits
-
-
Sergei Golubchik authored
-
Yuchen Pei authored
We have to #undef my_error and find it from udfs when spider is not installed.
-
mariadb-DebarunBanerjee authored
This is a server hang and not an issue with backup. While concurrent DDLs in the server are hung, mariabackup waits for the DDLs to finish while trying to acquire MDL_BACKUP_BLOCK_DDL. The server hang is serious and is caused by the thread pool state being incorrectly set to "thread creation pending" while no creation is actually pending. Once a thread pool reaches this state, no new threads are created in the pool. While it could possibly affect all thread pools in the server, the InnoDB thread pool is the victim in the current bug: an IO job is blocked because the pool is stuck with far fewer threads than intended. The available workers are blocked in purge, waiting for a page lock (SX lock) to be released by the IO write, which causes a complete deadlock. The issue is caused by the state variable m_thread_creation_pending, introduced by MDEV-31095: 9e62ab7a. We check and set the variable early while attempting to create a new thread in the pool, but fail to reset it if we exit the flow for other reasons, such as reaching the maximum number of threads or entering the thread creation throttling path.
Fix: Make sure that the state is reset if we don't actually attempt to create the thread.
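The flag handling described above can be sketched as follows. Only `m_thread_creation_pending` is named in the commit message; `MiniPool` and every other name here are assumptions made for illustration, and the real thread pool is considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical miniature pool; only m_thread_creation_pending comes from the
// commit message, every other name is an assumption.
class MiniPool {
public:
  explicit MiniPool(std::size_t max_threads) : m_max_threads(max_threads) {}

  // Returns true if a thread creation was actually started.
  bool maybe_create_thread(bool throttled) {
    std::lock_guard<std::mutex> guard(m_mutex);
    if (m_thread_creation_pending)
      return false;                       // a creation is already in flight
    m_thread_creation_pending = true;     // checked and set early
    if (m_threads >= m_max_threads || throttled) {
      m_thread_creation_pending = false;  // reset on every early exit
      return false;
    }
    return true;  // the flag stays set until on_thread_created() runs
  }

  // Called once the new worker is running.
  void on_thread_created() {
    std::lock_guard<std::mutex> guard(m_mutex);
    assert(m_thread_creation_pending);
    ++m_threads;
    m_thread_creation_pending = false;
  }

  bool creation_pending() {
    std::lock_guard<std::mutex> guard(m_mutex);
    return m_thread_creation_pending;
  }

  std::size_t threads() {
    std::lock_guard<std::mutex> guard(m_mutex);
    return m_threads;
  }

private:
  std::mutex m_mutex;
  std::size_t m_max_threads;
  std::size_t m_threads = 0;
  bool m_thread_creation_pending = false;
};
```

Without the reset in the early-exit branch, a single throttled or over-limit attempt would leave the flag set forever, and every later call would return at the first check, which matches the stuck-pool symptom described in the message.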
-
- 28 Apr, 2024 2 commits
-
-
Oleksandr Byelkin authored
-
Oleksandr Byelkin authored
pcre2 - fix CMAKE_C_FLAGS for MSVC for external project by Vladislav Vaintroub <vvaintroub@gmail.com>
-
- 27 Apr, 2024 1 commit
-
-
Alexander Barkov authored
MDEV-33534 UBSAN: Negation of -X cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in my_double_round from sql/item_func.cc
The negation in this line:
  ulonglong abs_dec= dec_negative ? -dec : dec;
did not take into account that 'dec' can be the smallest possible signed negative value, -9223372036854775808. Negating it is undefined behavior. Fixing the code to use Longlong_hybrid, which implements a safe method to get an absolute value.
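One common way to compute such an absolute value without undefined behavior is to convert to an unsigned type first and negate there, as the UBSAN message itself suggests. The helper below is a hypothetical sketch of that idea, not the Longlong_hybrid implementation.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not Longlong_hybrid): absolute value of a signed
// 64-bit integer, returned as unsigned so that LLONG_MIN is representable.
inline unsigned long long safe_abs(long long value) {
  // The conversion to unsigned is well defined (modulo 2^64), and so is
  // unsigned subtraction, so no signed overflow can occur here.
  const unsigned long long magnitude = static_cast<unsigned long long>(value);
  return value < 0 ? 0ULL - magnitude : magnitude;
}
```

With this helper, `safe_abs(LLONG_MIN)` yields 9223372036854775808, the value that `-dec` cannot represent in a signed 64-bit type.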
-
- 26 Apr, 2024 2 commits
-
-
Hugo Wen authored
Previously, when running mysqlbinlog without providing a binlog file, it would print the entire help text, which was very verbose and made it difficult to identify the actual issue. Now change the behavior to print a more concise error message instead: "ERROR: Please provide the log file(s). Run with '--help' for usage instructions." This makes the error output more user-friendly and easier to understand, especially when running the tool in scripts or automated processes. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.
-
Daniele Sciascia authored
Commit 0ccdf54b removed stack-allocated THD objects from Wsrep_schema::replay_transaction(). However, it inadvertently destroyed the THD too early, causing assertions and use of the THD after it was destroyed. The fix extracts the body of the original function into a separate function and leaves the allocation and destruction of the THD object in Wsrep_schema::replay_transaction(), making sure that using the heap-allocated THD has no side effects. The same applies to Wsrep_schema::recover_sr_transactions(). Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
-
- 25 Apr, 2024 7 commits
-
-
Jan Lindström authored
Based on the logs, we might start SST before the donor has reached Primary state. Because this test shuts down all nodes, we need to make sure, when we start nodes, that the previous nodes have reached Primary state and joined the cluster. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
-
Marko Mäkelä authored
mtr_t::commit_shrink(): Do not assert that some previously clean pages will be flagged as modified by this mini-transaction. It could be the case that there had been no recent write-back of any of the undo tablespace pages that we are modifying when truncating the tablespace. It suffices to assert that some pages were modified again: ut_ad(m_modifications). This fixes up commit f5fddae3
-
Marko Mäkelä authored
commit_try_norebuild(): Add the parameter statistics_exist, similar to commit_try_rebuild(). If the InnoDB statistics tables did not exist, we will not attempt to update statistics later on during the transaction. Thanks to Matthias Leich for originally reproducing this scenario.
-
Kristian Nielsen authored
The test could fail with a duplicate key error because switching to non-GTID mode could start at the wrong old-style position. The position could be wrong when the previous GTID connect was stopped before receiving the fake GTID list event which gives the old-style position corresponding to the GTID connected position. Work-around by injecting an extra event and syncing the slave before switching to non-GTID mode. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
-
Marko Mäkelä authored
Starting with GCC 10, let us enable _GLIBCXX_DEBUG as well as _GLIBCXX_ASSERTIONS, which affect the GNU libstdc++. On GCC 8, we observed a compilation failure related to some missing type conversion. Even though clang on GNU/Linux would default to using libstdc++, and enabling the debugging seems to work with clang-18, we will not enable this on clang, in case it would lead to compilation errors. For the clang libc++ before clang-15 there was _LIBCPP_DEBUG, but according to llvm/llvm-project@f3966eaf869b7bdd9113ab9d5b78469eb0f5f028 and llvm/llvm-project@13ea1343231fa4ae12fe9fba4c789728465783d7 and llvm/llvm-project@ff573a42cd1f1d05508f165dc3e645a0ec17edb5 it looks like a specially built debug version of libc++ would have to be used for proper results, in order to enable equivalent checks. This should help catch bugs like the one that commit 455a15fd fixed. Reviewed by: Sergei Golubchik
-
Thirunarayanan Balathandayuthapani authored
Problem:
========
  - A partition update operation enables bulk insert for the transaction while moving the row between partitions. This leads to a debug assertion failure while removing the row from one of the partitions.
Solution:
========
  - Disallow bulk insert for non-insert operations on a partitioned table.
-
Marko Mäkelä authored
While commit 75b7cd68 was a significant improvement, we occasionally got test failures of debug builds. One of the affected tests is innodb.innodb-64k-crash.
-
- 24 Apr, 2024 3 commits
-
-
Sergei Golubchik authored
An MDL wait consists of short 1-second waits (this is not configurable) repeated until lock_wait_timeout is reached. The stage is changed to Waiting and back every second. To get a predictable result in the test, the query should filter out all sequences of X, "Waiting for MDL", X, leaving just X.
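The filtering rule can be illustrated outside SQL with a small sketch. The function name and the use of a plain string list are assumptions made for this example; the actual test performs the filtering in a query, and the stage text is taken from the message above as a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: collapse every run of X, "Waiting for MDL", X into a
// single X. The stage text is the placeholder used in the commit message.
inline std::vector<std::string> collapse_mdl_waits(
    const std::vector<std::string> &stages) {
  const std::string wait_stage = "Waiting for MDL";
  std::vector<std::string> out;
  for (std::size_t i = 0; i < stages.size(); ++i) {
    const bool wait_between_same_stage =
        !out.empty() && stages[i] == wait_stage && i + 1 < stages.size() &&
        stages[i + 1] == out.back();
    if (wait_between_same_stage) {
      ++i;  // skip the wait stage and the repeated X that follows it
      continue;
    }
    out.push_back(stages[i]);
  }
  assert(out.size() <= stages.size());
  return out;
}
```

Because the comparison is made against the last stage already kept, a stage that flips to the wait stage and back any number of times still collapses to one entry, while a wait that is followed by a different stage is left in place.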
-
Sergei Golubchik authored
it allocates 1GB of memory, it causes failures in CI
-
Sergei Golubchik authored
like other galera tests do
-
- 23 Apr, 2024 5 commits
-
-
Meng-Hsiu Chiang authored
The `FindZLIB` module uses the variable `ZLIB_ROOT`[1] to look for libraries. Setting this variable lets `FindZLIB` find libraries that are installed in a non-system path (/workspace/mylib, for example). When `z` is used in `LINK_LIBRARIES()`, CMake looks up the library in the system path by default, which does not work if the library is not installed there. Using ${ZLIB_LIBRARY}, which is set by FindZLIB, solves the issue. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services. [1]: https://cmake.org/cmake/help/latest/module/FindZLIB.html#hints
-
Monty authored
I checked all potential stack overflow problems found with gcc -Wstack-usage=16384 and clang -Wframe-larger-than=16384 -no-inline
Fixes:
  - Added '#pragma clang diagnostic ignored "-Wframe-larger-than="' to a lot of functions where stack usage is large but reasonable.
  - Added stack check warnings to BUILD scripts when using clang and debug.
Functions changed to use malloc instead of allocating things on the stack:
  - read_bootstrap_query() now allocates line_buffer (20000 bytes) with malloc() instead of using the stack. This has a small performance impact, but this is not relevant for bootstrap.
  - mroonga grn_select() used 65856 bytes on the stack. Changed it to use malloc().
  - Wsrep_schema::replay_transaction() and Wsrep_schema::recover_sr_transactions().
  - Connect zipOpen3()
Not fixed:
  - mroonga/vendor/groonga/lib/expr.c grn_proc_call() uses 43712 bytes on the stack. However, this is not easy to fix as the stack usage is caused by a lot of code generated by defines.
  - Most changes in mroonga/groonga were only adding pragmas to disable stack warnings.
  - rocksdb/options/options_helper.cc uses 20288 bytes of stack space. (no reason to fix except to get rid of the compiler warning)
  - Cases using alloca() where the allocation size is reasonable.
  - An issue in libmariadb (reported to connectors).
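The kind of change made to read_bootstrap_query(), moving a large local buffer from the stack to the heap, can be sketched as follows. The function `copy_to_line_buffer()` and the `FreeDeleter` type are hypothetical; only the 20000-byte size and the use of malloc() come from the message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

constexpr std::size_t kLineBufferSize = 20000;  // size quoted in the commit message

// Releases malloc()ed memory when the owning unique_ptr goes out of scope.
struct FreeDeleter {
  void operator()(void *ptr) const { std::free(ptr); }
};

// Hypothetical helper: copies a query into a large scratch buffer and returns
// the number of bytes stored, or 0 if the allocation failed.
inline std::size_t copy_to_line_buffer(const std::string &query) {
  // Before: char line_buffer[kLineBufferSize];  (20000 bytes of stack frame)
  std::unique_ptr<char, FreeDeleter> line_buffer(
      static_cast<char *>(std::malloc(kLineBufferSize)));
  if (!line_buffer)
    return 0;
  const std::size_t length =
      query.size() < kLineBufferSize - 1 ? query.size() : kLineBufferSize - 1;
  assert(length < kLineBufferSize);
  std::memcpy(line_buffer.get(), query.data(), length);
  line_buffer.get()[length] = '\0';
  return std::strlen(line_buffer.get());
}
```

The frame of the function now holds only a pointer, at the cost of one allocation per call, which is the small performance impact the message mentions.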
-
Marko Mäkelä authored
-
Sergei Golubchik authored
followup for 061adae9
-
Alexander Barkov authored
This problem was fixed earlier by the patch for MDEV-33344. Adding a test case only.
-
- 22 Apr, 2024 4 commits
-
-
Jan Lindström authored
The problem was an assertion assuming that we always hold the THD::LOCK_thd_data mutex, which is not true. In most cases the mutex is held, but the function is also used from the InnoDB lock manager, where we can't take THD::LOCK_thd_data in order to obey the mutex ordering. Removed the assertion, as the wsrep transaction state can't change even in that case. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
-
Alexander Barkov authored
There is a convention that Item::val_int() and Item::val_real() return SQL NULL by effectively doing what this code does:
  null_value= true;
  return 0; // Always return 0 for SQL NULL
This is done to optimize boolean value evaluation: if Item::val_int() or Item::val_real() returned 1, that always means TRUE and can never mean SQL NULL. This convention helps to avoid unnecessary testing of Item::null_value after getting a non-zero return value. Item_func_min_max did not follow this convention. It could return a non-zero value together with null_value==true. This made evaluate_join_record() misinterpret SQL NULL as TRUE in this call:
  select_cond_result= MY_TEST(select_cond->val_int());
Fixing Item_func_min_max to follow the convention.
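The convention and the way a caller relies on it can be sketched with a toy type. `MiniGreatest` and `cond_is_true()` are hypothetical and much simpler than the real Item hierarchy; the sketch assumes that a NULL argument makes the whole result NULL.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical miniature of a GREATEST-like item; std::nullopt stands for an
// SQL NULL argument.
struct MiniGreatest {
  std::vector<std::optional<long long>> args;
  bool null_value = false;

  long long val_int() {
    null_value = false;
    bool have_value = false;
    long long best = 0;
    for (const auto &arg : args) {
      if (!arg) {
        null_value = true;
        return 0;  // convention: always return 0 together with SQL NULL
      }
      best = have_value ? std::max(best, *arg) : *arg;
      have_value = true;
    }
    return best;
  }
};

// A caller in the style of MY_TEST(cond->val_int()): it checks only the
// returned value and never looks at null_value.
inline bool cond_is_true(MiniGreatest &cond) {
  const long long value = cond.val_int();
  assert(value == 0 || !cond.null_value);  // holds only if the convention is followed
  return value != 0;
}
```

If `val_int()` instead returned the partial maximum while setting `null_value`, `cond_is_true()` would report TRUE for a NULL condition, which is the misinterpretation described above.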
-
Markus Staab authored
-
Sergei Golubchik authored
update results
-
- 21 Apr, 2024 3 commits
-
-
Sergei Golubchik authored
in the $case=2 - it's wrong to kill after the first binlog EOF, because that might happen between INSERT(4) and INSERT(5). So, wait for the slave to acknowledge INSERT(5) before killing the master, that is, both connection threads must pass repl_semisync_master.wait_after_sync()
-
Sergei Golubchik authored
-
Sergei Golubchik authored
fixes sporadic failures under --valgrind
-
- 20 Apr, 2024 5 commits
-
-
Sergei Golubchik authored
do CHANGE MASTER before sync_with_master to have the slave in a predictable fully synced state before the next test
-
Sergei Golubchik authored
it always has to be current_thd, DBUG_SYNC asserts that. fixes sporadic SIGABRT's in binlog_encryption.rpl_parallel_slave_bgc_kill
-
Sergei Golubchik authored
-
Kristian Nielsen authored
The slave IO thread sets MYSQL_SET_CHARSET_DIR. The code for this option however is not thread-safe in sql-common/client.c. The value set is temporarily written to mysys global variable `charsets-dir` and can be seen by other threads running in parallel, which can result in use-after-free error. Problem was visible as random failures of test cases in suite multi_source with Valgrind or MSAN. Work-around by not setting this option for slave connect, it is redundant anyway as it is just setting the default value. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
-
Kristian Nielsen authored
The root cause of the failure is a bug in the Linux network stack: https://lore.kernel.org/netdev/87sf0ldk41.fsf@urd.knielsen-hq.org/T/#u If the slave does a connect(2) at the exact same time that kill -9 of the master process closes the listening socket, the FIN or RST packet is lost in the kernel, and the slave ends up timing out waiting for the initial communication from the server. This timeout defaults to --slave-net-timeout=120, which causes include/master_gtid_wait.inc to time out first and fail the test. Work-around this problem by reducing the --slave-net-timeout for this test case. If this problem turns up in other tests, we can consider reducing the default value for all tests. Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
-
- 19 Apr, 2024 2 commits
-
-
Sergei Golubchik authored
disable until fixed
-
Zhibo Zhang authored
As of version 3.2.0, OpenSSL changed the error message ("https://github.com/openssl/openssl/commit/81b741f68984"). Update the tests and result files so that they are compatible with both the original and the new error messages. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.
-