- 18 Sep, 2023 1 commit
-
-
Thirunarayanan Balathandayuthapani authored
- InnoDB failed to mark the page status as FREED when freeing a page of a temporary tablespace. This behaviour affects scrubbing: the file is not written with all zeroes even though the pages are freed. mtr_t::free(): Mark the page as freed for the temporary tablespace as well.
-
- 15 Sep, 2023 9 commits
-
-
Yuchen Pei authored
-
Yuchen Pei authored
-
Yuchen Pei authored
- Removed some redundant hint-related string literals from spd_db_conn.cc.
- Cleaned up SPIDER_PARAM_*_[CHAR]LEN[S].
- Added tests covering monitoring_kind=2. What it does is read from mysql.spider_link_mon_servers with matching db_name, table_name and link_id, and it does nothing further with the result. How monitoring_* can be useful: in the deprecated Spider high-availability feature, when one remote fails, Spider tries another remote, which apparently makes use of these table parameters.
- Added a test covering the query_cache_sync table param.
- Added some further tests on other Spider table params.
- Made the wrapper case insensitive.
- Added code documentation on the Spider priority binary tree.
- Added an assertion that static_key_cardinality is always -1.
All tests still pass.
-
Yuchen Pei authored
This helps eliminate "server exists" failures. Also, spider/bugfix.mdev_29676, when re-enabled after MDEV-29525 is pushed, will fail because we have not --recorded the result. But the failure will only emerge when working on MDEV-31138, where we manually re-enable this test, so let's worry about it then.
-
Yuchen Pei authored
-
Yuchen Pei authored
The direct aggregate (DA) mechanism seems to be intended to work only when otherwise a full table scan query would be executed from the Spider node and the aggregation done at the Spider node too. Typically this happens in sub_select(). In the test spider.direct_aggregate_part, DA allows COUNT statements to be sent directly to the data nodes, with the results added up at the Spider node, instead of iterating over the rows one by one at the Spider node.

By contrast, the group by handler (GBH) typically sends aggregated queries directly to data nodes, in which case DA does not improve the situation. That is why we fix the problem by disabling DA when GBH is used.

There are other reasons supporting this change. First, the creation of a GBH results in a call to change_to_use_tmp_fields() (as opposed to setup_copy_fields()), which causes the Spider DA function spider_db_fetch_for_item_sum_funcs() to work on the wrong items. Second, the Spider DA function only calls direct_add() on the items; the follow-up add() needs to be called by the SQL layer code. In do_select(), after executing the query with the GBH, the required add() would not necessarily be called. Disabling DA when GBH is used does fix the bug.

A few other things are included in this commit to improve the situation with Spider DA:
1. Add a session variable that allows the user to disable DA completely; this will help as a temporary measure if/when further bugs with DA emerge.
2. Move the increment of direct_aggregate_count into the Spider DA function. Currently this is done in rather bizarre and random locations.
3. Fix the spider_db_mbase_row creation so that the last of its row fields (the sentinel) is NULL. The code already does a null check, but the sentinel field was at an invalid address, causing the segfaults. With a correct implementation of the row creation, we can avoid such segfaults.
-
Yuchen Pei authored
-
Yuchen Pei authored
Also:
- clean up spider_check_and_get_casual_read_conn() and spider_check_and_set_autocommit()
- remove a couple of commented-out code blocks
-
Yuchen Pei authored
-
- 14 Sep, 2023 10 commits
-
-
Anel Husakovic authored
- Reviewer: <knielsen@knielsen-hq.org> <brandon.nesterenko@mariadb.com>
-
Anel Husakovic authored
- Remove extra connections in the form of `server_number_1` for the same server during initialization of servers in the `rpl_init.inc` file.
- Remove disconnecting and reconnecting to the same connections, since they are not used by the test.
- Update comments about the above.
- Reviewer: <knielsen@knielsen-hq.org> <brandon.nesterenko@mariadb.com>
-
Anel Husakovic authored
- Fix the calling of the assertion condition when the `rpl_check_server_ids` parameter is used.
- Fix comments regarding the default usage and the configuration-file extension in this case.
- Reviewer: <knielsen@knielsen-hq.org> <brandon.nesterenko@mariadb.com>
-
Marko Mäkelä authored
-
Marko Mäkelä authored
fseg_free_extent(): After fsp_free_extent() succeeded, properly mark the affected pages as freed. We failed to write FREE_PAGE records. This bug was revealed or caused by commit e938d7c1 (MDEV-32028).
-
Anel Husakovic authored
- `default_client` is already included in `rpl_1slave_base.cnf`, so remove it from `my.cnf`.
- Remove the option group for the `mysqld` server and add a comment on how to override specific settings for a specific server.
- Reviewer: <brandon.nesterenko@mariadb.com>
-
Yuchen Pei authored
This function trivially returns false
-
Yuchen Pei authored
-
Marko Mäkelä authored
-
Marko Mäkelä authored
This fixes up commit 6cc88c3d. Thanks to Markus Mäkelä for reporting the build failure.
-
- 13 Sep, 2023 5 commits
-
-
Brandon Nesterenko authored
The SQL thread and a user connection executing SHOW SLAVE STATUS have a race condition on Last_SQL_Errno: after a slave has errored and stopped, on its next start SHOW SLAVE STATUS can show that the SQL thread is running while the previous error is still displayed. The fix is to clear the last error when the SQL thread starts, before setting the status of Slave_SQL_Running. Thanks to Kristian Nielsen for his work diagnosing the problem! Reviewed By: Andrei Elkin <andrei.elkin@mariadb.com>, Kristian Nielsen <knielsen@knielsen-hq.org>
-
Brandon Nesterenko authored
- Removed commented-out and unused lines.
- Updated the test to reference the true failure (a timeout) rather than a deadlock.
- Switched save variables from MTR to user variables.
- Forced relay-log purge so as not to potentially re-execute an already prepared transaction.
-
Daniel Black authored
Remove TLSv1.1 from the default tls_version system variable. Output a warning if TLSv1.0 or TLSv1.1 are selected. Thanks Tingyao Nian for the feature request.
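A configuration sketch of what this change implies (`tls_version` is a real MariaDB system variable; the exact server group name and default value string may vary by version and platform):

```ini
[mariadbd]
# TLSv1.1 is no longer part of the default; selecting TLSv1.0 or TLSv1.1
# now produces a warning at startup.
tls_version = TLSv1.2,TLSv1.3
```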
-
Sergei Golubchik authored
-
Oleg Smirnov authored
There is a list of plugins in the WiX configuration file for HeidiSQL, and the installer only installs DLLs from that list although the HeidiSQL portable archive may include other plugins. This commit adds client_ed25519.dll to this list and also rearranges the list alphabetically, so it is easier to verify its contents
-
- 12 Sep, 2023 4 commits
-
-
Marko Mäkelä authored
buf_read_page_low(): Use 64-bit arithmetic when computing the file byte offset. In other calls to fil_space_t::io() the offset was being computed correctly, for example by buf_page_t::physical_offset().
-
Marko Mäkelä authored
trx_purge_truncate_history(): Remove a debug assertion that had originally been added in commit 0de3be8c (MDEV-30671). In trx_t::commit_empty() we do not have any efficient way to rewind rseg.needs_purge to an accurate value that would satisfy this debug assertion. Note: No correctness property should be violated here. At the point where the debug assertion was located, we had already established that purge_sys.sees(rseg.needs_purge) holds, that is, it is safe to remove everything from rseg.
-
Marko Mäkelä authored
trx_undo_reuse_cached(): Assert that this is being invoked on the persistent rollback segment of the transaction, and remove dead code that was handling cached temporary undo log. This was missed in commit 51e62cb3 (MDEV-26782).
-
sjaakola authored
The MariaDB async replication SQL thread was stopped on any failure in applying replication events, and the error message logged for the failure was: "Node has dropped from cluster". The assumption was that an event-applying failure is always due to the node dropping out. With optimistic parallel replication, event applying can fail for natural reasons, and applying should be retried to handle the failure. This retry logic was never exercised because the slave SQL thread was stopped at the first applying failure. To support the optimistic parallel replication retry logic, this commit skips the replication slave abort if the node remains in the cluster (wsrep_ready==ON) and replication is configured for optimistic or aggressive retry logic.

During the development of this fix, the galera.galera_as_slave_nonprim test showed some problems. The test was analyzed and appears to need some attention. One excessive sleep command was removed in this commit, but more fixes are still needed to make it fully deterministic. After this commit, galera_as_slave_nonprim is successful, though.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
-
- 11 Sep, 2023 11 commits
-
-
Julius Goryavsky authored
-
Daniele Sciascia authored
- Deterministic test to reproduce the warning
- Update wsrep-lib to fix the issue
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
-
Jan Lindström authored
The test case was starting too many servers that are not really needed to test the original problem. This fix reduces the number of servers to make the test case smaller and more robust. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
-
Jan Lindström authored
The problem was that if wsrep_notify_cmd was set, it was called with the new status "joined"; the script tries to connect to the server to update some table, but the server is not initialized yet and is not listening for connections. So the server waits for the script to finish, the script waits for the mariadb client to connect, and the client cannot connect because the server is not listening. The fix is to call the script only when Galera has already formed a view, or when it is synced or a donor.

This fix also enables the following test cases:
* galera.MW-284
* galera.galera_binlog_checksum
* galera_var_notify_ssl_ipv6

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
-
Thirunarayanan Balathandayuthapani authored
- The lifetime of temporary tables is expected to be short, so it makes sense to assume that all temporary tablespace pages will remain in the buffer pool. It does not make sense to have read-ahead for pages of the temporary tablespace.
-
Marko Mäkelä authored
buf_flush_page_cleaner(): Before finishing a batch, wake up any threads that are waiting for buf_pool.done_flush_LRU. This should fix a hung shutdown that we observed after SET GLOBAL innodb_buffer_pool_size was executed to shrink the InnoDB buffer pool.
-
Marko Mäkelä authored
Starting with commit 4ff5311d, log_write_up_to(trx->commit_lsn, true) in DDL operations could end up being a no-op, because trx->commit_lsn would be 0.

trx_flush_log_if_needed(): Revert an incorrect attempt to ensure that DDL operations are crash-safe.

trx_t::commit(std::vector<pfs_os_file_t> &), ha_innobase::rename_table(): Set trx_t::flush_log_later so that trx_t::commit_in_memory() will retain trx_t::commit_lsn for the final durability call.

Tested by: Matthias Leich
-
Marko Mäkelä authored
lock_wait(): Never return the transient error code DB_LOCK_WAIT. In commit 78a04a4c (MDEV-29869) some assignments of trx->error_state = DB_SUCCESS were removed, and it was possible that the field was left at its initial value DB_LOCK_WAIT. The test case for this is nondeterministic; without this fix, it would only occasionally fail. Reviewed by: Vladislav Lesin
-
Marko Mäkelä authored
MDEV-32096 Parallel replication lags because innobase_kill_query() may fail to interrupt a lock wait

lock_sys_t::cancel(trx_t*): Remove, and merge into its only caller innobase_kill_query().

innobase_kill_query(): Before reading trx->lock.wait_lock, do acquire lock_sys.wait_mutex, like we did before commit e71e6133 (MDEV-24671). In this way, we should not miss a recently started lock wait by the killee transaction.

lock_rec_lock(): Add a DEBUG_SYNC point "lock_rec" for the test case.

lock_wait(): Invoke trx_is_interrupted() before entering the wait, in case innobase_kill_query() was invoked some time earlier and some longer-running operation did not check for interrupts. As suggested by Vladislav Lesin, do not overwrite trx->error_state==DB_INTERRUPTED with DB_SUCCESS; this avoids a call to trx_is_interrupted() when the test is modified to use the DEBUG_SYNC point lock_wait_start instead of lock_rec. Avoid some redundant loads of trx->lock.wait_lock; cache the value in the local variable wait_lock.

Deadlock::check_and_resolve(): Take wait_lock as a parameter and return wait_lock (or -1 or nullptr). We only need to reload trx->lock.wait_lock if lock_sys.wait_mutex had been released and reacquired.

trx_t::error_state: Correctly document the data member.

trx_lock_t::was_chosen_as_deadlock_victim: Clarify that other threads may set the field (or flags in it) while holding lock_sys.wait_mutex.

Thanks to Johannes Baumgarten for reporting the problem and testing the fix, and to Kristian Nielsen for suggesting the fix.

Reviewed by: Vladislav Lesin
Tested by: Matthias Leich
-
Marko Mäkelä authored
-
Marko Mäkelä authored
Some s390x environments include https://github.com/madler/zlib/pull/410 and a more pessimistic compressBound: (sourceLen * 16 + 2308) / 8 + 6. Let us adjust the recently enabled tests accordingly.
-