Commits · 0c249ad718f2ee96aaceebfa5ce21927cfa2f0d4 · nexedi / MariaDB

20 Apr, 2024 1 commit

MDEV-30232: rpl.rpl_gtid_crash fails sporadically in BB · 0c249ad7

Kristian Nielsen authored Apr 16, 2024

The root cause of the failure is a bug in the Linux network stack:

  https://lore.kernel.org/netdev/87sf0ldk41.fsf@urd.knielsen-hq.org/T/#u

If the slave does a connect(2) at the exact same time that kill -9 of the
master process closes the listening socket, the FIN or RST packet is lost in
the kernel, and the slave ends up timing out waiting for the initial
communication from the server. This timeout defaults to
--slave-net-timeout=120, which causes include/master_gtid_wait.inc to time
out first and fail the test.

Work around this problem by reducing the --slave-net-timeout for this test
case. If this problem turns up in other tests, we can consider reducing the
default value for all tests.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>

0c249ad7

19 Apr, 2024 3 commits

MDEV-33952 galera_create_table_as_select fails sporadically · 4a2e0345
Sergei Golubchik authored Apr 19, 2024
```
disable until fixed
```
4a2e0345

Update tests to be compatible with OpenSSL 3.2.0 · 7432a487

Zhibo Zhang authored Mar 19, 2024

As of version 3.2.0, OpenSSL changed the error message
(https://github.com/openssl/openssl/commit/81b741f68984). Update the
tests and result files such that they are compatible with both original
and new error messages.

All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new
license. I am contributing on behalf of my employer Amazon Web Services,
Inc.

7432a487

MDEV-33946: OPT_PAGE_CHECKSUM mismatch due to mtr_t::memmove() · 4c343394

Marko Mäkelä authored Apr 19, 2024

mtr_t::memmove(): Revert to the parent of commit a032f14b,
where there was supposed to be an equivalent change
that would avoid hitting a warning in some old version of GCC
when this change was part of another 10.6 based development branch.

For some reason, this change is not equivalent but will cause
massive amounts of backup failures in the stress tests
run by Matthias Leich, caught by commit 4179f93d in 10.6.

4c343394

18 Apr, 2024 2 commits

MDEV-16944 postfix. Fix a typo · 2e84560d
Vladislav Vaintroub authored Apr 18, 2024

2e84560d

MDEV-32489 Change buffer index fails to delete the records · 5928e04d

mariadb-DebarunBanerjee authored Apr 16, 2024

When the change buffer records for a page span multiple change
buffer leaf pages, or the starting record is at the beginning of a page
with a left sibling, ibuf_delete_recs deletes only the records in the first
page and fails to move to subsequent pages.

Subsequently, a slow shutdown hangs trying to delete those leftover
records.

Fix-A: Position the cursor on a user record in the B-tree and exit only
when all records are exhausted.

Fix-B: Make sure we call ibuf_delete_recs during slow shutdown for
pages with IBUF entries to clean up any previously left over records.

5928e04d
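
A minimal sketch of the idea behind Fix-A in MDEV-32489, assuming a toy model in which each leaf page is a Python list of `(page_no, payload)` records sorted by `page_no`. The function name `delete_recs` and the data layout are hypothetical and are not InnoDB's actual structures. The point is only that deletion continues into later leaves and stops once a record for a larger page number is seen.

```python
def delete_recs(leaves, page_no):
    """Delete all records for page_no from a left-to-right list of leaves.

    Toy model only: each leaf is a list of (page_no, payload) tuples,
    sorted by page_no across the whole list of leaves.
    """
    deleted = 0
    for leaf in leaves:
        remaining = []
        past_range = False
        for rec in leaf:
            if rec[0] == page_no:
                deleted += 1
            else:
                remaining.append(rec)
                if rec[0] > page_no:
                    # Records are sorted, so nothing further can match.
                    past_range = True
        leaf[:] = remaining
        if past_range:
            break  # exit only once the matching records are exhausted
    return deleted
```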

17 Apr, 2024 11 commits

MDEV-27512: Assertion !thd->transaction_rollback_request failed in rows_event_stmt_cleanup · 0ad52e4d

Brandon Nesterenko authored Apr 10, 2024

If replicating an event in ROW format, and InnoDB detects a deadlock
while searching for a row, the row event will error and rollback in
InnoDB and indicate that the binlog cache also needs to be cleared,
i.e. by marking thd->transaction_rollback_request. In the normal
case, this will trigger an error in Rows_log_event::do_apply_event()
and cause a rollback. During the Rows_log_event::do_apply_event()
cleanup of a successful event application, there is a DBUG_ASSERT in
log_event_server.cc::rows_event_stmt_cleanup(), which sets the
expectation that thd->transaction_rollback_request cannot be set
because the general rollback (i.e. not the InnoDB rollback) should
have happened already. However, if the replica is configured to skip
deadlock errors, the rows event logic will clear the error and
continue on, as if no error happened. This results in
thd->transaction_rollback_request being set while in
rows_event_stmt_cleanup(), thereby triggering the assertion.

This patch fixes this in the following ways:
 1) The assertion is invalid, and thereby removed.
 2) The rollback case is forced in rows_event_stmt_cleanup() if
transaction_rollback_request is set.

Note the differing behavior between transactions which are skipped
due to deadlock errors and other errors. When a transaction is
skipped due to an ignored deadlock error, the entire transaction is
rolled back and skipped (though note MDEV-33930 which allows
statements in the same transaction after the deadlock-inducing one
to commit). When a transaction is skipped due to ignoring a
different error, only the erroring statements are rolled-back and
skipped - the rest of the transaction will execute as normal. The
effect of this can be seen in the test results. The added test case
to rpl_skip_error.test shows that only statements which are ignored
due to non-deadlock errors are ignored in larger transactions. A
diff between rpl_temporary_error2_skip_all.result and
rpl_temporary_error2.result shows that all statements in the errored
transaction are rolled back (diff pasted below):

: diff rpl_temporary_error2.result rpl_temporary_error2_skip_all.result
49c49
< 2	1
---
> 2	NULL
51c51
< 4	1
---
> 4	NULL
53c53
< * There will be two rows in t2 due to the retry.
---
> * There will be one row in t2 because the ignored deadlock does not retry.
57d56
< 1
59c58
< 1
---
> 0

Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>

0ad52e4d

MDEV-16944 Fix file sharing issues on Windows in mysqltest · 061adae9

Vladislav Vaintroub authored Apr 15, 2024

On Windows systems, occurrences of ERROR_SHARING_VIOLATION due to
conflicting share modes between processes accessing the same file can
result in CreateFile failures.

mysys' my_open() already incorporates a workaround by implementing
wait/retry logic on Windows.

But this does not help if files are opened using shell redirection, as
mysqltest traditionally did, i.e. via

--echo exec "some text" > output_file

In such cases, it is cmd.exe that opens the output_file, and it
won't do any sharing-violation retries.

This commit addresses the issue by introducing a new built-in command,
'write_line', in mysqltest. This new command serves as a brief alternative
to 'write_file', with a single line output, that also resolves variables
like "exec" would.

Internally, this command will use my_open(), and therefore retry-on-error
logic.

Hopefully this will eliminate the very sporadic "can't open file because
it is used by another process" error on CI.

061adae9
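
The wait/retry idea that MDEV-16944 relies on can be sketched roughly as follows. This is an illustration in Python, not the my_open() implementation: the names `open_with_retry`, `opener`, `attempts` and `delay` are hypothetical, and it assumes the sharing violation surfaces as `PermissionError`, which is how Python usually reports it on Windows.

```python
import time


def open_with_retry(opener, path, attempts=5, delay=0.05):
    """Call opener(path), retrying while the file is busy.

    All parameter names are hypothetical and chosen for this sketch.
    """
    for attempt in range(1, attempts + 1):
        try:
            return opener(path)
        except PermissionError:
            # Assumed to represent a sharing violation; give up only
            # after the last attempt has failed.
            if attempt == attempts:
                raise
            time.sleep(delay)
```

A caller could use it as `open_with_retry(lambda p: open(p, "a"), "output_file")`. Shell redirection offers no comparable place to put such a loop, which is the problem the commit describes.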

Remove duplicate key "Language" from .clang-format · b48de973
Vladislav Vaintroub authored Apr 17, 2024
```
Latest Visual Studio complains about invalid format, it breaks formatting
in the IDE
```
b48de973

Do not run maria_recover_encrypted with embedded. · 173847b7

Vladislav Vaintroub authored Apr 17, 2024

It uses shutdown/restart etc., features not compatible with the embedded server.

Also add have_debug.inc, since it uses the debug_dbug variable.

173847b7

Fix LTO (aka interprocedural optimization) build with MSVC · e87a175b
Vladislav Vaintroub authored Apr 10, 2024
```
Also, disable MSVC LTO for static client libraries - they won't be usable
for end-users.
```
e87a175b

MDEV-33431 Latching order violation reported fil_system.sys_space.latch and ibuf_pessimistic_insert_mutex · 040069f4

mariadb-DebarunBanerjee authored Apr 17, 2024

Issue:
------
The actual order of acquisition of the IBUF pessimistic insert mutex
(SYNC_IBUF_PESS_INSERT_MUTEX) and the IBUF header page latch
(SYNC_IBUF_HEADER) w.r.t. the space latch (SYNC_FSP) differs from the order
defined in sync0types.h. It was not discovered earlier as the path to
ibuf_remove_free_page was not covered by the mtr test. The ideal order, and
the one defined in sync0types.h, is as follows.
SYNC_IBUF_HEADER -> SYNC_IBUF_PESS_INSERT_MUTEX -> SYNC_FSP

In ibuf_remove_free_page, we acquire the space latch earlier, so we have
the following order, resulting in the assert with innodb_sync_debug=on.
SYNC_FSP -> SYNC_IBUF_HEADER -> SYNC_IBUF_PESS_INSERT_MUTEX

Fix:
---
We do maintain this order in other places and there doesn't seem to be
any real issue here. To reduce impact in GA versions, we avoid doing
extensive changes in mutex ordering to match the current
SYNC_IBUF_PESS_INSERT_MUTEX order. Instead we relax the ordering check
for IBUF pessimistic insert mutex using SYNC_NO_ORDER_CHECK.

040069f4
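
To illustrate what relaxing an ordering check means in MDEV-33431, here is a small sketch of a latch-order checker with an exemption list. It is a simplified model, not InnoDB's sync debug code: the class `OrderChecker`, the numeric levels in `LEVELS` and the `unchecked` parameter are assumptions made for this example, with `unchecked` playing the role that SYNC_NO_ORDER_CHECK plays in the commit.

```python
# Hypothetical numeric levels: a higher level must be acquired first.
LEVELS = {
    "SYNC_IBUF_HEADER": 3,
    "SYNC_IBUF_PESS_INSERT_MUTEX": 2,
    "SYNC_FSP": 1,
}


class OrderChecker:
    """Toy latch-order checker; exempt names skip the ordering check."""

    def __init__(self, levels, unchecked=()):
        self.levels = levels
        self.unchecked = set(unchecked)
        self.held = []

    def acquire(self, name):
        if name not in self.unchecked:
            for other in self.held:
                # A held latch at the same or a lower level is a violation.
                if (other not in self.unchecked
                        and self.levels[other] <= self.levels[name]):
                    raise RuntimeError(
                        f"latching order violation: {other} held "
                        f"while acquiring {name}")
        self.held.append(name)
```

With the mutex in `unchecked`, acquiring it after SYNC_FSP no longer raises, while the ordering of the remaining latches is still verified.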

MDEV-33840 tpool- switch to longer maintenance timer interval, if pool is idle · f6e9600f

Vladislav Vaintroub authored Apr 17, 2024

The previous solution, which switched the timer off entirely, turned out
to be deadlock-prone.

This patch fixes the previous attempt to switch between long/short interval
periods in MDEV-24295. Now, the initial state of the timer is fixed (it is ON).
Also, avoid switching the timer to longer periods if there is any activity in
the pool.

f6e9600f

Revert "MDEV-33840 tpool : switch off maintenance timer when not needed." · 2ba79aba
Vladislav Vaintroub authored Apr 17, 2024
```
This reverts commit 09bae92c.
```
2ba79aba

Merge 10.4 into 10.5 · 3a3fe300
Marko Mäkelä authored Apr 17, 2024

3a3fe300

Tests: remove a duplicated check · 9164c2b8
Marko Mäkelä authored Apr 17, 2024
```
This fixes up the merge commit 9b182756
```
9164c2b8

MDEV-33895 : Galera test failure on galera_sr.MDEV-25718 · 4aeba259

Jan Lindström authored Apr 12, 2024

The test was waiting for the INSERT statement to roll back, but the
wait_condition was too strict: the state could be either
"Freeing items" or "Rollback". Fixed the wait_condition
to accept either of them.

4aeba259

16 Apr, 2024 3 commits

MDEV-33889 Read only server throws error when running a create temporary table as select statement · 41e7ceb0

Sergei Golubchik authored Apr 15, 2024

create_partitioning_metadata() should only mark the transaction r/w
if it actually did anything (that is, the table is partitioned).

Otherwise it's a no-op that is called even for temporary tables, and
it shouldn't do anything at all.

41e7ceb0
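
The rule stated in MDEV-33889 (mark the transaction read/write only when something was actually done) can be sketched as follows. This is a Python stand-in, not the server's real function signature: the dictionary-based `table` and `trx` arguments and the `partitioned` and `read_write` keys are assumptions for illustration.

```python
def create_partitioning_metadata(table, trx):
    """Python stand-in for the idea; not the server's real signature."""
    if not table.get("partitioned", False):
        # No-op: nothing is written, so the transaction stays as it was.
        return False
    # ... partitioning metadata would be written here ...
    trx["read_write"] = True
    return True
```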

Merge branch '10.4' into 10.5 · 9b182756
Oleksandr Byelkin authored Apr 16, 2024

9b182756

MDEV-33861 main.query_cache fails with embedded after enabling WITH_PROTECT_STATEMENT_MEMROOT · 50998a6c

Oleksandr Byelkin authored Apr 15, 2024

Synopsis: If a SELECT returns its answer from the query cache, it is not really executed.

The assertion
DBUG_ASSERT((mem_root->flags & ROOT_FLAG_READ_ONLY) == 0);
fires because, when the query cache is on and the same query is run by
different stored routines, the following use case can take place.
First, let's say that the bodies of the routines used by the test case are the
same and contain the single query 'SELECT * FROM t1';
  call p1() -- a result set is stored in the query cache for further use.
  call p2() -- the same query is run against the table t1, which results in
               not running the actual query but using its cached result.
               On finishing execution of this routine, its memory root is
               marked read-only, since every SP instruction that this
               routine contains has been executed.
  INSERT INTO t1 VALUE (1); -- force invalidation of the query cache
  call p2() -- querying the table t1 will result in an assertion failure,
               since its execution would require allocation on the memory
               root that has already been marked read-only

The root cause of the assertion firing is that the memory root of the stored
routine 'p2' was marked read-only although the query contained inside
had not actually been executed.

To fix the issue, mark an SP instruction as not yet run if its execution
doesn't result in real query processing and the result set is taken from the
query cache instead.

Note that this issue affects only servers built in debug mode AND with the
protect statement memory root feature turned on. It doesn't affect servers
built in release mode.

50998a6c
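
The fix described in MDEV-33861 can be pictured with a small model: an instruction only counts as "run" when its query was really executed, not when the answer came from a cache. The sketch below is an assumption-laden illustration and not the server code; `Instruction`, `has_run`, `execute_instruction` and the dictionary cache are hypothetical names.

```python
class Instruction:
    """Toy stored-routine instruction holding a single query."""

    def __init__(self, query):
        self.query = query
        self.has_run = False


def execute_instruction(instr, cache, run_query):
    """Return the query result, marking instr as run only on real execution."""
    if instr.query in cache:
        # Served from the cache: has_run is left unchanged on purpose.
        return cache[instr.query]
    result = run_query(instr.query)
    cache[instr.query] = result
    instr.has_run = True
    return result
```

In this model the second routine's instruction stays "not yet run" after a cache hit, so a later real execution (after the cache is invalidated) is still treated as its first run.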

15 Apr, 2024 5 commits

Fix windows build failure · ce104d41
Kristian Nielsen authored Apr 15, 2024
```
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
```
ce104d41

Merge from 10.4 to 10.5 · 16aa4b5f
Kristian Nielsen authored Apr 15, 2024
```
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
```
16aa4b5f

Distinguish "manager stopped" from "manager not started" · 10272f37

Kristian Nielsen authored Apr 15, 2024

This way, if the manager thread somehow starts and stops again quickly before
the main thread wakes up to check if it started correctly, we will not hang.

Patch suggested by Monty as follow-up to
7f498fba

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>

10272f37
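
A rough sketch of why a separate "stopped" state avoids the hang described in commit 10272f37: the waiter checks for "no longer NOT_STARTED" rather than "currently RUNNING". The class and state names below are hypothetical and the code is a Python illustration using `threading.Condition`, not the server's implementation.

```python
import enum
import threading


class ManagerState(enum.Enum):
    NOT_STARTED = 0
    RUNNING = 1
    STOPPED = 2


class Manager:
    def __init__(self):
        self._cond = threading.Condition()
        self.state = ManagerState.NOT_STARTED

    def run(self):
        with self._cond:
            self.state = ManagerState.RUNNING
            self._cond.notify_all()
        # ... the manager's work would happen here ...
        with self._cond:
            self.state = ManagerState.STOPPED
            self._cond.notify_all()

    def wait_until_started(self, timeout=5.0):
        """Return True once the manager has started, even if it has
        already stopped again; return False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self.state is not ManagerState.NOT_STARTED, timeout)
```

If the waiter instead tested for `RUNNING` only, a manager that started and stopped before the check would never satisfy the condition.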

MDEV-33559 matched_rec::block should be allocated from the buffer pool · a032f14b

Marko Mäkelä authored Apr 15, 2024

matched_rec::rec_buf[], matched_rec::bufp: Remove.

matched_rec::block: Make this a pointer to something that
is allocated by buf_block_alloc(). In this way, the only
case where buf_block_t is constructed outside buf_pool
is ALTER TABLE...IMPORT TABLESPACE.

rtr_info::heap: Remove. This was only used for allocating matched_rec,
which now is smaller.

mtr_t::memmove(): Simplify some code to avoid GCC 9.4.0 -Wconversion
in the 10.6 branch as a result of these changes.

Reviewed by: Debarun Banerjee

a032f14b

MDEV-30676 rpl.parallel_backup* tests sometimes fail · ea810b04
Daniel Black authored Mar 06, 2024
```
Raise innodb_lock_wait_timeout from 1 to 5
```
ea810b04

14 Apr, 2024 3 commits

MDEV-33777 Spider: Correct checks for show index column numbers · 051a1fa0
Yuchen Pei authored Mar 27, 2024
```
It was updated for 10.6+ in MDEV-7317. Because a lower version spider
node may connect to a higher version data node, we need to change this
for 10.4 and 10.5 as well.
```
051a1fa0

MDEV-28993 Spider: Push down CASE statement · 18b93d6e
Yuchen Pei authored Mar 20, 2024

18b93d6e

MDEV-28993 spider: revert removal of ITEM_FUNC_CASE_PARAMS_ARE_PUBLIC · 99dc0f03
Yuchen Pei authored Mar 20, 2024
```
It was done in MDEV-29447.
```
99dc0f03

13 Apr, 2024 6 commits

feedback plugin: abort sending the report on server shutdown · 8bc32410

Sergei Golubchik authored Apr 13, 2024

network timeouts might be rather large and feedback plugin
waits forever for the sender thread to exit.

an alternative could've been to use GNU-specific pthread_timedjoin_np(),
where _np means "not portable".

8bc32410
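
The approach taken in commit 8bc32410 (have the sender give up when shutdown is requested, rather than have the joiner time out) can be sketched as below. This is a simplified Python illustration with hypothetical names (`send_report`, `chunks`, `send`, `shutdown`); the real plugin deals with network I/O that this model leaves out.

```python
import threading


def send_report(chunks, send, shutdown):
    """Send chunks one by one; abort as soon as shutdown is requested.

    shutdown is a threading.Event set by the thread that wants to exit.
    """
    for chunk in chunks:
        if shutdown.is_set():
            # Abort so that a thread joining the sender is not kept waiting.
            return False
        send(chunk)
    return True
```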

Fixed random failure in main.kill_processlist-6619 (take 3) · 6a4ac4c7

Sergei Golubchik authored Apr 13, 2024

followup for 81f75ca8

improve over take 2. It's technically possible, though unlikely,
to see THD after it already reset the info to NULL, but has not
changed the command to COM_SLEEP yet (see THD::mark_connection_idle()).

Let's wait for "Sleep", not for NULL.

6a4ac4c7

galera/suite.pm: perl warning · 69b5fdf3
Sergei Golubchik authored Apr 11, 2024
```
Unescaped left brace in regex is passed through in regex
```
69b5fdf3

Minor improvements to options error handling · 79706fd3

Tony Chen authored Mar 08, 2024

- Add additional MTRs for more coverage on invalid options
- Updating a few error messages to be more informative
- Use the exit code from handle_options() when there is an error processing
  user options

All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.

79706fd3

MDEV-33469 Fix behavior on invalid arguments · 47d75cdd

Tony Chen authored Feb 25, 2024

When passing in an invalid value (e.g. incorrect data type) for a variable, the
server startup will fail with misleading error messages.

The behavior **before** this change:

For server options:
- The error message will indicate that the argument is being adjusted to a valid value
- Server startup still fails

For plugin options:
- The error message will indicate that the argument is being adjusted to a valid value
- The plugin is still disabled
- Server startup fails with a message that it does not recognize the plugin option

The behavior **after** this change:

For server options:
- Output that an invalid argument was provided
- Exit server startup

For plugin options:
- Output that an invalid argument was provided
- Disable the plugin
- Attempt to continue server startup

All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.

47d75cdd
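
The "after" behavior listed in MDEV-33469 can be summarized in a short sketch. It is only a model of the decision described in the commit message: `StartupError`, `handle_invalid_option`, the message text and the `plugin is None` convention for server options are all assumptions made for this example, not the server's API or its actual log output.

```python
class StartupError(Exception):
    """Raised when server startup must stop (hypothetical)."""


def handle_invalid_option(name, value, plugin, disabled_plugins, log):
    """Apply the post-change policy for an option with an invalid value.

    plugin is None for a server option, or the owning plugin's name.
    """
    # In both cases, report that an invalid argument was provided.
    log.append(f"Invalid argument '{value}' for option '{name}'")
    if plugin is None:
        # Server option: exit server startup.
        raise StartupError(name)
    # Plugin option: disable the plugin and let startup continue.
    disabled_plugins.add(plugin)
```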

Simplify MTR for handling multiple invalid options · dd639985

Tony Chen authored Mar 08, 2024

In 69a4d6ae, an MTR test was added to verify that we handled multiple invalid
options. However, the logic to perform this test relied on a non-trivial regex
to filter out the noise in the logs.

Instead, we now simply search for what we expect to be in the logs.

All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.

dd639985

12 Apr, 2024 1 commit

MDEV-33802 Weird read view after ROLLBACK of other transactions. · d7fc975c

Vlad Lesin authored Apr 02, 2024

If some unique key fields are nullable, a unique index can contain
several records with the same key fields, as long as at least one
key field is NULL, because NULL != NULL.

When a transaction is resumed after waiting on a record with at least one
key field equal to NULL, and the record stored in the persistent cursor has
been deleted, the persistent cursor can be restored to a record whose key
fields all equal the stored ones, but with at least one field equal to
NULL. Such a record is wrongly treated as having the same unique key as
the record stored in the persistent cursor, which is incorrect because
NULL != NULL.

The fix is to check whether at least one unique field is NULL in the restored
persistent cursor position and, if so, not to treat the record as
one with the same unique key as the stored record.

dict_index_t::nulls_equal was removed, as it was originally developed for
"intrinsic tables", which never existed in MariaDB, and there is no code
that would set it to "true".

Reviewed by Marko Mäkelä.

d7fc975c
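
The comparison rule behind the MDEV-33802 fix can be illustrated with a few lines of Python, using `None` to stand for SQL NULL. The function name `same_unique_key` and the tuple representation of key fields are hypothetical; this is a sketch of the rule, not of the persistent cursor code.

```python
from typing import Optional, Sequence


def same_unique_key(stored: Sequence[Optional[int]],
                    restored: Sequence[Optional[int]]) -> bool:
    """Return True only if both keys are equal and contain no NULL field."""
    # A NULL (None) in any unique field means the records can never be
    # treated as having the same unique key, because NULL != NULL.
    if any(field is None for field in restored):
        return False
    return tuple(stored) == tuple(restored)
```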

11 Apr, 2024 4 commits

MDEV-10684: rpl.rpl_domain_id_filter_restart fails in buildbot · a6aecbb0

Brandon Nesterenko authored Apr 11, 2024

The test failure in rpl.rpl_domain_id_filter_restart is caused by
MDEV-33887. That is, the test uses master_pos_wait() (called
indirectly by sync_slave_with_master) to try and wait for the
replica to catch up to the master. However, the waited on
transaction is ignored by the configured
  CHANGE MASTER TO IGNORE_DOMAIN_IDS=()
As MDEV-33887 reports, due to the IO thread updating the binlog
coordinates and the SQL thread updating the GTID state, if the
replica is stopped in-between these updates, the replica state will
be inconsistent. That is, the test expects that the GTID state will
be updated, so upon restart, the replica will be up-to-date.
However, if the replica is stopped before the SQL thread updates its
GTID state, then upon restart, the replica will fetch the previously
ignored event, which is no longer ignored upon restart, and execute
it. This leads to the sporadic extra row in t2.

This patch changes master_pos_wait() to use master_gtid_wait() to
ensure the replica state is consistent with the master state.

a6aecbb0

Fix g++-14 -Wtemplate-id-cdtor · 04be12a8
Marko Mäkelä authored Apr 11, 2024

04be12a8

Link beginner instructions in README.md · f131c609

anson1014 authored Apr 09, 2024

When navigating through the existing links in the README, it is not
immediately obvious where to go to find instructions on building
and testing the source code. Since the README is often the first
thing people see when looking at a repository, this information
should be front and centre so that newcomers to the project can
get set up as quickly as possible.

f131c609

Update README.md · 8785b797
Ian Gilfillan authored Apr 08, 2024

8785b797

10 Apr, 2024 1 commit

MDEV-32458 ASAN unknown-crash in Inet6::ascii_to_fbt when casting character string to inet6 · 37fd497c

Alexander Barkov authored Apr 10, 2024

The condition checked the value of the leftmost byte before checking whether
at least one byte was still available in the buffer.
Change the order in the condition: check for byte availability before
checking the byte value.

37fd497c
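
The ordering issue fixed in MDEV-32458 has a direct analogue in any language with short-circuit evaluation. The sketch below uses Python, where reading past the end raises `IndexError` instead of touching invalid memory; the function names are hypothetical and the `':'` test is just an example of a byte value check.

```python
def byte_is_colon_checked(buf: bytes, pos: int) -> bool:
    """Check availability first; the byte is read only if it exists."""
    return pos < len(buf) and buf[pos] == ord(":")


def byte_is_colon_unchecked(buf: bytes, pos: int) -> bool:
    """Wrong order: the byte is read before its availability is checked."""
    return buf[pos] == ord(":") and pos < len(buf)
```

With an empty buffer, the first function returns `False`, while the second raises `IndexError` because `buf[pos]` is evaluated before the bounds check.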