Commits · ee974ca5e0f00e928873179c5dfc2f040ab5aefd · nexedi / MariaDB

19 Jun, 2024 4 commits

MDEV-31658 : Deadlock found when trying to get lock during applying · ee974ca5

Jan Lindström authored May 13, 2024

The problem was that there were two non-conflicting local idle
transactions in node_1 that each inserted a key into the primary key.
Then two transactions from other nodes also inserted
a key into the primary key, so that the insert from node_2 conflicted
with one of the local transactions in node_1: there would
be a duplicate key if both were committed. For this, the insert
from the other node tries to acquire an S-lock on this record,
and because this insert is a high priority brute force (BF)
transaction, it will kill the idle local transaction.

Concurrently, the second insert from node_3 conflicts with the second
idle insert transaction in node_1. Again, it tries to acquire an
S-lock on this record and kills the idle local transaction.

At this point we have two non-conflicting high priority
transactions holding S-lock on different records in node_1.
For example like this: rec s-lock-node2-rec s-lock-node3-rec rec.

Because these high priority BF-transactions do not wait for
each other, the insert from node3, which has a later seqno than
the insert from node2, can continue. It will try to acquire an
insert intention lock for the record it tries to insert (to avoid
a duplicate key being inserted by a local transaction). However,
it will note that there is a conflicting S-lock in the same gap
between records. This will lead to a deadlock error, as we have
defined that BF-transactions may not wait for a record lock,
but we can't kill the conflicting BF-transaction because
it has a lower seqno and it should commit first.

BF-transactions are executed concurrently because their
values to primary key are different i.e. they do not
conflict.

Galera certification will make sure that inserts from
other nodes, i.e. these high priority BF-transactions,
can't insert duplicate keys. Local transactions naturally
can, but they will be killed when a BF-transaction
acquires the required record locks.

Therefore, we can allow a situation where there is a conflicting
S-lock and insert intention lock regardless of their seqno
order and let both continue with no wait. This leads
to a situation where we need to allow a BF-transaction
to wait when lock_rec_has_to_wait_in_queue is called,
because this function is also called from
lock_rec_queue_validate, and because the lock is waiting
there would otherwise be an assertion failure in ut_a(lock->is_gap()
|| lock_rec_has_to_wait_in_queue(cell, lock));

lock_wait_wsrep_kill
  Add debug sync points for BF-transactions killing
  local transaction.

wsrep_assert_no_bf_bf_wait
  Print also requested lock information

lock_rec_has_to_wait
  Add function to handle wsrep transaction lock wait
  cases.

lock_rec_has_to_wait_wsrep
  New function to handle wsrep transaction lock wait
  exceptions.

lock_rec_has_to_wait_in_queue
  Remove wsrep exception, in this function all
  conflicting locks need to wait in queue.
  Conflicts between BF and local transactions
  are handled in lock_wait.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>

ee974ca5

MDEV-34178: Enable spinloop for index_lock · 5b26a076

Marko Mäkelä authored Jun 19, 2024

In an I/O bound concurrent INSERT test conducted by Mark Callaghan,
spin loops on dict_index_t::lock turn out to be beneficial.

This is a mixed bag; enabling the spin loops will improve throughput
and latency on some workloads and degrade them on others.

Reviewed by: Debarun Banerjee
Tested by: Matthias Leich
Performance tested by: Axel Schwenke

5b26a076

MDEV-34178: Improve the spin loops · f8d213bd

Marko Mäkelä authored Jun 19, 2024

srw_mutex_impl<spinloop>::wait_and_lock(): Invoke srw_pause() and
reload the lock word on each loop. Thanks to Mark Callaghan for
suggesting this.
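
The reload-on-every-iteration pattern described for wait_and_lock() can be sketched roughly as follows. This is only an illustration, not the InnoDB code: toy_pause(), toy_spin_acquire(), toy_release() and the 0/1 encoding of the lock word are hypothetical names and assumptions made for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for a CPU pause hint such as srw_pause().
inline void toy_pause() { std::this_thread::yield(); }

// Toy lock word: 0 = free, 1 = held (an assumption of this sketch).
// Spins for at most max_spins rounds; returns true if the lock was acquired.
inline bool toy_spin_acquire(std::atomic<uint32_t> &word, unsigned max_spins)
{
  for (unsigned i = 0; i < max_spins; i++)
  {
    // Reload the lock word on every round instead of reusing a stale value.
    uint32_t lk = word.load(std::memory_order_relaxed);
    if (lk == 0 &&
        word.compare_exchange_strong(lk, 1, std::memory_order_acquire,
                                     std::memory_order_relaxed))
      return true;
    toy_pause();
  }
  return false; // a real lock would now block on a wait queue
}

// Release the toy lock; the caller must hold it.
inline void toy_release(std::atomic<uint32_t> &word)
{
  assert(word.load(std::memory_order_relaxed) == 1);
  word.store(0, std::memory_order_release);
}
```

When the spin budget is exhausted the sketch simply gives up; a real implementation would fall back to a blocking wait at that point.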

ssux_lock_impl<spinloop>::rd_wait(): Actually implement a spin loop
on the rw-lock component without blocking on the mutex component.
If there is a conflict with wr_lock(), wait for writer.lock to be
released without actually acquiring it.

Reviewed by: Debarun Banerjee
Tested by: Matthias Leich

f8d213bd

MDEV-34178: Improve PERFORMANCE_SCHEMA instrumentation · 6cde03ae

Marko Mäkelä authored Jun 19, 2024

When MariaDB is built with PERFORMANCE_SCHEMA support enabled
and with futex-based rw-locks (not srw_lock_), we were unnecessarily
releasing and reacquiring lock.writer in srw_lock_impl::psi_wr_lock()
and ssux_lock::psi_wr_lock().

If there is a conflict with rd_lock(), let us hold the lock.writer
and execute u_wr_upgrade() to wait for rd_unlock().

Reviewed by: Debarun Banerjee
Tested by: Matthias Leich

6cde03ae

18 Jun, 2024 1 commit

MDEV-34178: Simplify the U lock · 2bd661ca

Marko Mäkelä authored Jun 18, 2024

The U lock mode of the sux_lock that was introduced in
commit 03ca6495 (MDEV-24142)
is unnecessarily complex.

Internally, sux_lock comprises two parts, each with their own wait queue
inside the operating system kernel: a mutex and a rw-lock.

We can map the operations as follows:

x_lock(): (X,X)
u_lock(): (X,_)
s_lock(): (_,S)

The Update lock mode, which is mutually exclusive with itself and with
X (exclusive) locks but not with shared (S) locks, was unnecessarily
acquiring a shared lock on the second component. The mutual exclusion
is guaranteed by the first component.
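
The (mutex, rw-lock) mapping above can be modelled with a small non-blocking toy class. This is a sketch under stated assumptions, not the sux_lock implementation: ToySuxLock and its try_* methods are hypothetical, there are no wait queues, and each component is represented by a single atomic variable.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Toy two-component lock offering only try_* operations (no waiting).
class ToySuxLock
{
  std::atomic<bool> mutex_{false}; // first component: the mutex
  std::atomic<int32_t> rw_{0};     // second component: -1 = X, n > 0 = n S holders
public:
  // s_lock(): (_,S) touches only the rw-lock component.
  bool try_s_lock()
  {
    int32_t cur = rw_.load();
    while (cur >= 0)
      if (rw_.compare_exchange_weak(cur, cur + 1))
        return true;
    return false;
  }
  void s_unlock()
  {
    assert(rw_.load() > 0);
    rw_.fetch_sub(1);
  }

  // u_lock(): (X,_) touches only the mutex component.
  bool try_u_lock()
  {
    bool expected = false;
    return mutex_.compare_exchange_strong(expected, true);
  }
  void u_unlock() { mutex_.store(false); }

  // x_lock(): (X,X) needs both components.
  bool try_x_lock()
  {
    if (!try_u_lock())
      return false;
    int32_t expected = 0;
    if (rw_.compare_exchange_strong(expected, -1))
      return true;
    u_unlock(); // S holders exist: back out of the mutex component
    return false;
  }
  void x_unlock()
  {
    rw_.store(0);
    u_unlock();
  }
};
```

In this model a U holder blocks other U and X requests through the mutex component alone, while S requests still succeed, which is the property the commit message relies on.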

We might simplify the #ifdef SUX_LOCK_GENERIC case further by omitting
srw_mutex_impl::lock, because it is kind-of duplicating the mutex
that we will use for having a wait queue. However, the predicate
buf_page_t::can_relocate() would depend on the predicate
is_locked_or_waiting(), which is not available for pthread_mutex_t.

Reviewed by: Debarun Banerjee
Tested by: Matthias Leich

2bd661ca

17 Jun, 2024 3 commits

MDEV-34014 mysql_upgrade failed · 83d3ed49

Alexander Barkov authored Jun 17, 2024

Adding a new statement into scripts/sys_schema/before_setup.sql:

  ALTER DATABASE sys CHARACTER SET utf8mb3 COLLATE utf8mb3_general_ci;

to fix db.opt in case:
- the database `sys` was altered to unexpected CHARACTER SET or COLLATE values
- or db.opt was erroneously removed

to make sure that sys objects are always recreated using utf8mb3_general_ci.

83d3ed49

MDEV-30651: Assertion `sel->quick' in make_range_rowid_filters · ef9e3e73

Sergei Petrunia authored Jun 11, 2024

(Variant for 10.6: return error code from SQL_SELECT::test_quick_select)
The optimizer deals with Rowid Filters this way:

1. First, range optimizer is invoked. It saves information
   about all potential range accesses.
2. A query plan is chosen. Suppose, it uses a Rowid Filter on
   index $IDX.
3. JOIN::make_range_rowid_filters() calls the range optimizer
   again to create a quick select on index $IDX which will be used
   to populate the rowid filter.

The problem: if a KILL command catches the query in step #3, the quick
select is not created, which causes a crash.

Fixed by checking whether the query was killed.

ef9e3e73

Merge 10.5 into 10.6 · e60acae6
Marko Mäkelä authored Jun 17, 2024

e60acae6

16 Jun, 2024 2 commits

Change mysqldump to use DO instead of 'SELECT' for storing sequences. · 956bcf8f
Monty authored Jun 15, 2024
```
This avoids a lot of SETVAL() results when applying a mysqldump with
sequences.
```
956bcf8f

MDEV-34406 Enhance mariadb_upgrade to print failing query in case of error · fef32fd9

Monty authored Jun 15, 2024

To make this possible, it was also necessary to enhance the mariadb
client with the option --print-query-on-error.
This option can also be very useful when running a batch of queries
through the mariadb client and one wants to find out where things go
wrong.

TODO: It would be good to enhance mariadb_upgrade to not call the mariadb
client for executing queries but instead do this internally.  This
would have made this patch much easier!

Reviewed by: Sergei Golubchik <serg@mariadb.com>

fef32fd9

14 Jun, 2024 2 commits

MDEV-34297 fixup: -Wconversion on 32-bit · 4b4c371f
Marko Mäkelä authored Jun 14, 2024

4b4c371f

MDEV-34381 During innodb_undo_truncate=ON recovery, InnoDB may fail to shrink undo* files · 3271588b

Thirunarayanan Balathandayuthapani authored Jun 14, 2024

- During recovery, InnoDB may fail to shrink the undo tablespaces
when there are no pages to recover while applying the redo log.
This issue exists only when innodb_undo_truncate is enabled.
trx_lists_init_at_db_start() could've applied the redo logs
for undo tablespace page0.

3271588b

13 Jun, 2024 3 commits
- Merge 10.5 into 10.6 · 32202c30
  Marko Mäkelä authored Jun 13, 2024
  
  32202c30
- MDEV-33840: Fix GCC -Wreorder · c849952b
  Marko Mäkelä authored Jun 13, 2024
```
This fixes up the merge commit 829cb1a4
```
  c849952b
- MDEV-33161 fixup: CMAKE_CXX_FLAGS=-DEXTRA_DEBUG · dd13243b
  Marko Mäkelä authored Jun 13, 2024
  
  dd13243b
12 Jun, 2024 2 commits

MDEV-34365: UBSAN runtime error: call to function io_callback(tpool::aiocb*) · d3a7e46b

Brandon Nesterenko authored Jun 11, 2024

On an UBSAN clang-15 build, if running with UBSAN option
halt_on_error=1 (the issue doesn't show up without it),
MTR fails during mysqld --bootstrap with UBSAN error:

call to function io_callback(tpool::aiocb*) through pointer to incorrect function type 'void (*)(void *)'

This patch corrects the parameter type of io_callback
to match its expected type defined by callback_func,
i.e. (void*).

Reviewed By:
============
<TODO>

d3a7e46b

Merge 10.5 into 10.6 · fc9005ad
Marko Mäkelä authored Jun 12, 2024

fc9005ad

11 Jun, 2024 1 commit

MDEV-33616 workaround libmariadb bug : mysql_errno = 0 on failed connection · f2eda615

Vladislav Vaintroub authored Jun 10, 2024

The bug can happen on macOS if the server closes the socket without sending
an error packet to the client. Closing the socket on the server side is legitimate,
and happens e.g. when a write timeout occurs, and perhaps also in other situations.

However, mysqltest is not prepared to handle mysql_errno 0, and erroneously
thinks the connection was successfully established.

The fix/workaround in mysqltest is to treat a client failure with
mysql_errno 0 the same as CR_SERVER_LOST (generic client-side
communication error).

The real fix in client library would ensure that mysql_errno is set
on errors.
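
The workaround amounts to a small mapping of the reported error number. The sketch below is illustrative only and is not the mysqltest code: toy_connect_errno() and TOY_CR_SERVER_LOST are hypothetical names; 2013 is the numeric value of CR_SERVER_LOST in the client library.

```cpp
// Numeric value of CR_SERVER_LOST in the client library headers.
constexpr unsigned TOY_CR_SERVER_LOST = 2013;

// Hypothetical helper: decide which error number to act on after a
// connection attempt. A failure that reports errno 0 is treated as a
// lost connection instead of being mistaken for success.
constexpr unsigned toy_connect_errno(bool connect_failed,
                                     unsigned reported_errno)
{
  return !connect_failed ? 0
         : reported_errno ? reported_errno
                          : TOY_CR_SERVER_LOST;
}
```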

f2eda615

10 Jun, 2024 9 commits

MDEV-34002 Initialise fields in spider_db_handler · d524cb5b
Yuchen Pei authored Jun 04, 2024
```
Otherwise it may result in nonsensical values like 190 for a boolean.
```
d524cb5b
fix the test for --view · 40dd5b86
Sergei Golubchik authored Jun 10, 2024

40dd5b86

MDEV-34129 mariadb-install-db appears to hang on macOS · 90d376e0

Dave Gosselin authored May 13, 2024

Immediately close down the signal handler loop when we decide to
break connections as it's the start of process termination
anyway, and there's no need to wait once we've invoked break_connections.

90d376e0

MDEV-34355: rpl.rpl_semi_sync_no_missed_ack_after_add_slave ‘server_3 should have sent…’ · fcd21d3e

Brandon Nesterenko authored Jun 10, 2024

The problem is that the test could query the status variable
Rpl_semi_sync_slave_send_ack before the slave actually updated it.
This would result in an immediate --die assertion killing the rest
of the test. The bottom of this commit message has a small patch
that can be applied to reproduce the test failure.

This patch fixes the test failure by waiting for the variable to be
updated before querying its value.

diff --git a/sql/semisync_slave.cc b/sql/semisync_slave.cc
index 9ddd4c5c8d7..60538079fce 100644
--- a/sql/semisync_slave.cc
+++ b/sql/semisync_slave.cc
@@ -303,7 +303,10 @@ int Repl_semi_sync_slave::slave_reply(Master_info *mi)
     reply_res= DBUG_EVALUATE_IF("semislave_failed_net_flush", 1,
                                 net_flush(net));
     if (!reply_res)
+    {
+      sleep(1);
       rpl_semi_sync_slave_send_ack++;
+    }
   }
   DBUG_RETURN(reply_res);
 }

fcd21d3e

mtr --skip-not-found did not skip suites · 3b80d23d

Alexander Barkov authored Jun 02, 2024

The --skip-not-found switch tells mtr to skip tests that are not found instead of aborting.
But it failed to skip the test if the suite name was not found.

This problem also made the *last-N-failed buildbot builders fail
to run `mtr --skip-not-found` if the last commit removed a file in
the mysql-test/include/ directory.

This commit fixes it. Now a test that is not found is properly skipped,
no matter what component of the test name was not found:

$ ./mtr main.foo --skip-not-found foo.main
...
==============================================================================
TEST                                  WORKER RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
foo.main                                 [ skipped ]  not found
main.foo                                 [ skipped ]  not found
--------------------------------------------------------------------------

3b80d23d

Merge 10.5 into 10.6 · 27834ebc
Marko Mäkelä authored Jun 10, 2024

27834ebc

MDEV-33161 Function pointer signature mismatch in LF_HASH · a2bd936c

Marko Mäkelä authored Jun 10, 2024

In cmake -DWITH_UBSAN=ON builds with clang but not with GCC,
-fsanitize=undefined will flag several runtime errors on
function pointer mismatch related to the lock-free hash table LF_HASH.

Let us use matching function signatures and remove function pointer
casts in order to avoid potential bugs due to undefined behaviour.

These errors could be caught at compilation time by
-Wcast-function-type-strict, which is available starting with clang-16,
but not available in any version of GCC as of now. The old GCC flag
-Wcast-function-type is enabled as part of -Wextra, but it specifically
does not catch these errors.

Reviewed by: Vladislav Vaintroub

a2bd936c

MDEV-34227 On startup: UBSAN: runtime error: applying non-zero offset in... · 246c0b3a

Alexander Barkov authored Jun 10, 2024

MDEV-34227 On startup: UBSAN: runtime error: applying non-zero offset in JOIN::make_aggr_tables_info in sql/sql_select.cc

Avoid undefined behaviour (applying offset to nullptr).
The reported scenario is covered in mysql-test/connect-no-db.test
No new tests needed.

246c0b3a

MDEV-32376 SHOW CREATE DATABASE statement crashes the server when db name... · 21f56583

Alexander Barkov authored Jun 10, 2024

MDEV-32376 SHOW CREATE DATABASE statement crashes the server when db name contains some unicode characters, ASAN stack-buffer-overflow

Adding the test for the length of lex->name into show_create_db().

Without this test, writes beyond the end of db_name_buff were possible
when the database name was too long.

21f56583

09 Jun, 2024 1 commit

MDEV-34237: On Startup: UBSAN: runtime error: call to function... · bf0aa99a

Brandon Nesterenko authored Jun 06, 2024

MDEV-34237: On Startup: UBSAN: runtime error: call to function MDL_lock::lf_hash_initializer lf_hash_insert through pointer to incorrect function type 'void (*)(st_lf_hash *, void *, const void *)'

A few different incorrect function type UBSAN issues have been
grouped into this patch.

The only real potentially undefined behavior is an error about
show_func_mutex_instances_lost, which when invoked in
sql_show.cc::show_status_array(), puts 5 arguments onto the stack;
however, the implementing function only actually has 3 parameters (so
only 3 would be popped). This was fixed by adding in the remaining
parameters to satisfy the type mysql_show_var_func.

The rest of the findings are pointer type mismatches that wouldn't
lead to actual undefined behavior. The lf_hash_initializer function
type definition is

typedef void (*lf_hash_initializer)(LF_HASH *hash, void *dst, const void *src);

but the MDL_lock and table cache's implementations of this function
do not have that signature. The MDL_lock has specific MDL object
parameters:

static void lf_hash_initializer(LF_HASH *hash __attribute__((unused)),
                                MDL_lock *lock, MDL_key *key_arg)

and the table cache has specific TDC parameters:

static void tdc_hash_initializer(LF_HASH *,
                                 TDC_element *element, LEX_STRING *key)

leading to UBSAN runtime errors when invoking these functions.

This patch fixes these type mis-matches by changing the
implementing functions to use void * and const void * for their
respective parameters, and later casting them to their expected
type in the function body.

Note too that the functions tdc_hash_key and tc_purge_callback had
a similar problem to tdc_hash_initializer and were fixed
similarly.
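
The shape of the fix can be illustrated with stand-in types. Everything named Toy* or toy_* below is hypothetical and exists only for this sketch; the point is that the implementing function matches the callback typedef exactly and performs the casts in its body.

```cpp
#include <cassert>

// Hypothetical stand-ins for the hash, key and element types.
struct ToyHash {};
struct ToyKey { int id; };
struct ToyElement { int id; bool initialized; };

// Generic callback type, shaped like the lf_hash_initializer typedef.
typedef void (*toy_initializer)(ToyHash *hash, void *dst, const void *src);

// Matches toy_initializer exactly; the casts happen inside the body.
static void toy_element_init(ToyHash *, void *dst, const void *src)
{
  assert(dst && src);
  ToyElement *element= static_cast<ToyElement*>(dst);
  const ToyKey *key= static_cast<const ToyKey*>(src);
  element->id= key->id;
  element->initialized= true;
}

// Generic caller that only knows about the typedef.
inline void toy_insert(ToyHash *hash, toy_initializer init,
                       void *dst, const void *src)
{
  init(hash, dst, src);
}
```

Because toy_element_init has the same type as the pointer it is called through, no function pointer cast is needed at the call site.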

Reviewed By:
============
Sergei Golubchik <serg@mariadb.com>

bf0aa99a

07 Jun, 2024 8 commits

MDEV-34269: post-fix code simplification · 0d85c905

Julius Goryavsky authored Jun 07, 2024

The code is slightly simplified taking into account
the fact that partition_ht() always returns a normal
hton when there is no partitioning.

0d85c905

MDEV-34269 : 10.11.8 cluster becomes inconsistent when using composite primary key and partitioning · 01728879

Jan Lindström authored Jun 06, 2024

This is a regression from commit 3228c08f. The problem is that
when the table storage engine is determined, there should be a
check whether the table is partitioned, and if it is, the storage
engine implementing the partitions should be determined.

The reported bug is reproducible only with --log-bin, so make
sure the tests changed by 3228c08f and the new test are run
both with --log-bin and with the binlog disabled.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>

01728879

MDEV-34266 safe_strcpy() includes an unnecessary conditional branch · e255837e

Marko Mäkelä authored May 30, 2024

The strncpy() wrapper that was introduced in
commit 567b6812
is checking whether the output was truncated even in cases
where the caller does not care about it.

Let us introduce a separate function safe_strcpy_truncated() that
indicates whether the output was truncated.
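
The split between a plain copy and a truncation-reporting copy might look roughly like this. The sketch is an assumption about the general shape only and does not reproduce the MariaDB functions: toy_safe_strcpy() and toy_safe_strcpy_truncated() are hypothetical names and signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy src into dst, always NUL-terminating; truncation is not reported.
inline void toy_safe_strcpy(char *dst, size_t dst_size, const char *src)
{
  assert(dst_size > 0);
  strncpy(dst, src, dst_size - 1);
  dst[dst_size - 1]= '\0';
}

// Same copy, but returns true if src did not fit into dst.
inline bool toy_safe_strcpy_truncated(char *dst, size_t dst_size,
                                      const char *src)
{
  assert(dst_size > 0);
  // strncpy() pads with NUL bytes when src is shorter than dst_size, so a
  // non-NUL last byte means that src filled (or overflowed) the buffer.
  strncpy(dst, src, dst_size);
  const bool truncated= dst[dst_size - 1] != '\0';
  dst[dst_size - 1]= '\0';
  return truncated;
}
```

Callers that do not care about truncation use the first function and avoid the extra check.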

e255837e

MDEV-34169 Don't allow innodb_open_files to be lesser than · 4b4dbb23

Thirunarayanan Balathandayuthapani authored Jun 07, 2024

              number of non-user tablespace.

fil_space_t::try_to_close(): Don't try to close
the tablespace which is acquired by the caller of
the function

Added the suppression message in open_files_limit test case

4b4dbb23

MDEV-34203 Sandbox mode \- is not compatible with --binary-mode · 77c4c0f2
Oleksandr Byelkin authored Jun 07, 2024
```
"Process" sandbox short command put by mysqldump to avoid an error.
```
77c4c0f2

MDEV-34321: call to crc32c_3way through pointer to incorrect function type · d9d0e8fd

Marko Mäkelä authored Jun 07, 2024

In commit 9ec7819c the CRC-32 function
signatures had been unified somewhat, but not enough.

clang -fsanitize=undefined would flag a function pointer signature
mismatch between const char* and const void*, but not between
uint32_t and unsigned. We try to fix both inconsistencies anyway.

Reviewed by: Vladislav Vaintroub

d9d0e8fd

MDEV-34169 Don't allow innodb_open_files to be lesser than · b7a75fbb

Thirunarayanan Balathandayuthapani authored Jun 07, 2024

		number of non-user tablespace.

- InnoDB only closes user tablespaces when the number of open
files exceeds the innodb_open_files limit. In that case, InnoDB should
make sure that the innodb_open_files value is greater than the
number of undo tablespace, system tablespace and temporary tablespace files.

b7a75fbb

Merge 10.5 into 10.6 · a687cf86
Marko Mäkelä authored Jun 07, 2024

a687cf86

06 Jun, 2024 4 commits

MDEV-32158: wsrep_sst_mariabackup use /tmp dir during SST rather than user defined tmpdir · 238798d9

Julius Goryavsky authored Jun 06, 2024

wsrep_sst_mariabackup should use the tmpdir defined by
the user under the '[mysqld]' section of the configuration
file rather than the default '/tmp' directory.

238798d9

galera: wsrep-lib submodule update · 654f6ece
Julius Goryavsky authored Jun 06, 2024

654f6ece

mtr: change the default setting for the port group size parameter · c2d97620

Julius Goryavsky authored Jun 06, 2024

Some galera tests start 6 galera nodes. Each galera node requires
three ports: 6*3 = 18. Plus 6 ports are needed for 6 mariadbd servers.
Since the number of ports is rounded up to a multiple of 10 everywhere in mtr, we
will take 30 as the default value for the port group size parameter.
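
The arithmetic behind the value 30 can be checked at compile time. The helper names toy_round_up() and toy_ports_needed() are hypothetical and only restate the calculation from the message above.

```cpp
// Round n up to the next multiple of step (step > 0).
constexpr unsigned toy_round_up(unsigned n, unsigned step)
{
  return (n + step - 1) / step * step;
}

// Ports for a test with `nodes` Galera nodes: galera_ports per node,
// plus one port per mariadbd server.
constexpr unsigned toy_ports_needed(unsigned nodes, unsigned galera_ports)
{
  return nodes * galera_ports + nodes;
}

static_assert(toy_ports_needed(6, 3) == 24, "6*3 + 6 = 24");
static_assert(toy_round_up(toy_ports_needed(6, 3), 10) == 30,
              "24 rounds up to 30");
```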

c2d97620

MDEV-33523 Spurious deadlock error when wsrep_on=OFF · c1dc0397

Daniele Sciascia authored Mar 28, 2024

Avoid starting transactions in wsrep-lib side when wsrep is
disabled. It is unnecessary, and causes spurious deadlock errors on
transaction clean up.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>

c1dc0397