- 18 Oct, 2023 3 commits
-
-
Brandon Nesterenko authored
happy_xac is where the XA COMMIT completes before noticing the error signalled by the prior XAP; sad_xac is where the XA COMMIT notices the error signalled by the prior XAP and rolls back, leaving a dangling XAP.
-
Brandon Nesterenko authored
rpl.rpl_xa_prepare_gtid_fail would sporadically fail, as an XA COMMIT running concurrently with a failed prepare could sometimes complete successfully, or be rolled back due to failure of the prior commit. The test expected the completion case, but if it failed, the XA transaction would be left in a prepared state with a lock on its table. This then created a lock timeout during test cleanup, as it tried to drop that table. The fix is to extend the test to check if the transaction is still prepared, and silently commit it if so, thus releasing the locks. The non-determinism is fine (i.e. DEBUG_SYNC isn't needed to force one path), as the verification performed by the test has already completed. This is just for cleanup.
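A minimal sketch of that cleanup idea in plain SQL (the xid and table name are made up for illustration): if the transaction is still in the prepared state after the race, commit it so its locks are released before the table is dropped.

    XA RECOVER;          -- lists any transaction left in prepared state
    XA COMMIT 'xid1';    -- hypothetical xid; releases the locks it still holds
    DROP TABLE t1;       -- cleanup can now proceed without a lock wait timeout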
-
Andrei authored
The XA-Prepare group of events

    XA START xid ... XA END xid XA PREPARE xid

and its XA-"complete" terminator

    XA COMMIT or XA ROLLBACK

are now distributed Round-Robin across the slave parallel workers. The former hash-based policy was proven to contribute to execution latency by creating a big queue of binlog-ordered transactions to commit, many times larger than the size of the worker pool.

Acronyms and notations used below:
    XAP := XA-Prepare event, or the whole prepared XA group of events
    XAC := XA-"complete", which is a solitary group of events
    |W| := the size of the slave worker pool
Subscripts like `_k' denote order in a corresponding sequence (e.g. a binlog file).

KEY CHANGES:

The parallel slave
------------------
The driver thread now maintains a list of XAP:s currently in processing. Its purpose is to avoid "wild" parallel execution of XA:s with duplicate xids (unlikely, but that's the user's right). The list is arranged as a sliding window of size 2*|W| to account for the possibility of XAP_k -> XAP_{k+2|W|-1}, the largest dependency in the group-of-events count sense.

Say k=1 and |W|, the number of Workers, is 4. As transactions are distributed Round-Robin, it's possible to have T^*_1 -> T^*_8 as the largest dependency at runtime ('*' marks the dependents). It can be seen from the worker queues, as in the picture below, where the Q_i worker queues develop downward:

    Q1   Q2   Q3   Q4
    1^*  2    3    4
    5    6    7    8^*

Worker #1 has been assigned T_1 and T_5. Worker #4 can take on its T_8 while T_1 is still at the beginning of its processing, so even before the XA START of that XAP.

XA related
----------
XID_cache_element is extended with two pointers to resolve two types of dependencies: the duplicate-xid dependency XAP_k -> XAP_{k+i} and the ordinary completion-on-the-prepare dependency XAP_k -> XAC_{k+j}. The former is handled by a wait-for-xid protocol conducted by xid_cache_delete() and xid_cache_insert_maybe_wait(). The latter is handled analogously by xid_cache_search_maybe_wait() and slave_applier_reset_xa_trans().

XA-"complete" events are allowed to go forward before their XAP parent has released the xid (all recovery concerns are covered in MDEV-21496, MDEV-21777). Yet the XAC is going to wait for it at a critical point of execution, which is at "completing" the work in the Engine.

CAVEAT: the storage/innobase/trx/trx0undo.cc changes are due to possibly fixed MDEV-32144, TODO: to be verified.

Thanks to Brandon Nesterenko at mariadb.com for initial review and a lot of creative efforts to advance with this work!
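For readers less familiar with the notation, a plain-SQL sketch of the two groups an applier sees (the xid and payload statement are made up): the XAP group ends with XA PREPARE and may be handled by one worker, while its solitary XAC terminator may be assigned to another.

    XA START 'xid1';
    INSERT INTO t1 VALUES (1);   -- hypothetical payload of the prepared branch
    XA END 'xid1';
    XA PREPARE 'xid1';           -- end of the XAP group
    XA COMMIT 'xid1';            -- the XAC, a solitary group of its own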
-
- 16 Oct, 2023 2 commits
-
-
Thirunarayanan Balathandayuthapani authored
- InnoDB fails to check the overflow buffer while applying the operation to the table that was rebuilt. This is caused by commit 3cef4f8f (MDEV-515).
-
Monty authored
-
- 14 Oct, 2023 2 commits
-
-
Monty authored
Fixed missing initialization of Alter_info(). This could cause crashes in some CREATE TABLE ... LIKE scenarios where some generated indexes were automatically dropped. I also added a test that verifies we do not try to drop from index_stats for temporary tables.
-
Monty authored
The intention was always to not create histograms for single-value unique keys (as a histogram is not useful in this case), but because of a bug in the code this was still done. The changes in the test cases are mainly because hist_size is now NULL for these kinds of columns.
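A hedged illustration of the visible effect (table and data are made up): after ANALYZE with persistent statistics, the single-column unique key gets no histogram, so its hist_size is NULL.

    CREATE TABLE t1 (id INT PRIMARY KEY, val INT);
    INSERT INTO t1 VALUES (1,10),(2,10),(3,20);
    ANALYZE TABLE t1 PERSISTENT FOR ALL;
    SELECT column_name, hist_size FROM mysql.column_stats
      WHERE db_name='test' AND table_name='t1';   -- expect NULL hist_size for `id`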
-
- 13 Oct, 2023 2 commits
-
-
Vlad Lesin authored
MDEV-32272 lock_release_on_prepare_try() does not release lock if supremum bit is set along with other bits set in lock's bitmap

The error is caused by the MDEV-30165 fix in the following commit: d13a57ae

There is a logical error in lock_release_on_prepare_try():

    if (supremum_bit)
      lock_rec_unlock_supremum(*cell, lock);
    else
      lock_rec_dequeue_from_page(lock, false);

There can be other bits set in the lock's bitmap, and the lock type can be suitable for the releasing criteria, but the above logic releases only the supremum bit of the lock.

The fix is to release the lock if it suits the releasing criteria, and otherwise to unlock the supremum if the supremum is locked.

There is also a test for the case reported by the QA team. I placed it in a separate file, because it requires a debug build.

Reviewed by: Marko Mäkelä
-
Thirunarayanan Balathandayuthapani authored
MDEV-31098 InnoDB Recovery doesn't display an encryption message when no encryption configuration is passed

- InnoDB fails to report the error when no encryption configuration was passed. This patch addresses the issue by adding the error report while loading the tablespace and deferring the tablespace creation.
-
- 10 Oct, 2023 4 commits
-
-
Monty authored
The problem was that RANGE_OPT_PARAM was not completely initialized in some cases. Added bzero() to ensure that all elements are always initialized.
-
Monty authored
-
Monty authored
Fixed hang when renaming an index to its original name
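A hedged reproduction shape, assuming the hang was triggered by renaming an index back to a name it previously had (table and index names are made up):

    CREATE TABLE t1 (a INT, KEY k1 (a));
    ALTER TABLE t1 RENAME INDEX k1 TO k2;
    ALTER TABLE t1 RENAME INDEX k2 TO k1;   -- rename back to the original name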
-
Monty authored
Use Dummy_error_handler in open_stat_tables() to ignore all errors when opening statistics tables.
-
- 08 Oct, 2023 1 commit
-
-
Otto Kekalainen authored
In commit 5ea5291d @sanja-byelkin for an unknown reason switched the file mode of 3 Galera tzinfo related test files from 644 -> 755. This exists only from branch 10.6 onward:

    $ git checkout 10.5
    $ find mysql-test -executable -name *.test -or -executable -name *.result
    (no results)

    $ git checkout 10.6
    $ find mysql-test -executable -name *.test -or -executable -name *.result
    mysql-test/suite/galera/t/mysql_tzinfo_to_sql.test
    mysql-test/suite/galera/t/mariadb_tzinfo_to_sql.test
    mysql-test/suite/galera/r/mariadb_tzinfo_to_sql.result

No test file nor test result file should be executable, so run chmod -x on them.

All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.
-
- 06 Oct, 2023 4 commits
-
-
Marko Mäkelä authored
-
Marko Mäkelä authored
log_t::create(): Return whether the initialisation succeeded. It may fail if too large an innodb_log_buffer_size is specified.
-
Marko Mäkelä authored
copy_back(): Also copy the dummy empty ib_logfile0 so that MariaDB Server 10.8 or later can be started after --copy-back or --move-back. Thanks to Daniel Black for reporting this. This is a 10.5 version of commit ebf36492
-
Marko Mäkelä authored
Table_cache_instance::operator new[](size_t): Reverted the changes that were made in commit 8edef482 and moved them to the only caller.
-
- 04 Oct, 2023 2 commits
-
-
Vladislav Vaintroub authored
Use an std::atomic_flag to track thread creation in progress. This is mainly a cleanup; the effect of this change was not measurable in my tests.
-
Vladislav Vaintroub authored
Add threadpool functionality to restrict concurrency during "batch" periods (where tasks are added in rapid succession). This will throttle thread creation more aggressively than usual, while keeping performance at least on par. One of these cases is buffer pool load, where async read IOs are executed without any throttling. There can be as many as 650K read IOs for loading a 10GB buffer pool. Another one is recovery, where "fake read" IOs are executed. Why are there more threads than we expect? Worker threads are not recognized as idle until they return to the standby list, and to return to that list they need to acquire a mutex that is currently held in submit_task(). In those cases, submit_task() has no worker to wake, and would create threads until the default concurrency level (2*ncpus) is satisfied. Only after that would throttling happen.
-
- 03 Oct, 2023 9 commits
-
-
Michael Widenius authored
The problem was that we did not handle errors properly in JOIN::get_best_combination. In case of an early error, JOIN->join_tab would contain uninitialized values, which would cause errors on cleanup(). The error in question was reported earlier, but not noticed until later. One cause of this is that most of the sql_select.cc code just checks thd->fatal_error and not thd->is_error(). Fixed by changing the checks of fatal_error to is_error().
-
Monty authored
This allows a user to change the default value of MAX_SEL_ARGS (16000) in the rare case where they need more generated SEL_ARGs (as part of the range optimizer).
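A hedged usage sketch: the variable name below is an assumption on my part (a user-settable limit corresponding to the MAX_SEL_ARGS default described above), not confirmed by this message.

    -- assumed variable name; raise the limit only if the optimizer runs out of
    -- SEL_ARGs for a very complex range condition
    SET SESSION optimizer_max_sel_args = 64000;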
-
Monty authored
Raise notes if indexes cannot be used:
- in case of data type or collation mismatch (different error messages).
- in case a table field was replaced with something else (e.g. Item_func_conv_charset) during a condition rewrite.

Added an option to write warnings and notes to the slow query log for slow queries.

New variables added/changed:
- note_verbosity, which is a set of the following options:
    basic         - All old notes
    unusable_keys - Print warnings about keys that cannot be used for select, delete or update.
    explain       - Print unusable_keys warnings for EXPLAIN queries.
  The default is 'basic,explain'. This means that for old installations the only notable new behavior is that one will get notes about unusable keys when one does an EXPLAIN for a query. One can turn off all notes by either setting note_verbosity to "" or setting sql_notes=0.
- log_slow_verbosity has a new option 'warnings'. If this is set, then warnings and notes generated are printed in the slow query log (up to log_slow_max_warnings times per statement).
- log_slow_max_warnings - Max number of warnings written to the slow query log.

Other things:
- One can now use =ALL for any 'set' variable to set all options at once. For example, using "note_verbosity=ALL" in a config file or "SET @@note_verbosity=ALL" in SQL.
- mysqldump will in the future use @@note_verbosity="" instead of @@sql_notes=0 to disable notes.
- Added "enum class Data_type_compatibility" and changed the return type of all Field::can_optimize*() methods from "bool" to this new data type.

Reviewer & Co-author: Alexander Barkov <bar@mariadb.com>
- The code that prints out the notes comes mainly from Alexander
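A hedged usage sketch of the new settings described above (the value combinations and query are illustrative only):

    SET SESSION note_verbosity='basic,unusable_keys,explain';   -- or note_verbosity=ALL
    SET GLOBAL  log_slow_verbosity='warnings';    -- also write notes/warnings to the slow log
    SET GLOBAL  log_slow_max_warnings=10;         -- cap the warnings logged per statement
    EXPLAIN SELECT * FROM t1 WHERE char_col=1;    -- may now raise a note about an unusable key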
-
Monty authored
The warning is given if the table is not found or if there is a lock timeout. The warning is needed because, in case of a lock timeout, the persistent table stats are going to be wrong.
-
Monty authored
-
Monty authored
Updated ha_mroonga::storage_check_if_supported_inplace_alter to support new ALTER TABLE flags. This fixes the failing tests:
    mroonga/storage.alter_table_add_index_unique_duplicated
    mroonga/storage.alter_table_add_index_unique_multiple_column_duplicated
-
Monty authored
- hostnames in hostname_cache added
- Some Galera (WSREP) allocations
- Table caches
-
Monty authored
This makes it easier to see how much memory the MariaDB server has allocated (for all memory allocations that go through mysys).
-
Monty authored
Example of what causes the problem:

T1: ANALYZE TABLE starts to collect statistics
T2: ALTER TABLE starts by deleting statistics for all changed fields, then creates a temp table and copies data to it.
T1: ANALYZE ends and writes to the statistics tables.
T2: ALTER TABLE renames the temp table in place of the old table.

Now the statistics from ANALYZE match the old, deleted table.

Fixed by waiting to delete old statistics until ALTER TABLE is the only one using the old table, and by ensuring that rename of columns can handle swapping of column names.

rename_columns_in_stat_table() (former rename_column_in_stat_tables()) now takes a list of columns to rename. It uses the following algorithm to update column_stats to be able to handle circular renames:

- While there are columns to be renamed, and it is the first loop or the last rename loop did change something:
  - Loop over all columns to be renamed
    - Change the column name in column_stat
    - If this fails because of a duplicate key:
      - If this is the first change attempt for this column:
        - Change the column name to a temporary column name
        - If there was a conflicting row, replace it with the current row.
      - else:
        - Remove the entry from the column list
- Loop over all remaining columns in the list
  - Remove the conflicting row
  - Change the column from the temporary name to the final name in column_stat

Other things:
- Don't flush tables for every operation. Only flush when all updates are done.
- Rename of columns was not handled in case of ALGORITHM=copy (old bug).
- Fixed that we do not collect statistics for hidden hash columns used by UNIQUE constraints on long values.
- Fixed that we do not collect statistics for blob columns referred to by generated virtual columns. This was achieved by storing the fields for which we want to have statistics in table->has_value_set instead of in table->read_set.
- Rename of indexes was not handled for persistent statistics. This is now handled similarly to rename of columns. Renamed indexes are now stored in 'rename_stat_indexes' and handled in Alter_info::delete_statistics() together with dropped indexes.
- ALTER TABLE .. ADD INDEX may, instead of creating a new index, rename an existing generated foreign key index. This was not reflected in the index_stats table because it was handled in mysql_prepare_create_table() instead of in the mysql_alter() code. Fixed by adding a call in mysql_prepare_create_table() to drop the changed index. I also had to change the code that 'marked the index' to be ignored with code that would not destroy the original index name.

Reviewer: Sergei Petrunia <sergey@mariadb.com>
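A hedged illustration of the column-name swap the rename algorithm above must support in mysql.column_stats (table and column names are made up):

    CREATE TABLE t1 (a INT, b INT);
    ANALYZE TABLE t1 PERSISTENT FOR ALL;    -- populate mysql.column_stats
    ALTER TABLE t1 RENAME COLUMN a TO b, RENAME COLUMN b TO a;
    -- circular rename (swap); the stats rename must complete without duplicate-key errors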
-
- 29 Sep, 2023 1 commit
-
-
Andrei authored
In `pseudo_slave_mode=1`, aka "pseudo-slave" mode, any prepared XA transaction disconnects from the user session, as if the user connection drops. The xid of such a transaction remains in the server, and should the prepared transaction be read-only, it is marked as such. The marking makes sure that the following termination of the read-only transaction ends up with ER_XA_RBROLLBACK. This did not actually take place for `pseudo_slave_mode=1` read-only transactions. Fixed by checking the read-only status of a prepared transaction at the time it disconnects from the `pseudo_slave_mode=1` session, and marking its xid when that is the case.
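A hedged sketch of the scenario (xid and table are made up): a read-only XA branch prepared under pseudo_slave_mode, whose later termination is expected to report ER_XA_RBROLLBACK.

    SET pseudo_slave_mode=1;
    XA START 'x1';
    SELECT * FROM t1;      -- read-only work only
    XA END 'x1';
    XA PREPARE 'x1';       -- detaches from this session under pseudo_slave_mode
    -- a later XA COMMIT 'x1' or XA ROLLBACK 'x1' should now end with ER_XA_RBROLLBACK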
-
- 27 Sep, 2023 2 commits
-
-
Vladislav Vaintroub authored
Fix test
-
Vladislav Vaintroub authored
It is enough to do it just once, during the connection phase.
-
- 25 Sep, 2023 3 commits
-
-
Jan Lindström authored
MDEV-30217 : Assertion `mode_ == m_local || transaction_.is_streaming()' failed in int wsrep::client_state::bf_abort(wsrep::seqno)

The problem was that a brute force (BF) thread requested a conflicting lock and was trying to kill the victim transaction, but this victim was also a brute force thread. However, this victim was not actually holding the conflicting lock; instead both the brute force transaction and the victim transaction had insert intention locks. We should not kill a brute force victim transaction if the requesting lock does not need to wait.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
-
Yuchen Pei authored
-
Yuchen Pei authored
Spider is part of the server, and there's no need to check the version. All spider plugins are uninstalled in clean_up_spider.inc. DROP SERVER IF EXISTS makes things easier.
-
- 24 Sep, 2023 1 commit
-
-
Igor Babaev authored
Memory for type holders of the columns of a table value constructor must be allocated only once. Approved by Oleksandr Byelkin <sanja@mariadb.com>
-
- 22 Sep, 2023 4 commits
-
-
Vladislav Vaintroub authored
is_file_on_ssd() is more expensive than it should be. It caches the results by volume name, but still calls GetVolumePathName() every time, which, as procmon shows, opens multiple directories in the filesystem hierarchy (db directory, datadir, and all ancestors). The fix is to cache the SSD status by volume serial ID, which is cheap to retrieve with GetFileInformationByHandleEx().
-
Vlad Lesin authored
-
Oleksandr Byelkin authored
The counter is global, so we do not need to add the backup to it if we do not zero it after taking the backup.
-
Oleksandr Byelkin authored
Fix row counters to be able to get any possible value.
-