Commits · d039346a7acac7c72f264377a8cd6b0273c548df · nexedi / MariaDB

27 Jan, 2024 1 commit

MDEV-4991: GTID binlog indexing · d039346a

Kristian Nielsen authored Sep 08, 2023

Improve the performance of slave connect using B+-Tree indexes on each binlog
file. The index allows fast lookup of a GTID position to the corresponding
offset in the binlog file, as well as lookup of a position to find the
corresponding GTID position.

This eliminates a costly sequential scan of the starting binlog file
to find the GTID starting position when a slave connects. This is
especially costly if the binlog file is not cached in memory (IO
cost), or if it is encrypted or a lot of slaves connect simultaneously
(CPU cost).

The size of the index files is generally less than 1% of the binlog data, so
it is not expected to be an issue.

Most of the work writing the index is done as a background task, in
the binlog background thread. This minimises the performance impact on
transaction commit. A simple global mutex is used to protect index
reads and (background) index writes; this is fine as slave connect is
a relatively infrequent operation.

Here are the user-visible options and status variables. The feature is on by
default and is expected to need no tuning or configuration for most users.

binlog_gtid_index
  On by default. Can be used to disable the indexes for testing purposes.

binlog_gtid_index_page_size (default 4096)
  Page size to use for the binlog GTID index. This is the size of the nodes
  in the B+-tree used internally in the index. A very small page size (64 is
  the minimum) will be less efficient, but can be used to stress the
  B+-tree code during testing.

binlog_gtid_index_span_min (default 65536)
  Control sparseness of the binlog GTID index. If set to N, at most one
  index record will be added for every N bytes of binlog file written.
  This can be used to reduce the number of records in the index, at
  the cost only of having to scan a few more events in the binlog file
  before finding the target position.
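
As a rough illustration of how binlog_gtid_index_span_min trades index size against scan length, here is a minimal Python sketch. It is not the server's implementation: a sorted list with binary search stands in for the B+-tree, a single increasing sequence number stands in for a full GTID position, and the names SparseOffsetIndex, note_event and lookup are hypothetical.

```python
import bisect


class SparseOffsetIndex:
    """Toy sparse index: at most one record per `span_min` bytes of binlog."""

    def __init__(self, span_min=65536):
        self.span_min = span_min
        self.positions = []        # sequence numbers of indexed events, ascending
        self.offsets = []          # binlog file offset recorded for each of them
        self._last_indexed = None  # offset of the most recent index record

    def note_event(self, offset, seq_no):
        # Add a record only if at least span_min bytes were written
        # since the previous record (the first event is always indexed).
        if self._last_indexed is None or offset - self._last_indexed >= self.span_min:
            self.positions.append(seq_no)
            self.offsets.append(offset)
            self._last_indexed = offset

    def lookup(self, seq_no):
        # Find the last indexed record at or before the target position.
        # The caller then scans forward from the returned offset.
        i = bisect.bisect_right(self.positions, seq_no) - 1
        if i < 0:
            return None            # nothing usable: fall back to a full scan
        return self.offsets[i]


idx = SparseOffsetIndex(span_min=100)
for n in range(10):
    idx.note_event(offset=n * 40, seq_no=n + 1)   # ten events, 40 bytes apart
print(idx.offsets)      # only some of the ten events were indexed
print(idx.lookup(6))    # offset from which to start scanning for event 6
```

A larger span_min leaves fewer records in the index, and the forward scan after lookup() covers correspondingly more events.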

Two status variables are available to monitor the use of the GTID indexes:

  Binlog_gtid_index_hit
  Binlog_gtid_index_miss

The "hit" status increments for each successful lookup in a GTID index.
The "miss" increments when a lookup is not possible. This indicates that the
index file is missing (e.g. binlog written by an old server version
without GTID index support), or corrupt.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>

d039346a

24 Jan, 2024 1 commit
- MDEV-28861 Deprecate spider table options by comment/connection · 20741b92
  Yuchen Pei authored Jan 24, 2024
```
Also deprecating table params not implemented in MDEV-28856.
```
  20741b92
23 Jan, 2024 1 commit
- Update 11.4 HELP · d0790860
  Ian Gilfillan authored Jan 23, 2024
  
  d0790860
22 Jan, 2024 1 commit

MDEV-7850: Extend GTID Binlog Events with Thread Id · c37b2087

Brandon Nesterenko authored Jul 10, 2023

This patch augments Gtid_log_event with the user thread-id.
In particular that compensates for the loss of this info in
Rows_log_events.

Gtid_log_event::thread_id becomes visible in mysqlbinlog output like

  #231025 16:21:45 server id 1  end_log_pos 537 CRC32 0x1cf1d963  GTID 0-1-2 ddl thread_id=10

as a 64-bit unsigned integer.
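
For anyone post-processing mysqlbinlog output, the new field can be picked out with a regular expression. The sketch below is based only on the single sample line above, so the assumed layout (GTID triple, optional flags, then thread_id=N) and the helper name parse_gtid_line are illustrative, not a documented format.

```python
import re

# Pattern derived from the one sample line in this commit message;
# the exact layout of the line is an assumption.
GTID_LINE = re.compile(r"GTID (\d+)-(\d+)-(\d+)\b.*?\bthread_id=(\d+)")


def parse_gtid_line(line):
    """Return the GTID triple and thread id, or None if the line has no thread_id."""
    m = GTID_LINE.search(line)
    if m is None:
        return None
    domain_id, server_id, seq_no, thread_id = map(int, m.groups())
    return {"gtid": (domain_id, server_id, seq_no), "thread_id": thread_id}


sample = ("#231025 16:21:45 server id 1  end_log_pos 537 CRC32 0x1cf1d963"
          "  GTID 0-1-2 ddl thread_id=10")
print(parse_gtid_line(sample))
```

Lines without a thread_id field, such as output from a binlog written before this change, yield None rather than raising.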

While the size of the Gtid event has grown by 8-9 bytes,
replication from OLD <-> NEW is not affected by it.

This work was started by the late Sujatha Sivakumar.
Brandon Nesterenko took it over, reviewed initial patches and extended
the work.

Reviewed-by: <andrei.elkin@mariadb.com>

c37b2087

18 Jan, 2024 1 commit

MDEV-32894 mysqlbinlog flashback support binlog_row_image FULL_NODUP mode · 8bf9f218

Libing Song authored Nov 24, 2023

Summary
=======
With FULL_NODUP mode, the before image includes all columns and the after
image includes only the changed columns. flashback will swap the
value of changed columns from after image to before image.
For example:
  BI: c1, c2, c3_old, c4_old
  AI: c3_new, c4_new
flashback will reconstruct the before and after images to
  BI: c1, c2, c3_new, c4_new
  AI: c3_old, c4_old
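
The swap in the example above can be sketched in a few lines of Python. Real row images are packed binary buffers; the dictionaries keyed by column name and the function name flashback_swap are stand-ins chosen for illustration only.

```python
def flashback_swap(before_image, after_image):
    """Swap the changed columns between BI and AI; the inputs are not modified."""
    new_bi = dict(before_image)
    new_ai = {}
    for col, new_value in after_image.items():
        new_ai[col] = before_image[col]   # the old value moves to the after image
        new_bi[col] = new_value           # the new value moves to the before image
    return new_bi, new_ai


bi = {"c1": 1, "c2": 2, "c3": "c3_old", "c4": "c4_old"}
ai = {"c3": "c3_new", "c4": "c4_new"}
new_bi, new_ai = flashback_swap(bi, ai)
print(new_bi)
print(new_ai)
```

Columns that do not appear in the after image (c1 and c2 here) stay in the before image untouched.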

Implementation
==============
When parsing the before and after image, position and length of
the fields are collected into ai_fields and bi_fields, if it is an
Update_rows_event and the after image doesn't include all columns.

The changed fields are swapped between bi_fields and ai_fields.
Then it recreates the before image and after image by using
bi_fields and ai_fields. nullbit will be set to 1 if the
field is NULL, otherwise nullbit will be 0.

It also optimized flashback a little bit.
- calc_row_event_length is used instead of print_verbose_one_row
- swap_buff1 and swap_buff2 are removed.

8bf9f218

17 Jan, 2024 1 commit

MDEV-30879 Add support for up to BASE 62 to CONV() · f552febe

Andrew Hutchings authored Nov 17, 2023

BASE 62 uses 0-9, A-Z and then a-z to give the numbers 0-61. This patch
increases the range of the string functions to cover this.
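
To make the digit ordering concrete, here is a small Python sketch of conversion with that alphabet. It is not the charset-function code from the patch, the names to_base and from_base are hypothetical, and it treats letters as case-sensitive in every base, which may differ from how CONV() handles smaller bases.

```python
# Digit alphabet in the order described above: 0-9, then A-Z, then a-z.
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def to_base(n, base):
    """Render a non-negative integer in the given base (2..62)."""
    if not 2 <= base <= 62:
        raise ValueError("base must be between 2 and 62")
    if n < 0:
        raise ValueError("negative numbers are not handled in this sketch")
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, base)
        out.append(DIGITS[r])
    return "".join(reversed(out))


def from_base(s, base):
    """Parse a string of digits in the given base (2..62)."""
    if not 2 <= base <= 62:
        raise ValueError("base must be between 2 and 62")
    value = 0
    for ch in s:
        d = DIGITS.index(ch)          # raises ValueError for non-digit characters
        if d >= base:
            raise ValueError(f"digit {ch!r} is out of range for base {base}")
        value = value * base + d
    return value


print(to_base(61, 62), to_base(62, 62))
print(from_base("10", 62))
```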

Based on ideas and tests in PR #2589, but re-written into the charset
functions.

Includes fix by Sergei, UBSAN complained:
ctype-simple.c:683:38: runtime error: negation of -9223372036854775808
cannot be represented in type 'long long int'; cast to an unsigned
type to negate this value to itself

Co-authored-by: Weijun Huang <huangweijun1001@gmail.com>
Co-authored-by: Sergei Golubchik <serg@mariadb.org>

f552febe

12 Jan, 2024 1 commit

MDEV-33049 Assertion `marked_for_write_or_computed()' failed in bool · be6d48fd

Libing Song authored Dec 20, 2023

           Field_new_decimal::store_value(const my_decimal*, int*)

Analysis
========
When rpl applier is unpacking a before row image, Field::reset() will be
called before setting a field to null if null bit of the field is set in
the row image. For Field_new_decimal::reset(), it calls
Field_new_decimal::store_value() to reset the value. store_value() asserts
that the field is in the write_set bitmap because it assumes the field is
being updated.

But that is not true for the row image generated in FULL_NODUP
mode. In that mode, the before image includes all fields and the after
image includes only updated fields.

Fix
===
When unpacking binlog row images, the assertion is meaningless.
So the unpacking field is marked in write_set temporarily to avoid the
assertion failure.

be6d48fd

10 Jan, 2024 10 commits

Merge 11.3 into 11.4 · d136169e
Marko Mäkelä authored Jan 10, 2024

d136169e
Merge 11.2 into 11.3 · af4f9dae
Marko Mäkelä authored Jan 10, 2024

af4f9dae
Merge 11.1 into 11.2 · e4cb1e32
Marko Mäkelä authored Jan 10, 2024

e4cb1e32
Merge 11.0 into 11.1 · c3a546e9
Marko Mäkelä authored Jan 10, 2024

c3a546e9
Merge 10.11 into 11.0 · c2da55ac
Marko Mäkelä authored Jan 10, 2024

c2da55ac
MDEV-26195 fixup: Remove page_no_t · 338ed5c4
Marko Mäkelä authored Jan 10, 2024

338ed5c4
Merge 10.6 into 10.11 · 1eb11da3
Marko Mäkelä authored Jan 10, 2024

1eb11da3

MDEV-33112 innodb_undo_log_truncate=ON is blocking page write · 3613fb2a

Marko Mäkelä authored Jan 10, 2024

When innodb_undo_log_truncate=ON causes an InnoDB undo tablespace
to be truncated, we must guarantee that the undo tablespace will
be rebuilt atomically: After mtr_t::commit_shrink() has durably
written the mini-transaction that rebuilds the undo tablespace,
we must not write any old pages to the tablespace.

To guarantee this, in trx_purge_truncate_history() we used to
traverse the entire buf_pool.flush_list in order to acquire
exclusive latches on all pages for the undo tablespace that
reside in the buffer pool, so that those pages cannot be written
and will be evicted during mtr_t::commit_shrink(). But, this
traversal may interfere with the page writing activity of
buf_flush_page_cleaner(). It would be better to lazily discard
the old pages of the truncated undo tablespace.

fil_space_t::is_being_truncated, fil_space_t::clear_stopping(): Remove.

fil_space_t::create_lsn: A new field, identifying the LSN of the
latest rebuild of a tablespace.

buf_page_t::flush(), buf_flush_try_neighbors(): Evict pages whose
FIL_PAGE_LSN is below fil_space_t::create_lsn.

mtr_t::commit_shrink(): Update fil_space_t::create_lsn and
fil_space_t::size right before the log is durably written and the
tablespace file is being truncated.

fsp_page_create(), trx_purge_truncate_history(): Simplify the logic.
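
The lazy-discard rule can be pictured with a toy model in Python. This is only a sketch of the comparison described above; the Tablespace and Page classes and the flush_or_evict function are hypothetical stand-ins, and page_lsn plays the role of FIL_PAGE_LSN.

```python
from dataclasses import dataclass


@dataclass
class Tablespace:
    create_lsn: int = 0   # LSN of the latest rebuild (0 means never rebuilt)


@dataclass
class Page:
    space: Tablespace
    page_lsn: int         # stand-in for FIL_PAGE_LSN


def flush_or_evict(page):
    """Decide what the flush path does with a dirty page."""
    if page.page_lsn < page.space.create_lsn:
        return "evict"    # page predates the rebuild: discard it, never write it
    return "write"


undo = Tablespace()
old_page = Page(undo, page_lsn=100)   # dirtied before the truncation
undo.create_lsn = 500                 # the tablespace is rebuilt at LSN 500
new_page = Page(undo, page_lsn=510)   # dirtied after the rebuild
print(flush_or_evict(old_page), flush_or_evict(new_page))
```

With this check in the flush path, nothing has to walk the whole flush list up front; stale pages are dropped whenever the flusher happens to reach them.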

Reviewed by: Thirunarayanan Balathandayuthapani, Vladislav Lesin
Performance tested by: Axel Schwenke
Correctness tested by: Matthias Leich

3613fb2a

MDEV-32050 fixup: Remove srv_purge_rseg_truncate_frequency · 593278f9
Marko Mäkelä authored Jan 10, 2024

593278f9

MDEV-33137: Assertion end_lsn == page_lsn failed in recv_recover_page · 4cbf75dd

Marko Mäkelä authored Jan 10, 2024

trx_purge_free_segment(), trx_purge_truncate_rseg_history():
Do not claim that the blocks will be modified in the mini-transaction,
because that will not always be the case. Whenever there is a
modification, mtr_t::set_modified() will flag it.

The debug assertion that failed in recovery is checking that all
changes to data pages are covered by log records. Due to these
incorrect calls, we would unnecessarily write unmodified data pages,
which is something that commit 05fa4558
aims to avoid.

The incorrect calls had originally been added in
commit de31ca6a (MDEV-32820) and
commit 86767bcc (MDEV-29593).

Reviewed by: Vladislav Lesin
Tested by: Elena Stepanova

4cbf75dd

09 Jan, 2024 7 commits

MDEV-33150 double-locking of LOCK_thd_kill in performance_schema.session_status · c6c2a2b8

Sergei Golubchik authored Jan 02, 2024

perfschema thread walker needs to take thread's LOCK_thd_kill to prevent
the thread from disappearing while it's being looked at.
But there's no need to lock it for the current thread.

In fact, it was harmful as some code down the stack might take
LOCK_thd_kill (e.g. set_killed() does it, and my_malloc_size_cb_func()
calls set_killed()). And it caused a number of mutexes to be locked under
LOCK_thd_kill, which created problems later when my_malloc_size_cb_func()
called set_killed() at some unspecified point under some
random mutexes.
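
The idea of skipping the lock for the current thread can be sketched in Python with a non-reentrant threading.Lock standing in for LOCK_thd_kill. The Session class and the read_status and walk_sessions functions are hypothetical names for illustration, not the performance schema code.

```python
import threading


class Session:
    def __init__(self, name):
        self.name = name
        self.kill_lock = threading.Lock()   # stand-in for LOCK_thd_kill (not reentrant)
        self.status = {"queries": 0}


def read_status(target, current):
    """Read a session's status, locking only when it belongs to another thread."""
    if target is current:
        # Our own session cannot disappear while we are running,
        # so no lock is needed and nothing below can double-lock it.
        return dict(target.status)
    with target.kill_lock:
        return dict(target.status)


def walk_sessions(sessions, current):
    return {s.name: read_status(s, current) for s in sessions}


a, b = Session("a"), Session("b")
a.kill_lock.acquire()   # simulate code on our own stack already holding our lock
result = walk_sessions([a, b], current=a)   # completes without blocking on a.kill_lock
a.kill_lock.release()
print(result)
```

Had read_status() tried to take a.kill_lock as well, the call would have blocked forever on the lock its own thread already holds.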

c6c2a2b8

cleanup: change a function, that always returns 0, to void · 0a122637
Sergei Golubchik authored Jan 09, 2024

0a122637

MDEV-33031 Assertion failure upon reading from performance schema with binlog enabled · 23e107d7

Sergei Golubchik authored Dec 28, 2023

same assertion with spider. spider status variables
didn't expect to be queried from a different thread
without LOCK_thd_data.

And they didn't expect to be queried under LOCK_thd_data either
(because spider_get_trx() calls thd_set_ha_data()).

23e107d7

cleanup: spider status variables · b3065af6
Sergei Golubchik authored Dec 28, 2023
```
reduce code duplication
```
b3065af6

MDEV-33031 Assertion failure upon reading from performance schema with binlog enabled · c44cac91

Sergei Golubchik authored Dec 27, 2023

need to protect access to thread-local cache_mngr with LOCK_thd_data

technically only access from different threads has to be protected,
but this is the SHOW STATUS code path, so the difference is negligible

c44cac91

MDEV-11777 REGEXP_REPLACE converts utf8mb4 supplementary characters to '?' · 022ae421
Sergei Golubchik authored Dec 14, 2023
```
use utf8mb4 with PCRE2, not utf8mb3
```
022ae421

Fix and stabilize testcase for MDEV-32212 · 50e02a36

Sergei Petrunia authored Jan 09, 2024

- Move it from delete.test to delete_innodb.test
- Use --source include/innodb_stable_estimates.inc to make it predictable.

50e02a36

08 Jan, 2024 4 commits
- Merge 10.5 into 10.6 · 6538a91e
  Marko Mäkelä authored Jan 08, 2024
  
  6538a91e
- MDEV-33098: Fix some instrumentation for innodb.doublewrite_debug · 0b612619
  Marko Mäkelä authored Jan 08, 2024
```
buf_flush_page_cleaner(): A continue or break inside DBUG_EXECUTE_IF
actually is a no-op. Use an explicit call to _db_keyword_() to
actually avoid advancing the checkpoint.

buf_flush_list_now_set(): Invoke os_aio_wait_until_no_pending_writes()
to ensure that the page write to the system tablespace is completed.
```
  0b612619
- MDEV-22164 log a warning when WITHOUT VALIDATION was used · c0c1c803
  Sergei Golubchik authored Jan 08, 2024
  
  c0c1c803
- MDEV-22164 revert "make THAN optional" · 4089296a
  Sergei Golubchik authored Jan 08, 2024
  
  4089296a
05 Jan, 2024 5 commits

Merge 11.3 into 11.4 · 7ee16b1e
Marko Mäkelä authored Jan 05, 2024

7ee16b1e
Merge 11.2 into 11.3 · 193b22d8
Marko Mäkelä authored Jan 05, 2024

193b22d8
Merge 11.1 into 11.2 · f6d21a88
Marko Mäkelä authored Jan 05, 2024

f6d21a88
Merge 11.0 into 11.1 · 2edc1ad3
Marko Mäkelä authored Jan 05, 2024

2edc1ad3

MDEV-33101 Server crashes when starting the server with... · 5a58935c

mariadb-DebarunBanerjee authored Jan 05, 2024

MDEV-33101 Server crashes when starting the server with innodb-force-recovery=6 and enabling the innodb_truncate_temporary_tablespace_now variable

The issue was introduced by "MDEV-28699: Shrink temporary tablespaces
without restart". SRV_FORCE_NO_LOG_REDO forces the server into read-only mode
and we don't initialize the temporary tablespace in read-only mode.

solution: innodb_truncate_temporary_tablespace_now should be no-op in
read only mode.

5a58935c

03 Jan, 2024 7 commits

Merge 10.11 into 11.0 · 5be8b137
Marko Mäkelä authored Jan 03, 2024

5be8b137
Merge 10.6 into 10.11 · bdf65893
Marko Mäkelä authored Jan 03, 2024

bdf65893
Merge 10.5 into 10.6 · 8bd5a3de
Marko Mäkelä authored Jan 03, 2024

8bd5a3de

MDEV-33156 Crash on innodb_buf_flush_list_now=ON and innodb_force_recovery=6 · cc5c0eda

Marko Mäkelä authored Jan 03, 2024

srv_start(): Move a read only mode startup tweak from
innodb_init_params() to the correct location. Also if
innodb_force_recovery=6 we will disable the doublewrite buffer,
because InnoDB must run in read-only mode to prevent further corruption.

This change only affects debug checks. Whenever srv_read_only_mode holds,
the buf_pool.flush_list will be empty, that is, there will be no writes
of persistent InnoDB data pages.

Reviewed by: Thirunarayanan Balathandayuthapani

cc5c0eda

Merge 10.4 into 10.5 · 3a3a4f04
Marko Mäkelä authored Jan 03, 2024

3a3a4f04

MDEV-33098 The test innodb.doublewrite_debug occasionally fails to start up InnoDB · 77b8bedf

Thirunarayanan Balathandayuthapani authored Jan 03, 2024

- innodb.doublewrite_debug should avoid the checkpoint
before killing the server, so debug sync and
innodb_flush_sync are used to avoid the checkpoint completely.
The test case is allowed to be skipped on the MSAN builder
because of an extra checkpoint.

77b8bedf

MDEV-33157 WSREP: Fix function pointer mismatch · 96130b18

Marko Mäkelä authored Jan 03, 2024

wsrep_plugin_init(), wsrep_plugin_deinit(): Remove these dummy functions
in order to fix an error that would be flagged by cmake -DWITH_UBSAN=ON
when using clang.

wsrep_show_ready(), wsrep_show_bf_aborts(): Correct the signature.

96130b18