Commits · 17e0e9e79e8c36bf8e89583ad49e791ac1e76219 · nexedi / MariaDB

31 Jul, 2024 9 commits
- Merge branch 'bb-10.11-release' into bb-11.1-release · 17e0e9e7
  Oleksandr Byelkin authored Jul 31, 2024
  
  17e0e9e7
- Merge branch 'bb-10.6-release' into bb-10.11-release · 7ab216c5
  Oleksandr Byelkin authored Jul 31, 2024
  
  7ab216c5
- Merge branch 'bb-10.11-release' into bb-11.1-release · d337aa28
  Oleksandr Byelkin authored Jul 31, 2024
  
  d337aa28
- fix fix · 9ca05a09
  Oleksandr Byelkin authored Jul 31, 2024
  
  9ca05a09
- Merge branch 'bb-10.6-release' into bb-10.11-release · 363b5baa
  Oleksandr Byelkin authored Jul 31, 2024
  
  363b5baa
- fix · 48226881
  Oleksandr Byelkin authored Jul 31, 2024
  
  48226881
- Merge branch '10.6' into 10.11 · e6f11675
  Oleksandr Byelkin authored Jul 31, 2024
  
  e6f11675
- Merge branch '11.1' into bb-11.1-release · 2728a319
  Oleksandr Byelkin authored Jul 31, 2024
  
  2728a319
- Merge branch '10.5' into 10.6 · 0063a8c9
  Oleksandr Byelkin authored Jul 31, 2024
  
  0063a8c9
30 Jul, 2024 6 commits

MDEV-34625 Fix undefined behavior of using uninitialized member variables · 811614d4

Hugo Wen authored Jul 19, 2024

Commit a8a75ba2 causes the MariaDB server to crash, usually with signal
11, at random code locations due to invalid pointer values during any
table operation. This issue occurs when the server is built with -O3 and
other customized compiler flags.

For example, the command `use db1;` causes the server to crash in the
`check_table_access` function at line sql_parse.cc:7080 because
`tables->correspondent_table` is an invalid pointer value of 0x1.

The crashes are due to undefined behavior from using uninitialized
variables. The problematic commit a8a75ba2 introduces code that
allocates memory and sets it to 0 using thd->calloc before initializing
it with a placement new operation.
This process depends on setting memory to 0 to initialize member
variables not explicitly set in the constructor. However, the compiler
can optimize out the memset/bfill, leading to uninitialized values and
unpredictable issues.

Once a constructor function initializes an object, any uninitialized
variables within that object are subject to undefined behavior. The
state of memory before the constructor runs, whether it involves
memset or was used for other purposes, is irrelevant after the
placement new operation.

This behavior can be demonstrated with this
[test](https://gcc.godbolt.org/z/5n87z1raG) I wrote to examine the
assembly code. The code in MariaDB can be abstracted to the following,
though it has many layers wrapped around it and more complex logic,
causing slight differences in optimization in the MariaDB build.
To summarize, on x86, the memset in the following code is optimized out
with both -O2 and -O3 in GCC 13, and is only preserved in the much older
GCC 4.9.

    #include <cstdlib>   // malloc
    #include <cstring>   // memset
    #include <new>       // placement new

    struct S {
      int i;     // uninitialized in constructor
      S() {};
    };
    int bar() {
      void *buf = malloc(sizeof(S));
      memset(buf, 0, sizeof(S));       // optimized out
      S* s = new(buf) S;
      return s->i;
    }

With GCC13 -O3:

    bar():
          sub     rsp, 8
          mov     edi, 4
          call    malloc
          mov     eax, DWORD PTR [rax]
          add     rsp, 8
          ret

With GCC4.9 -O3:

    bar():
          sub     rsp, 8
          mov     edi, 4
          call    malloc
          mov     DWORD PTR [rax], 0
          xor     eax, eax
          add     rsp, 8
          ret

Now we ensure the constructor initializes variables correctly by running
the reset() function in the constructor to perform the memset/bfill(0)
operation. After applying the fix, the crash is gone.
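
A minimal sketch of the pattern described above, assuming a hypothetical `TableRef` struct (the names are illustrative and not the actual MariaDB class). The constructor calls `reset()`, so every member has a defined value after placement new, whatever the buffer held before. Here `reset()` assigns members explicitly rather than calling memset/bfill; that is a choice made for the sketch only.

```cpp
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for a MariaDB class; all names are illustrative.
struct TableRef {
  int flags;
  TableRef *correspondent;

  // The constructor itself establishes the state, so nothing depends on
  // the buffer having been zeroed before placement new.
  TableRef() { reset(); }

  void reset() {
    flags = 0;
    correspondent = nullptr;
  }
};

// Allocates a buffer, writes a non-zero pattern into it, then constructs
// the object in place. The compiler is free to drop the memset, which is
// exactly why the constructor must not rely on prior buffer contents.
TableRef *make_table_ref() {
  void *buf = std::malloc(sizeof(TableRef));
  if (!buf) return nullptr;
  std::memset(buf, 0xAB, sizeof(TableRef));
  return new (buf) TableRef;
}
```

With this shape, the members read back as zero and null after construction even though the buffer was never zeroed.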

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services.

811614d4

MDEV-34580: Assertion `(key_part->key_part_flag & 4) == 0' failed key_hashnr · fdda8171

Sergei Petrunia authored Jul 30, 2024

Remove an assert added by fix for MDEV-34417. BNL-H join can be used with
prefix keys. This happens when there are real prefix indexes on the
equi-join columns (although it probably doesn't make a lot of sense).

Anyway, remove the assert. The code receives properly truncated key values
for hashing/comparison so it can handle them just fine.

fdda8171

MDEV-34357 InnoDB: Assertion failure in file ./storage/innobase/page/page0zip.cc line 4211 · ee5f7692

Thirunarayanan Balathandayuthapani authored Jul 23, 2024

During an InnoDB root page split, InnoDB does the following:
1) Moves the root records to a new page (p1).
2) Empties the root and inserts the node pointer into the root page.
3) Splits the new page, making the resulting pages child nodes.
4) Finds the split record and allocates another new page (p2)
for the index.
5) Stores in ret the record that precedes the supremum
record of page p2.
6) In page_copy_rec_list_start(), moves the records from p1 to p2
up to the split record.
7) Because the table uses the compressed row format, InnoDB attempts to
compress page p2 and fails (due to innodb_compression_level = 0).
8) Since the compression fails, InnoDB gets the number of records
(ret_pos) that precede the record ret on page p2.
9) Page p2 is a new page, so ret points to the infimum record and
ret_pos can be 0. InnoDB has a wrong condition that ret_pos must not
be 0 and returns corruption. InnoDB has a similar wrong check in
page_copy_rec_list_end().

ee5f7692

MDEV-34422 Corrupted ib_logfile0 due to uninitialized log_sys.lsn_lock · 1c8af2ae

Marko Mäkelä authored Jul 30, 2024

In commit bf0b82d2 (MDEV-33515)
the function log_t::init_lsn_lock() was removed. This was fine on
those platforms where InnoDB uses futex-based mutexes (Linux, FreeBSD,
OpenBSD, NetBSD, DragonflyBSD).

Dave Gosselin debugged this on Apple macOS and submitted a fix where
pthread_mutex_wrapper::pthread_mutex_wrapper() would invoke init().
We do not really need that; we only need to invoke lsn_lock.init()
like we used to do before commit bf0b82d2.
This should be a no-op for the futex-based mutexes, which intentionally
rely on zero initialization.

The missing pthread_mutex_init() call would cause race conditions
and corruption of log_sys.buf because multiple threads could
apparently hold log_sys.lsn_lock concurrently in
log_t::append_prepare().  The error would be caught by a debug
assertion in log_t::write_buf(), or in non-debug builds by the
fact that the server cannot be restarted due to an apparently
missing FILE_CHECKPOINT record (because it had been written
to wrong offset in log_sys.buf).

The failure in log_t::append_prepare() was caught on Microsoft Windows
after enabling SUX_LOCK_GENERIC and therefore forcing the use of
pthread_mutex_wrapper for the log_sys.lsn_lock.  It appears to be fine
to omit the pthread_mutex_init() call on GNU/Linux.

log_t::create(): Invoke lsn_lock.init().

log_t::close(): Invoke lsn_lock.destroy().

To better catch this kind of issue in the future by simply defining
SUX_LOCK_GENERIC on any platform, a separate debug instrumentation patch
will be applied to the 10.6 branch later.
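
The init/destroy pairing described above can be sketched as follows. This is a simplified illustration under stated assumptions: `mutex_wrapper` and `toy_log` are hypothetical names, not the actual InnoDB classes, and the sketch only shows that a pthread-based lock needs an explicit `init()` before first use and a matching `destroy()` on shutdown.

```cpp
#include <pthread.h>
#include <cstdint>

// Hypothetical simplified wrapper; not the actual InnoDB class.
class mutex_wrapper {
  pthread_mutex_t m;
public:
  // Unlike a futex-based lock, a pthread mutex is not usable when merely
  // zero-initialized on every platform, so it is initialized explicitly.
  void init() { pthread_mutex_init(&m, nullptr); }
  void destroy() { pthread_mutex_destroy(&m); }
  void lock() { pthread_mutex_lock(&m); }
  void unlock() { pthread_mutex_unlock(&m); }
};

// Hypothetical log object that owns the lock and pairs init with destroy.
struct toy_log {
  mutex_wrapper lsn_lock;
  uint64_t lsn = 0;

  void create() { lsn_lock.init(); }
  void close() { lsn_lock.destroy(); }

  // Reserves `size` bytes under the lock and returns the start offset.
  uint64_t append_prepare(uint64_t size) {
    lsn_lock.lock();
    uint64_t start = lsn;
    lsn += size;
    lsn_lock.unlock();
    return start;
  }
};
```

If `create()` were skipped, the behavior of `lock()` would depend on the platform's mutex representation, which matches the platform-specific failures described in this commit.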

Reviewed by: Debarun Banerjee

1c8af2ae

MDEV-34181 Instant table aborts after discard tablespace · c038b3c0

Thirunarayanan Balathandayuthapani authored Jul 30, 2024

- commit 85db5347 (MDEV-33400)
retains the instantness in the table definition after discard
tablespace. So there is no need to assign n_core_null_bytes
during instant table preparation unless they are not
initialized.

c038b3c0

MDEV-33087 ALTER TABLE...ALGORITHM=COPY should build indexes more efficiently · cc8eefb0

Thirunarayanan Balathandayuthapani authored Jul 30, 2024

- During the copy algorithm, InnoDB should use a bulk insert operation
instead of row-by-row insert operations. By doing this, the copy
algorithm can build indexes efficiently. This optimization is disabled
for temporary tables, versioned tables, and tables that have a
foreign key relation.

Introduced the variable innodb_alter_copy_bulk to allow
the bulk insert operation for the copy alter operation
inside InnoDB. This is enabled by default.

ha_innobase::extra(): HA_EXTRA_END_ALTER_COPY mode tries to apply
the buffered bulk insert operation, updates the non-persistent
table stats.

row_merge_bulk_t::write_to_index(): Update stat_n_rows after
applying the bulk insert operation

row_ins_clust_index_entry_low(): In case of copy algorithm,
switch to bulk insert operation.

copy_data_error_ignore(): Handles the error while copying
the data from source to target file.

cc8eefb0

29 Jul, 2024 5 commits

MDEV-34506 2nd execution name resolution problem with pushdown into unions · 48b256a7

Rex authored Jul 02, 2024

Statements affected by this bug need all of the following to be true:
1) a derived table or view whose specification contains a set
     operation at the top level.
2) a grouping operator (group by/having) operating on a column alias
     other than in the first select of the union/intersect
3) an outer condition that will be pushed into all selects in this
     union/intersect, either into the where or having clause

When pushing a condition into all selects of a unit with more than one
select, pushdown_cond_for_derived() renames items so we can re-use the
condition being pushed.
These names need to be saved and reset for correct name resolution on
second execution of prepared statements.

Reviewed by Igor Babaev (igor@mariadb.com)

48b256a7

MDEV-34664: Add an option to fix InnoDB's doubling of secondary index cardinalities · 4bf7c966

Monty authored Jul 27, 2024

(With trivial fixes by sergey@mariadb.com)
Added option fix_innodb_cardinality to optimizer_adjust_secondary_key_costs

Using fix_innodb_cardinality disables the 'divide by 2' of rec_per_key_int
in InnoDB that in effect doubles the Cardinality for secondary keys.
This has the biggest effect for indexes where a few rows have the same key
value. Using this may also cause table scans for very small tables (which
in some cases may be better than an index scan).

The user-visible effect is that 'SHOW INDEX FROM table_name' will for
InnoDB show the true Cardinality (and not 2x the real value). It will
also allow the optimizer to choose a better index in some cases, as the
division by 2 could have a bad effect for tables with 2-5 identical values
per key.

A few notes about using fix_innodb_cardinality:
- It has a direct effect on SHOW INDEX FROM table_name. SHOW INDEX
  will also update the statistics in the table share.
- The effect of fix_innodb_cardinality on query plans or EXPLAIN
  is only visible after the first open of the table. This is why one must
  do a flush tables or use SHOW INDEX for the option to take effect.
- Using fix_innodb_cardinality can thus affect the query plans of all
  users who are using the same tables.

Because of this, it is strongly recommended that one uses
optimizer_adjust_secondary_key_costs=fix_innodb_cardinality mainly
in configuration files to not cause issues for other users.
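
The arithmetic behind the doubled Cardinality can be illustrated with a small sketch. The helper `shown_cardinality` is hypothetical and is not InnoDB's code; it assumes that the displayed cardinality is the table row count divided by the rows-per-key estimate, and that the estimate has a lower bound of 1.

```cpp
#include <cstdint>

// Hypothetical model of the estimate; not the actual InnoDB implementation.
// rows_per_key: estimated number of rows that share one key value.
uint64_t shown_cardinality(uint64_t table_rows, uint64_t rows_per_key,
                           bool fix_innodb_cardinality) {
  uint64_t rec_per_key = rows_per_key;
  if (!fix_innodb_cardinality) {
    rec_per_key /= 2;  // the 'divide by 2' described in this commit
  }
  if (rec_per_key == 0) {
    rec_per_key = 1;   // assumed lower bound, so the division below is safe
  }
  return table_rows / rec_per_key;
}
```

Under these assumptions, a table with 1000 rows and 4 rows per key value reports 500 without the option and 250 with it.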

4bf7c966

MDEV-34502 fixup: Do not cripple MSAN · 7e5c9ccd

Marko Mäkelä authored Jul 29, 2024

We need to work around deficiencies of Valgrind, and apparently
the previous work-around attempts
(such as d247d649) do not work
anymore, definitely not on recent clang-based compilers.

MemorySanitizer should be fine; unfortunately we set HAVE_valgrind for it
as well.

7e5c9ccd

MDEV-34458: Remove more traces of BTR_MODIFY_PREV · 7ead48a7

Marko Mäkelä authored Jul 29, 2024

In commit 2f6df937
we fixed an observed case of the bug by removing
some code related to the no longer needed
BTR_MODIFY_PREV mode.

In commit 73ad436e
an alternative fix was applied that also fixes the
BTR_SEARCH_PREV case.

Let us clean up some implicit references to BTR_MODIFY_PREV
that were missed in 2f6df937.

btr_pcur_move_backward_from_page(): Assume that the latch mode was
BTR_SEARCH_LEAF.

btr_pcur_move_to_prev(): Assert that the latch mode is BTR_SEARCH_LEAF.
This function is mostly invoked in row0sel.cc for read operations,
as well as in row0merge.cc for reading from the clustered index.
All callers indeed use a cursor in the BTR_SEARCH_LEAF mode.

7ead48a7

MDEV-34565: SIGILL due to OS not supporting AVX512 · 232d7a5e

Marko Mäkelä authored Jul 29, 2024

It is not sufficient to check that the CPU supports the necessary
instructions. Also the operating system (or virtual machine hypervisor)
must enable all the AVX registers to be saved and restored on a
context switch.

Because clang 8 does not support the compiler intrinsic _xgetbv()
we will require clang 9 or later for enabling the use of VPCLMULQDQ
and the related AVX512 features.
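
The operating-system side of the check can be sketched as a test on the XCR0 register value. The helper name `os_enables_avx512` is hypothetical and this is not the actual MariaDB code. On real hardware the value would come from `_xgetbv(0)`, and only after CPUID reports OSXSAVE; the sketch takes the value as a parameter so that it stays portable.

```cpp
#include <cstdint>

// XCR0 state-component bits as defined by the x86 architecture.
constexpr uint64_t XCR0_SSE       = 1u << 1;  // XMM registers
constexpr uint64_t XCR0_AVX       = 1u << 2;  // upper halves of YMM
constexpr uint64_t XCR0_OPMASK    = 1u << 5;  // k0..k7 mask registers
constexpr uint64_t XCR0_ZMM_HI256 = 1u << 6;  // upper halves of ZMM0..15
constexpr uint64_t XCR0_HI16_ZMM  = 1u << 7;  // ZMM16..31

constexpr uint64_t XCR0_AVX512_MASK =
    XCR0_SSE | XCR0_AVX | XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM;

// Hypothetical helper: true only if the OS (or hypervisor) saves and
// restores every register set that AVX512 code may touch.
constexpr bool os_enables_avx512(uint64_t xcr0) {
  return (xcr0 & XCR0_AVX512_MASK) == XCR0_AVX512_MASK;
}
```

A CPU that advertises AVX512 in CPUID but runs under an OS reporting an XCR0 of 0x07 (x87, SSE and AVX only) would fail this check, which is the situation that leads to SIGILL.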

232d7a5e

27 Jul, 2024 1 commit
- MDEV-19052 main.win postfix --view-protocol compat · 0939bfc0
  Daniel Black authored Jul 27, 2024
```
Correct compatibility with view-protocol.

Thanks Lena Startseva
```
  0939bfc0
25 Jul, 2024 1 commit

MDEV-19052 Range-type window frame supports only numeric datatype · 77885935

Daniel Black authored Jun 14, 2024

When there are no bounds on the upper or lower part of the window,
it doesn't matter if the type is numeric.

It also doesn't matter how many ORDER BY items there are in the
query.

Reviewers: Sergei Petrunia and Oleg Smirnov

77885935

24 Jul, 2024 2 commits
- The test should not be used with AddressSanitizer because it tests the stack check · 26f31bdd
  Oleksandr Byelkin authored Jul 24, 2024
```
and this check is switched off
```
  26f31bdd
- disabling view protocol until fix · 28448957
  Oleksandr Byelkin authored Jul 24, 2024
  
  28448957
23 Jul, 2024 3 commits

MDEV-34066 Output of SHOW ENGINE INNODB STATUS uses the nanoseconds suffix for microseconds · 3359ac09
Thirunarayanan Balathandayuthapani authored Jul 23, 2024
```
- This issue is caused by commit e71e6133
(MDEV-24671). Change the suffix of the transaction lock wait
time output to microseconds.
```
3359ac09

MDEV-34634 Types mismatch when cloning items causes debug assertion · c91aeb37

Oleg Smirnov authored Jul 23, 2024

New runtime diagnostic introduced with MDEV-34490 has detected
that `Item_int_with_ref` incorrectly returns an instance of its ancestor
class `Item_int`. This commit fixes that.

In addition, this commit reverts a part of the diagnostic related
to `clone_item()` checks. As it turned out, `clone_item()` is not required
to return an object of the same class as the cloned one. For example,
look at `Item_param::clone_item()`: it can return objects of `Item_null`,
`Item_int`, `Item_string`, etc, depending on the object state.
So the runtime type diagnostic is not applicable to `clone_item()` and
is disabled with this commit.
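
The point about `clone_item()` can be illustrated with a toy class hierarchy. The classes below are hypothetical simplifications and not MariaDB's `Item` classes; they only show that a clone whose class depends on the object's state cannot satisfy a "same class as the original" check.

```cpp
#include <memory>

// Hypothetical, heavily simplified hierarchy; not the MariaDB classes.
struct Item {
  virtual ~Item() = default;
  virtual std::unique_ptr<Item> clone_item() const = 0;
};

struct ItemNull : Item {
  std::unique_ptr<Item> clone_item() const override {
    return std::unique_ptr<Item>(new ItemNull());
  }
};

struct ItemInt : Item {
  long value;
  explicit ItemInt(long v) : value(v) {}
  std::unique_ptr<Item> clone_item() const override {
    return std::unique_ptr<Item>(new ItemInt(value));
  }
};

// A parameter placeholder: its clone is a constant of whatever kind the
// parameter currently holds, never another ItemParam.
struct ItemParam : Item {
  bool is_null = true;
  long value = 0;
  std::unique_ptr<Item> clone_item() const override {
    if (is_null) return std::unique_ptr<Item>(new ItemNull());
    return std::unique_ptr<Item>(new ItemInt(value));
  }
};
```

In this sketch, cloning an unset `ItemParam` yields an `ItemNull`, and cloning one that holds 7 yields an `ItemInt`, so a runtime check that compares the dynamic types of the original and the clone would reject correct behavior.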

As similar diagnostic failures are expected to appear again
in the future, this commit introduces a new test file in the main suite,
item_types.test. New test cases may be added to this file.

Reviewer: Oleksandr Byelkin <sanja@mariadb.com>

c91aeb37

Merge branch '10.11' into 11.1 · 1de570d7
Oleksandr Byelkin authored Jul 23, 2024

1de570d7

22 Jul, 2024 3 commits
- MDEV-15393 post-push: complete rpl_mysqldump_gtid_slave_pos fixes. · c944cd6f
  Andrei authored Jul 22, 2024
```
Added a
  --source include/save_master_gtid.inc
line that was missed by the previous commit.
```
  c944cd6f
- MDEV-33971 fix --view-protocol test failure · 216fdb15
  Dave Gosselin authored Jul 17, 2024
```
Allow the NAME_CONST unwrap optimization when the client is
neither in the PREPARE step of a prepared statement nor in the view
analysis mode.
```
  216fdb15
- Merge branch '10.6' into 10.11 · 0fe39d36
  Oleksandr Byelkin authored Jul 20, 2024
  
  0fe39d36
20 Jul, 2024 1 commit
- Merge branch '10.5' into 10.6 · a938503c
  Oleksandr Byelkin authored Jul 20, 2024
  
  a938503c
19 Jul, 2024 5 commits

MDEV-15393 gtid_slave_pos duplicate key errors after mysqldump restore · b8f92ade

Andrei authored Jul 15, 2024

When mysqldump is run to dump the `mysql` system database, it generates
INSERT statements into the table `mysql.gtid_slave_pos`.
After running the backup script,
those inserts did not produce the expected gtid state on the slave. In
particular, the maximum of mysql.gtid_slave_pos.sub_id did not make it
into
   rpl_global_gtid_slave_state.last_sub_id,

an in-memory object that is supposed to match the current state of the
table. That was the case regardless of whether the --gtid option was
specified. Later, when the backup recipient server starts as a slave
in *non-gtid* mode, this desynchronization may lead to a duplicate key
error.

This effect is corrected for --gtid mode mysqldump/mariadb-dump only,
as follows.  The fixes ensure that the insert block of the dump
script is followed by a "summing-up" SET @global.gtid_slave_pos
assignment.

For the implementation part, note that a deferred print-out of
SET-gtid_slave_pos and associated comments is preferred over relocating
the entire blocks if (opt_master,slave_data &&
do_show_master,slave_status) ...  because of compatibility
concerns. Namely, an error inside do_show_*() is handled in the new code
the same way, and as early, as before.

A regression test can be run in how-to-reproduce mode as well.
One affected mtr test was observed.
The rpl_mysqldump_slave.result "mismatch" now shows the new policy of
deferred printing of SET-gtid_slave_pos in action.

b8f92ade

new libfmt 11.0.1 · 0f6f1114
Oleksandr Byelkin authored Jul 19, 2024

0f6f1114
New columnstore 23.10.2 · 88711ee5
Oleksandr Byelkin authored Jul 19, 2024

88711ee5
New CC 3.3 · a94fd874
Oleksandr Byelkin authored Jul 19, 2024

a94fd874
Fix view protocol · b8b6cab2
Oleksandr Byelkin authored Jul 19, 2024

b8b6cab2

18 Jul, 2024 2 commits
- Merge branch '10.5' into 10.6 · 9af2caca
  Oleksandr Byelkin authored Jul 18, 2024
  
  9af2caca
- Additional tests for MDEV-28345 ASAN: use-after-poison or unknown-crash in... · 9dafde57
  Alexander Barkov authored Jul 18, 2024
```
Additional tests for MDEV-28345 ASAN: use-after-poison or unknown-crash in my_strtod_int from charset_info_st::strntod or test_if_number
```
  9dafde57
17 Jul, 2024 2 commits

MDEV-33921: Fix rpl_xa_empty_transaction.test · a061ae10

Brandon Nesterenko authored Jul 17, 2024

The test was missing a save_master_gtid.inc on the master,
leading to the slave thinking it was in sync after executing
sync_with_master_gtid.inc, despite not having executed the
latest transaction. This skipped transaction, XA COMMIT,
was supposed to fail with an error to be ignored, because its XID
could not be found, and to be thrown out because the replication filters
would filter out the target database. However, if the slave
was able to stop before executing the transaction, then
the replication filter is reset (to empty), and when the
slave is later restarted, that transaction's error would
no longer be ignored.

Additionally, as the test cases added in MDEV-33921 rely
on GTID synchronization, the test cases now force
master_use_gtid=slave_pos for consistency.

a061ae10

MDEV-34353 Revert "don't wait indefinitely for signal handler in --bootstrap" · 36b867ad
Sergei Golubchik authored Jul 09, 2024
```
This reverts commit 938b9293. Not needed after 90d376e0.
```
36b867ad