Commits · e3af2c9ffcdac2e20466c35fb414abf7ef424135 · nexedi / MariaDB

04 Mar, 2021 2 commits

Merge MDEV-25029 for testing · e3af2c9f
Marko Mäkelä authored Mar 04, 2021

e3af2c9f

MDEV-25029: Reduce dict_sys mutex contention for read-only workload · d231d813

Krunal Bauskar authored Mar 03, 2021

In theory, read-only workload shouldn't have anything to do with
dict_sys mutex. dict_sys mutex is meant to protect database metadata.

But then why does the dict_sys mutex show up as part of a read-only workload?
This workload needs to fetch stats for query processing and while reading
these stats dict_sys mutex is taken.

Is this really needed?
No. For historical reasons, it was the default global mutex used.

Based on the 10.6 changes, the flow can now use table->lock_mutex to protect
update/access of these stats. Because table mutexes are table-specific,
the global contention arising from dict_sys is reduced.
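
The idea can be sketched as follows. This is a minimal illustration, not the actual patch: `Table`, `stat_n_rows`, `read_n_rows()` and `update_n_rows()` are hypothetical names, and only `lock_mutex` is borrowed from the commit message.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a table object; only what the sketch needs.
struct Table {
  std::mutex lock_mutex;          // per-table mutex (name taken from the commit text)
  std::uint64_t stat_n_rows = 0;  // hypothetical statistics field
};

// Reader: copies a statistic while holding only this table's mutex, so
// readers of different tables do not contend on one global mutex.
inline std::uint64_t read_n_rows(Table &t) {
  std::lock_guard<std::mutex> guard(t.lock_mutex);
  return t.stat_n_rows;
}

// Writer: updates the statistic under the same per-table mutex.
inline void update_n_rows(Table &t, std::uint64_t n) {
  std::lock_guard<std::mutex> guard(t.lock_mutex);
  t.stat_n_rows = n;
}
```

With one mutex per table, two sessions reading statistics of different tables never serialize on each other, which is the contention reduction the message describes.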

Thanks to Marko Mäkelä for his early suggestion of the proposed alternative
and review of the draft patch.

d231d813

03 Mar, 2021 3 commits

Merge 10.5 into 10.6 · 71d30d01
Marko Mäkelä authored Mar 03, 2021

71d30d01

MDEV-25016 Race condition between lock_sys_t::cancel() and page split or merge · 1c7d4f8d

Marko Mäkelä authored Mar 03, 2021

In commit 8d16da14 (MDEV-24789)
we accidentally introduced a race condition. During the time a
waiting lock request is being removed, the request might be
moved to another page due to a concurrent page split or merge.
To prevent this, we must hold exclusive lock_sys.latch when releasing
a record lock.

lock_release_autoinc_locks(): Avoid a potential hang.
No dict_table_t::lock_mutex must be waited for while already holding
lock_sys.wait_mutex or trx_t::mutex.

lock_cancel_waiting_and_release(): Correctly handle AUTO_INCREMENT locks.

1c7d4f8d

MDEV-24863 AHI entries mismatch with the index while reloading the evicted tables. · 4b166ca9

Thirunarayanan Balathandayuthapani authored Feb 25, 2021

This is an after-merge fix of f33e57a9.
In btr_search_drop_page_hash_index(), InnoDB should take
the exclusive lock on the AHI latch if the index is already
freed, to avoid accessing freed memory during buf_pool_resize().

4b166ca9

02 Mar, 2021 6 commits

MDEV-24973 Performance schema duplicates rarely executed code for mutex operations · 80ac9ec1

Marko Mäkelä authored Mar 02, 2021

The PERFORMANCE_SCHEMA wrapper for mutex and rw-lock operations is
causing a lot of unlikely code to be inlined in each invocation.
The impact of this may have been emphasized in MariaDB 10.6, because
InnoDB now uses the common implementation of mutexes and condition
variables (MDEV-21452).

By default, we build with cmake -DPLUGIN_PERFSCHEMA enabled,
but at runtime no instrumentation will be enabled. Similar to
commit eba2d10a
we had better avoid inlining the rarely executed code in order to reduce
the code size and to improve the efficiency of the instruction cache.

This change was extensively tested by Axel Schwenke with and without
--enable-performance-schema (with no individual instruments enabled).
Removing the inline functions did not cause any performance regression
in either case. There seemed to be a tiny improvement, possibly due
to reduced code size and better instruction cache hit rate.
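
A rough sketch of the general technique, assuming a hypothetical `InstrumentedMutex` wrapper (none of these names come from the MariaDB sources): the inline fast path only tests a pointer, and the rarely executed instrumented path lives in a separate non-inlined function.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

#if defined(__GNUC__)
# define NOINLINE_COLD __attribute__((noinline, cold))
#else
# define NOINLINE_COLD  // other compilers: no hint in this sketch
#endif

// Hypothetical instrumentation sink; counts lock events.
struct Instrumentation {
  std::atomic<unsigned> lock_events{0};
};

struct InstrumentedMutex {
  std::mutex m;
  Instrumentation *psi = nullptr;  // null unless instrumentation is enabled

  // Fast path: a single pointer test is all that gets inlined at each call site.
  void lock() {
    if (psi != nullptr) {
      lock_instrumented();
      return;
    }
    m.lock();
  }
  void unlock() { m.unlock(); }

private:
  // Slow path: kept out of line so its code is not duplicated per call site.
  NOINLINE_COLD void lock_instrumented();
};

void InstrumentedMutex::lock_instrumented() {
  psi->lock_events.fetch_add(1, std::memory_order_relaxed);
  m.lock();
}
```

When `psi` is null, which corresponds to the default runtime configuration described above, each call site pays only for the pointer comparison and the plain `m.lock()`.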

80ac9ec1

Merge 10.5 into 10.6 · 33aec68a
Marko Mäkelä authored Mar 02, 2021

33aec68a

MDEV-24811 Assertion find(table) failed with innodb_evict_tables_on_commit_debug · 18535a40

Marko Mäkelä authored Mar 01, 2021

lock_release_try(): Implement innodb_evict_tables_on_commit_debug.
Before releasing any locks, collect the identifiers of tables to
be evicted. After releasing all locks, look up the tables and
evict them if it is safe to do so.

trx_t::commit_tables(): Remove the eviction logic.

trx_t::commit_in_memory(): Invoke release_locks() only after
commit_tables().
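
The collect-then-evict ordering described for lock_release_try() can be sketched like this. It is only an illustration under simplified assumptions: `Table`, `Lock`, `Cache` and `release_and_evict()` are hypothetical, and "safe to evict" is reduced to a boolean flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Table { std::uint64_t id; bool can_evict; };  // hypothetical
struct Lock { Table *table; };                       // hypothetical

using Cache = std::unordered_map<std::uint64_t, Table>;

// Returns the number of tables evicted.
inline std::size_t release_and_evict(std::vector<Lock> &locks, Cache &cache) {
  // 1. Before releasing anything, remember only the table identifiers.
  std::vector<std::uint64_t> ids;
  for (const Lock &l : locks) ids.push_back(l.table->id);

  // 2. Release all locks; Table pointers must not be used past this point.
  locks.clear();

  // 3. Look each table up again by identifier and evict it if that is safe.
  std::size_t evicted = 0;
  for (std::uint64_t id : ids) {
    auto it = cache.find(id);
    if (it != cache.end() && it->second.can_evict) {
      cache.erase(it);
      ++evicted;
    }
  }
  return evicted;
}
```

Holding identifiers rather than pointers across the release means a table that disappears in the meantime is simply not found, instead of being dereferenced after it has gone.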

18535a40

Cleanup: Remove some lock accessor functions · 8513007c
Marko Mäkelä authored Mar 01, 2021

8513007c

MDEV-24789: Reduce lock_sys mutex contention further · 8d16da14

Marko Mäkelä authored Feb 28, 2021

lock_sys_t::deadlock_check(): Assume that only lock_sys.wait_mutex
is being held by the caller.

lock_sys_t::rd_lock_try(): New function.

lock_sys_t::cancel(trx_t*): Kill an active transaction that may be
holding a lock.

lock_sys_t::cancel(trx_t*, lock_t*): Cancel a waiting lock request.

lock_trx_handle_wait(): Avoid acquiring mutexes in some cases,
and never acquire lock_sys.latch in exclusive mode.
This function is only invoked in a semi-consistent read
(locking a clustered index record only if it matches the search condition).
Normally, lock_wait() will take care of lock waits.

lock_wait(): Invoke the new function lock_sys_t::cancel() at the end,
to avoid acquiring exclusive lock_sys.latch.

lock_rec_other_trx_holds_expl(): Use LockGuard instead of LockMutexGuard.

lock_release_autoinc_locks(): Explicitly acquire table->lock_mutex,
in case only a shared lock_sys.latch is being held. Deadlock::report()
will still hold exclusive lock_sys.latch while invoking
lock_cancel_waiting_and_release().

lock_cancel_waiting_and_release(): Acquire trx->mutex in this function,
instead of expecting the caller to do so.

lock_unlock_table_autoinc(): Only acquire shared lock_sys.latch.

lock_table_has_locks(): Do not acquire lock_sys.latch at all.

Deadlock::check_and_resolve(): Only acquire shared lock_sys.latch
for invoking lock_sys_t::cancel(trx, wait_lock).

innobase_query_caching_table_check_low(),
row_drop_tables_for_mysql_in_background(): Do not acquire lock_sys.latch.

8d16da14

MDEV-25026 Various code paths are accessing freed pages · 01b44c05

Marko Mäkelä authored Mar 02, 2021

The test case encryption.innodb_encrypt_freed was failing in
MemorySanitizer builds.

recv_recover_page(): Mark non-recovered pages as freed.

fil_crypt_rotate_page(): Before comparing the block->frame contents,
check if the block was marked as freed.

Other places: Whenever using BUF_GET_POSSIBLY_FREED, check the
block->page.status before accessing the page frame.

(Both uses of BUF_GET_IF_IN_POOL should be correct now.)

01b44c05

01 Mar, 2021 3 commits

MDEV-24858 SIGABRT in DbugExit from my_malloc in Query_cache::init_cache Regression · 1f1f61a9
Sergei Golubchik authored Mar 01, 2021
```
disable warnings, as they're different on 32bit platforms

Closes #1757
```
1f1f61a9
MDEV-24858 SIGABRT in DbugExit from my_malloc in Query_cache::init_cache Regression · 6976bb94
Nayuta Yanagisawa authored Feb 20, 2021
```
Add missing DBUG_RETURN to my_malloc.
```
6976bb94

MDEV-20715 : Implement system variable to disallow local GTIDs in Galera · ebb2db59

Jan Lindström authored Mar 01, 2021

Added a new wsrep_mode feature DISALLOW_LOCAL_GTID for this.
Nodes can have GTIDs for local transactions in the following scenarios:

A DDL statement is executed with wsrep_OSU_method=RSU set.
A DML statement writes to a non-InnoDB table.
A DML statement writes to an InnoDB table with wsrep_on=OFF set.

If the user has set wsrep_mode=DISALLOW_LOCAL_GTID, these operations
produce an error: ERROR HY000: Galera replication not supported

ebb2db59

26 Feb, 2021 6 commits

MDEV-24997 Assertion mtr->is_named_space(page_id.space()) in ibuf0ibuf.cc:624 · 8d714db6

Thirunarayanan Balathandayuthapani authored Feb 26, 2021

- This is caused by commit deadec4e
(MDEV-24569). InnoDB fails to set the tablespace associated with
the mini-transaction while resetting the change buffer bitmap bits of
the page.

8d714db6

Merge 10.5 into 10.6 · b47304eb
Marko Mäkelä authored Feb 26, 2021

b47304eb

MDEV-24789: Reduce lock_sys.wait_mutex contention · 7cf4419f

Marko Mäkelä authored Feb 26, 2021

A performance regression was introduced by
commit e71e6133 (MDEV-24671)
and mostly addressed by
commit 455514c8.

The regression is likely caused by increased contention on
lock_sys.latch (former lock_sys.mutex), possibly indirectly
caused by contention on lock_sys.wait_mutex. This change aims to
reduce both, but further improvements will be needed.

lock_wait(): Minimize the lock_sys.wait_mutex hold time.

lock_sys_t::deadlock_check(): Add a parameter for indicating
whether lock_sys.latch is exclusively locked.

trx_t::was_chosen_as_deadlock_victim: Always use atomics.

lock_wait_wsrep(): Assume that no mutex is being held.

Deadlock::report(): Always kill the victim transaction.

lock_sys_t::timeout: New counter to back MONITOR_TIMEOUT.

7cf4419f

MDEV-7409 On RBR, extend the PROCESSLIST info to include at least the name of... · d9898c9a

Sachin authored Feb 21, 2021

MDEV-7409 On RBR, extend the PROCESSLIST info to include at least the name of the recently used table

When RBR is used, add the db name to db Field and table name to Status
Field of the "SHOW FULL PROCESSLIST" command for SQL thread.

d9898c9a

Merge remote-tracking branch 10.4 into 10.5 · 1696e4df
Daniel Black authored Feb 26, 2021

1696e4df
Merge remote-tracking branch 'origin/10.4' into 10.5 · 86d60fc9
Daniel Black authored Feb 25, 2021

86d60fc9

25 Feb, 2021 9 commits
- Merge 10.3 into 10.4 · a6c6c4f4
  Marko Mäkelä authored Feb 25, 2021
  
  a6c6c4f4
- Merge 10.2 into 10.3 · 4473d174
  Marko Mäkelä authored Feb 25, 2021
  
  4473d174
- Fixed the innodb_ext_key test by adding replace_column · 0a95c922
  Varun Gupta authored Feb 25, 2021
  
  0a95c922
- MENT-411 : Implement wsrep_replicate_aria · 27d66d64
  Jan Lindström authored Nov 03, 2020
```
Introduced two new wsrep_mode options
* REPLICATE_MYISAM
* REPLICATE_ARIA

Deprecated the wsrep_replicate_myisam parameter; we use
wsrep_mode = REPLICATE_MYISAM instead.

This required small refactoring of wsrep_check_mode_after_open_table
so that both MyISAM and Aria are handled on required DML cases.
Similarly, added Aria to wsrep_should_replicate_ddl to handle DDL
for Aria tables using TOI. Added test cases and improved MyISAM testing.
Changed use of wsrep_replicate_myisam to wsrep_mode = REPLICATE_MYISAM
```
  27d66d64
- Merge branch '10.3' into 10.4 · ef96ec3b
  Daniel Black authored Feb 25, 2021
  
  ef96ec3b
- mysys: lf_hash - fix l_search size_t keylen · 48b5f8a5
  Daniel Black authored Feb 25, 2021
```
Correcting an incorrect merge from 10.2
```
  48b5f8a5
- Merge branch '10.3' into 10.4 · 36810342
  Daniel Black authored Feb 25, 2021
  
  36810342
- Merge remote-tracking branch 'origin/10.2' into 10.3 · 3e2afcb3
  Daniel Black authored Feb 25, 2021
  
  3e2afcb3
- MDEV-24728: Debian include client caching_sha2_password plugin · 577c970c
  Daniel Black authored Jan 29, 2021
```
Backport of 4bc31a90

Include client libraries for auth caching_sha2_password and
sha256_password in the libmariadb3 client library package.
```
  577c970c
24 Feb, 2021 11 commits

MDEV-23510: arm64 lf_hash alignment of pointers · e0ba68ba

Daniel Black authored Feb 15, 2021

Like the 10.2 version 1635686b,
except C++ on internal functions for my_assume_aligned.

volatile != atomic.

volatile has no memory barrier semantics; it's for mmapped I/O,
so let's allow some optimizer gains and stop pretending it helps
with memory atomicity.

The MDEV lists a SEGV; an assumption is made that an address was
partially read. C packs structs strictly in order, and on arm64 the
cache line size is 128 bits. A pointer (link - 64 bits), followed
by a hashnr (uint32 - 32 bits), leaves the following key (uchar * -
64 bits) not naturally aligned to any pointer and, worse, split
across a cache line, which is the processor's view of an atomic
reservation of memory.

lf_dynarray_lvalue is assumed to return a 64 bit aligned address.

As a solution move the 32bit hashnr to the end so we don't get the
*key pointer split across two cache lines.
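
The arithmetic behind this can be checked with a small sketch. It takes the sizes from the text above (64-bit pointers, 32-bit hashnr, 128-bit granule) and assumes a tightly packed layout, as the message does. It also assumes that after the reordering `key` directly follows `link`. `straddles()` and the constants are hypothetical helpers, not MariaDB code.

```cpp
#include <cassert>
#include <cstddef>

// True if the byte range [offset, offset + size) crosses a granule boundary.
constexpr bool straddles(std::size_t offset, std::size_t size,
                         std::size_t granule) {
  return offset / granule != (offset + size - 1) / granule;
}

// Sizes in bytes, as assumed in the commit message.
constexpr std::size_t kPtr = 8;       // 64-bit pointer
constexpr std::size_t kHash = 4;      // 32-bit hashnr
constexpr std::size_t kGranule = 16;  // 128 bits

// Old order, tightly packed: link (0..7), hashnr (8..11), key (12..19).
constexpr std::size_t kOldKeyOffset = kPtr + kHash;
// New order (assumed): link (0..7), key (8..15), hashnr moved to the end.
constexpr std::size_t kNewKeyOffset = kPtr;

static_assert(kOldKeyOffset % kPtr != 0, "old key is not pointer-aligned");
static_assert(straddles(kOldKeyOffset, kPtr, kGranule),
              "old key crosses the 16-byte boundary");
static_assert(kNewKeyOffset % kPtr == 0, "new key is pointer-aligned");
static_assert(!straddles(kNewKeyOffset, kPtr, kGranule),
              "new key stays within one granule");
```

Under these assumptions the old `key` occupies bytes 12 to 19 and crosses the boundary at byte 16, while the new `key` occupies bytes 8 to 15 and does not.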

Tested by: Krunal Bauskar
Reviewer: Marko Mäkelä

e0ba68ba

MDEV-23510: arm64 lf_hash alignment of pointers · 1635686b

Daniel Black authored Feb 15, 2021

volatile != atomic.

volatile has no memory barrier semantics; it's for mmapped I/O,
so let's allow some optimizer gains and stop pretending it helps
with memory atomicity.

The MDEV lists a SEGV; an assumption is made that an address was
partially read. C packs structs strictly in order, and on arm64 the
cache line size is 128 bits. A pointer (link - 64 bits), followed
by a hashnr (uint32 - 32 bits), leaves the following key (uchar * -
64 bits) not naturally aligned to any pointer and, worse, split
across a cache line, which is the processor's view of an atomic
reservation of memory.

lf_dynarray_lvalue is assumed to return a 64 bit aligned address.

As a solution move the 32bit hashnr to the end so we don't get the
*key pointer split across two cache lines.

Tested by: Krunal Bauskar
Reviewer: Marko Mäkelä

1635686b

MDEV-24910 Crash with SELECT that uses table value constructor as a subselect · bf6484e7

Igor Babaev authored Feb 24, 2021

This bug caused crashes of the server when processing queries with table
value constructors (TVC) that contained subqueries and were themselves used as
subselects. For such TVCs the following transformation is applied at the
prepare stage:
VALUES (v1), ... (vn) => SELECT * FROM (VALUES (v1), ... (vn)) tvc_x.
This transformation reduces the problem of evaluating TVCs used
as subselects to the problem of evaluating regular subselects.
The transformation is implemented in wrap_tvc(). The code of the function
tries to mimic the behaviour of the parser when processing the result of the
transformation. However, this imitation was not free of flaws. First,
the function called the method exclude(), which completely destroyed the
select tree structures below the transformed TVC. Second, the function
used the procedure mysql_new_select to create st_select_lex nodes for
both the wrapping select of the transformation and the TVC. This also led to
the construction of invalid select tree structures.
The patch actually re-engineers the code of wrap_tvc().

Approved by Oleksandr Byelkin <sanja@mariadb.com>

bf6484e7

MDEV-24964 : Heap-buffer-overflow on wsrep_schema.cc ::remove_fragments · d1eeb4b8

Jan Lindström authored Feb 24, 2021

The problem was that we used a heap-allocated key with too small
an array. Fixed by using dynamic memory allocation with the actual
needed size.

d1eeb4b8

MDEV-24884 fixup: Remove a bogus assertion · 74281fe1

Marko Mäkelä authored Feb 24, 2021

rw_lock::upgrade_trylock(): If the compare-and-swap fails,
only assert that we are still holding the U lock and that
no conflicting lock exists. If the upgrade to X would fail due
to some thread holding an S latch, we will terminate the loop.
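
The loop structure described here can be illustrated with a toy lock. This is a sketch under invented assumptions: `ToyRwLock` and its lock-word encoding (one X bit, one U bit, a reader count in the low bits) are hypothetical and do not reproduce the real rw_lock class.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

class ToyRwLock {
  static constexpr std::uint32_t WRITER = 1U << 31;   // X held (hypothetical encoding)
  static constexpr std::uint32_t UPDATER = 1U << 30;  // U held (hypothetical encoding)
  std::atomic<std::uint32_t> word{0};                 // low bits: number of S holders

public:
  bool update_trylock() {
    std::uint32_t lk = word.load(std::memory_order_relaxed);
    while (!(lk & (WRITER | UPDATER)))
      if (word.compare_exchange_weak(lk, lk | UPDATER,
                                     std::memory_order_acquire,
                                     std::memory_order_relaxed))
        return true;
    return false;
  }

  bool read_trylock() {  // S is compatible with U, but not with X
    std::uint32_t lk = word.load(std::memory_order_relaxed);
    while (!(lk & WRITER))
      if (word.compare_exchange_weak(lk, lk + 1,
                                     std::memory_order_acquire,
                                     std::memory_order_relaxed))
        return true;
    return false;
  }

  void read_unlock() { word.fetch_sub(1, std::memory_order_release); }

  // Caller must hold U. Tries to turn U into X without waiting.
  bool upgrade_trylock() {
    std::uint32_t lk = UPDATER;  // expected: U held, no S holders
    while (!word.compare_exchange_weak(lk, WRITER,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed)) {
      assert(lk & UPDATER);    // we are still holding the U lock
      assert(!(lk & WRITER));  // no conflicting X lock exists
      if (lk & ~UPDATER)       // some thread holds S: give up
        return false;
      // Otherwise the failure was spurious; retry.
    }
    return true;
  }
};
```

On a failed compare-and-swap, `lk` holds the observed lock word, so the only assertions made are the two that must hold for any U holder. Observed S holders end the loop with `false` rather than triggering an assertion.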

74281fe1

MDEV-24951 Assertion m.first->second.valid(trx->undo_no) failed · 5c9229b9

Marko Mäkelä authored Feb 24, 2021

trx_t::commit_in_memory(): Invoke mod_tables.clear().

trx_free_at_shutdown(): Invoke mod_tables.clear() for transactions
that are discarded on shutdown.

Everywhere else, assert mod_tables.empty() on freed transaction objects.

5c9229b9

MDEV-20612 fixup: Reduce hash table lookups · 21987e59

Marko Mäkelä authored Feb 22, 2021

Let us calculate the hash table cell address while we are calculating
the latch address, to avoid repeated computations of the address.
The latch address can be derived from the cell address with a simple
bitmask operation.
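
One way such a derivation can work is sketched below, assuming a hypothetical layout in which the array is divided into aligned 64-byte groups whose first slot serves as the latch. `Group`, `cell_of()` and `latch_of()` are illustrative names, and the group size is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kGroupBytes = 64;  // assumed group size (power of two)
constexpr std::size_t kSlotsPerGroup = kGroupBytes / sizeof(void *);

// slot[0] acts as the latch; slot[1..] are hash cells (hypothetical layout).
struct alignas(kGroupBytes) Group {
  void *slot[kSlotsPerGroup];
};
static_assert(sizeof(Group) == kGroupBytes, "group must fill its alignment");

// Compute the cell address once from the hash value, skipping latch slots.
inline void **cell_of(Group *groups, std::size_t n_groups, std::size_t hash) {
  const std::size_t cells_per_group = kSlotsPerGroup - 1;
  const std::size_t i = hash % (n_groups * cells_per_group);
  return &groups[i / cells_per_group].slot[1 + i % cells_per_group];
}

// Derive the latch address from the cell address by clearing the low bits.
inline void **latch_of(void **cell) {
  const auto addr = reinterpret_cast<std::uintptr_t>(cell);
  return reinterpret_cast<void **>(addr & ~std::uintptr_t(kGroupBytes - 1));
}
```

A caller would compute `cell_of()` once and pass the cell around, obtaining the latch with a single AND instead of hashing again.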

21987e59

MDEV-24967 : Signal 11 on ha_innodb.cc::bg_wsrep_kill_trx line 18611 · f2428b9c
Jan Lindström authored Feb 24, 2021
```
Null pointer reference in the case where bf_thd has no trx, e.g. when
we have an MDL conflict.
```
f2428b9c
MDEV-24967 : Signal 11 on ha_innodb.cc::bg_wsrep_kill_trx line 18611 · cea03285
Jan Lindström authored Feb 24, 2021
```
Null pointer reference in the case where bf_thd has no trx, e.g. when
we have an MDL conflict.
```
cea03285

MDEV-24953: 10.5.9 crashes with large IN() list · f83e2ecc

Sergei Petrunia authored Feb 23, 2021

The problem was in and_all_keys(), the code of MDEV-9759 which calculates
the new tree weight:

First, it didn't take into account the case when

(next->next_key_part=tmp) == NULL

and dereferenced a NULL pointer when getting tmp->weight.

Second, "if (param->alloced_sel_args > SEL_ARG::MAX_SEL_ARGS) break"
could leave the loop with incorrect value of weight.

Fixed by introducing SEL_ARG::update_weight_locally() and calling it
at the end of the function. This avoids having to care about all the
above cases.

f83e2ecc

MDEV-20857: perf schema conflict name filename_hash · 2628fa2d

Daniel Black authored Feb 05, 2021

filename_hash is a function from the system's libiberty.a
but also an exported name in the perf schema static library.

We'll use a different name.

2628fa2d