Commits · 9d1466522ea92963ac6ca16b597392714280c9f1 · nexedi / MariaDB

30 Aug, 2023 3 commits

MDEV-32029 Assertion failures in log_sort_flush_list upon crash recovery · 9d146652

Marko Mäkelä authored Aug 30, 2023

In commit 0d175968 (MDEV-31354)
we only waited until no buf_pool.flush_list writes were in progress.
The buf_flush_page_cleaner() thread could still initiate page writes
from the buf_pool.LRU list while only holding buf_pool.mutex, not
buf_pool.flush_list_mutex. This is something that was changed in
commit a55b951e (MDEV-26827).

log_sort_flush_list(): Wait for the buf_flush_page_cleaner() thread to
be completely idle, including LRU flushing.

buf_flush_page_cleaner(): Always broadcast buf_pool.done_flush_list
when becoming idle, so that log_sort_flush_list() will be woken up.
Also, ensure that buf_pool.n_flush_inc() or
buf_pool.flush_list_set_active() has been invoked before any page
writes are initiated.

buf_flush_try_neighbors(): Release buf_pool.mutex here and not in the
callers, to avoid code duplication. Make innodb_flush_neighbors=ON
obey the innodb_io_capacity limit.

9d146652

MDEV-30986 Slow full index scan for I/O bound case · 31ea201e

Marko Mäkelä authored Aug 30, 2023

buf_page_init_for_read(): Test a condition before acquiring a latch,
not while holding it.

buf_read_ahead_linear(): Do not use a memory transaction, because it
could be too large, leading to frequent retries.
Release the hash_lock as early as possible.

31ea201e

MDEV-31545 Revert "Fix gcc warning for wsrep_plug" · 9b1b4a6f
Daniel Black authored Aug 30, 2023
```
This reverts commit 38fe266e.

The correct fix was pushed to the 10.4 branch
(fbc157ab)
```
9b1b4a6f

25 Aug, 2023 5 commits

MDEV-31835 Remove unnecessary extra HA_EXTRA_IGNORE_INSERT call · c4382848

Thirunarayanan Balathandayuthapani authored Aug 25, 2023

- HA_EXTRA_IGNORE_INSERT is called for every inserted row,
and on partitioned tables for every row * every partition.
This slows down LOAD DATA operations.

- Under bulk operation, multiple insert statement error handling
will end up emptying the table. This behaviour was introduced by
commit 8ea923f5 (MDEV-24818).
This makes the HA_EXTRA_IGNORE_INSERT call redundant. We can
use the same behaviour for INSERT IGNORE statements as well.

- Removed the extra HA_EXTRA_IGNORE_INSERT call
to improve the performance of LOAD DATA.

c4382848

Clean up buf_LRU_remove_hashed() · 08a549c3

Marko Mäkelä authored Aug 25, 2023

buf_LRU_block_remove_hashed(): Test for "not ROW_FORMAT=COMPRESSED" first,
because in that case we can assume that an uncompressed page exists.
This removes a condition from the likely code branch.

08a549c3

MDEV-30100: Assertion purge_sys.tail.trx_no <= purge_sys.rseg->last_trx_no() · f7780a8e

Marko Mäkelä authored Aug 25, 2023

trx_t::commit_empty(): A special case of transaction "commit" when
the transaction was actually rolled back or the persistent undo log
is empty. In this case, we need to change the undo log header state to
TRX_UNDO_CACHED and move the undo log from rseg->undo_list to
rseg->undo_cached for fast reuse. Furthermore, unless this is the only
undo log record in the page, we will remove the record and rewind
TRX_UNDO_PAGE_START, TRX_UNDO_PAGE_FREE, TRX_UNDO_LAST_LOG.

We must also ensure that the system-wide transaction identifier
will be persisted up to this->id, so that there will not be warnings or
errors due to a PAGE_MAX_TRX_ID being too large. We might have modified
secondary index pages before being rolled back, and any changes of
PAGE_MAX_TRX_ID are never rolled back.

Even though it is not going to be written persistently anywhere,
we will invoke trx_sys.assign_new_trx_no(this), so that in the test
innodb.instant_alter everything will be purged as expected.

trx_t::write_serialisation_history(): Renamed from
trx_write_serialisation_history(). If there is no undo log,
invoke commit_empty().

trx_purge_add_undo_to_history(): Simplify an assertion and remove a
comment. This function will not be invoked on an empty undo log anymore.

trx_undo_header_create(): Add a debug assertion.

trx_undo_mem_create_at_db_start(): Remove a duplicated assignment.

Reviewed by: Vladislav Lesin
Tested by: Matthias Leich

f7780a8e

MDEV-30100 preparation: Simplify InnoDB transaction commit further · 4ff5311d

Marko Mäkelä authored Aug 25, 2023

trx_commit_complete_for_mysql(): Remove some conditions.
We will rely on trx_t::commit_lsn.

trx_t::must_flush_log_later: Remove. trx_commit_complete_for_mysql()
can simply check for trx_t::flush_log_later.

trx_t::commit_in_memory(): Set commit_lsn=0 if the log was written.

trx_flush_log_if_needed_low(): Renamed to trx_flush_log_if_needed().
Assert that innodb_flush_log_at_trx_commit!=0 was checked by
the caller and that the transaction is not in XA PREPARE state.
Unconditionally flush the log for data dictionary transactions,
to ensure the correct processing of ddl_recovery.log.

trx_write_serialisation_history(): Move some code from
trx_purge_add_undo_to_history().

trx_prepare(): Invoke log_write_up_to() directly if needed.

innobase_commit_ordered_2(): Simplify some conditions.
A read-write transaction will always carry nonzero trx_t::id.
Let us unconditionally reset mysql_log_file_name, flush_log_later
after trx_t::commit() was invoked.

4ff5311d

MDEV-30100 preparation: Simplify InnoDB transaction commit · f4bbea90

Marko Mäkelä authored Aug 25, 2023

trx_commit_cleanup(): Clean up any temporary undo log.
Replaces trx_undo_commit_cleanup() and trx_undo_seg_free().

trx_write_serialisation_history(): Commit the mini-transaction.
Do not touch temporary undo logs. Assume that a persistent rollback
segment has been assigned.

trx_serialise(): Merged into trx_write_serialisation_history().

trx_t::commit_low(): Correct some comments and assertions.

trx_t::commit_persist(): Only invoke commit_low() on a mini-transaction
if the persistent state needs to change.

f4bbea90

24 Aug, 2023 3 commits

Merge 10.5 into 10.6 · eda75cad
Marko Mäkelä authored Aug 24, 2023

eda75cad
Merge 10.4 into 10.5 · aeb8eae5
Marko Mäkelä authored Aug 24, 2023

aeb8eae5

MDEV-31813 SET GLOBAL innodb_max_purge_lag_wait hangs if innodb_read_only · 02878f12

Marko Mäkelä authored Aug 24, 2023

innodb_max_purge_lag_wait_update(): Return immediately if we are
in high_level_read_only mode.

srv_wake_purge_thread_if_not_active(): Relax a debug assertion.
If srv_read_only_mode holds, purge_sys.enabled() will not hold
and this function will do nothing.

trx_t::commit_in_memory(): Remove a redundant condition before
invoking srv_wake_purge_thread_if_not_active().

02878f12

23 Aug, 2023 3 commits

Merge 10.5 into 10.6 · 4e7d2e73
Yuchen Pei authored Aug 23, 2023

4e7d2e73
Merge 10.4 into 10.5 · 0d88365b
Yuchen Pei authored Aug 23, 2023

0d88365b

MDEV-31117 Fix spider connection info parsing · e9f3ca61

Yuchen Pei authored Jul 05, 2023

A Spider connection string is a comma-separated list of parameter
definitions, where each definition is of the form
"<param_title> <param_value>", where <param_value> is quote delimited
on both ends, with backslashes acting as an escaping prefix.

Despite the simple syntax, the existing Spider connection string
parser was poorly written, complex, hard to reason about and error-prone,
causing issues like the one described in MDEV-31117. For example, it
treated the param title the same way as the param value when assigning,
and had nonsensical fields like delim_title_len and delim_title.
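
To make the syntax described above concrete, here is a minimal sketch of a parser for it. This is not Spider's actual code: the function name `parse_params`, the choice of `'` and `"` as accepted quote characters, the tolerance for a trailing comma, and the rule that a backslash makes the next character literal are all assumptions made for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Hypothetical helper (not part of Spider): parses input such as
//   host "h1", user 'spider'
// into a title -> value map. Returns std::nullopt on malformed input.
std::optional<std::map<std::string, std::string>>
parse_params(const std::string &s)
{
  std::map<std::string, std::string> out;
  std::size_t i = 0;
  auto is_space = [&](std::size_t p) {
    return std::isspace(static_cast<unsigned char>(s[p])) != 0;
  };
  auto skip_ws = [&] { while (i < s.size() && is_space(i)) i++; };

  for (;;) {
    skip_ws();
    if (i == s.size()) return out;     // empty input or trailing comma

    // Title: everything up to whitespace, a quote, or a comma.
    std::size_t start = i;
    while (i < s.size() && !is_space(i) &&
           s[i] != '"' && s[i] != '\'' && s[i] != ',')
      i++;
    if (i == start) return std::nullopt;          // missing title
    std::string title = s.substr(start, i - start);

    // Value: must start with a quote and end with the same quote.
    skip_ws();
    if (i == s.size() || (s[i] != '"' && s[i] != '\''))
      return std::nullopt;
    const char quote = s[i++];
    std::string value;
    bool closed = false;
    while (i < s.size()) {
      char c = s[i++];
      if (c == '\\') {                 // assumed: next character is literal
        if (i == s.size()) return std::nullopt;
        value += s[i++];
      } else if (c == quote) {
        closed = true;
        break;
      } else {
        value += c;
      }
    }
    if (!closed) return std::nullopt;             // unterminated value
    out[title] = value;

    // Either end of input or a comma before the next definition.
    skip_ws();
    if (i == s.size()) return out;
    if (s[i++] != ',') return std::nullopt;
  }
}
```

Keeping the title and the value in separate code paths reflects the point made in the commit message: the title is a bare word, and only the value is quoted and subject to escaping.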

Thus as part of the bugfix, we clean up the spider comment connection
string parsing, including:

- Factoring out some code from the parsing function
- Simplifying the struct `st_spider_param_string_parse`
- Making any necessary changes caused by the above changes

e9f3ca61

22 Aug, 2023 3 commits

Merge 10.5 into 10.6 · 07494006
Marko Mäkelä authored Aug 22, 2023

07494006
Merge 10.4 into 10.5 · f9cc2982
Marko Mäkelä authored Aug 22, 2023

f9cc2982

MDEV-20194 test adjustment for s390x · ff682ead

Marko Mäkelä authored Aug 22, 2023

The test innodb.row_size_error_log_warnings_3 that was added in
commit 372b0e63 (MDEV-20194)
failed to take into account the earlier adjustment in
commit cf574cf5 (MDEV-27634)
that is specific to many GNU/Linux distributions for the s390x.

ff682ead

21 Aug, 2023 5 commits

Remove bogus references to replaced Google contributions · a60462d9

Marko Mäkelä authored Aug 21, 2023

In commit 03ca6495 and
commit ff5d306e
we forgot to remove some Google copyright notices related to
a contribution of using atomic memory access in the old InnoDB
mutex_t and rw_lock_t implementation.

The copyright notices had been mostly added in
commit c6232c06
due to commit a1bb700f.

The following Google contributions remain:
* some logic related to the parameter innodb_io_capacity
* innodb_encrypt_tables, added in MariaDB Server 10.1

a60462d9

Clean up buf0buf.inl · 6cc88c3d

Marko Mäkelä authored Aug 21, 2023

Let us move some #include directives from buf0buf.inl to
the compilation units where they are really used.

6cc88c3d

Merge 10.5 into 10.6 · 448c2077
Marko Mäkelä authored Aug 21, 2023

448c2077
Make vgdb call more universal. · c062b351
Oleksandr Byelkin authored Aug 21, 2023

c062b351

Remove a stale comment · be5fd3ec

Marko Mäkelä authored Aug 21, 2023

buf_LRU_block_remove_hashed(): Remove a comment that had been added
in mysql/mysql-server@aad1c7d0dd8a152ef6bb685356c68ad9978d686a
and apparently referring to buf_LRU_invalidate_tablespace(),
which was later replaced with buf_LRU_flush_or_remove_pages() and
ultimately with buf_flush_remove_pages() and buf_flush_list_space().
All that code is covered by buf_pool.mutex. The note about releasing
the hash_lock for the buf_pool.page_hash slice would actually apply to
the last reference to hash_lock in buf_LRU_free_page(), for the
case zip=false (retaining a ROW_FORMAT=COMPRESSED page while
discarding the uncompressed one).

be5fd3ec

18 Aug, 2023 1 commit

MDEV-29693 ANALYZE TABLE still flushes table definition cache when engine-independent statistics is used · a6bf4b58

Monty authored Aug 05, 2023

This commit enables reloading of engine-independent statistics
without flushing the table from the table definition cache.

This is achieved by allowing multiple versions of the
TABLE_STATISTICS_CB object and having independent pointers to it in
TABLE and TABLE_SHARE. The TABLE_STATISTICS_CB objects are reference
counted and are freed when nothing points to them anymore.

TABLE's TABLE_STATISTICS_CB pointer is updated to use the
TABLE_SHARE's pointer when read_statistics_for_tables() is called at
the beginning of a query.
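
The versioning scheme described above can be sketched with `std::shared_ptr`. This is only an analogy under stated assumptions: `Stats`, `Share` and `Table` are hypothetical stand-ins for TABLE_STATISTICS_CB, TABLE_SHARE and TABLE, and the server's own reference counting is not necessarily implemented this way.

```cpp
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for TABLE_STATISTICS_CB: one immutable version.
struct Stats {
  long records;
};

// Hypothetical stand-in for TABLE_SHARE: holds the newest version.
struct Share {
  std::mutex lock;
  std::shared_ptr<const Stats> stats;

  // Publish a new version; older versions stay alive while referenced.
  void publish(long records) {
    auto fresh = std::make_shared<const Stats>(Stats{records});
    std::lock_guard<std::mutex> guard(lock);
    stats = std::move(fresh);
  }

  std::shared_ptr<const Stats> current() {
    std::lock_guard<std::mutex> guard(lock);
    return stats;
  }
};

// Hypothetical stand-in for TABLE: keeps its own pointer to a version.
struct Table {
  std::shared_ptr<const Stats> stats;

  // At the beginning of a query, switch to the share's newest version.
  void begin_query(Share &share) { stats = share.current(); }
};
```

A table that is in the middle of a query keeps the version it started with, and a version is released once neither the share nor any table points to it. That is why new statistics can be loaded without evicting the table from the cache.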

Main changes:
- read_statistics_for_table() will allocate a new TABLE_STATISTICS_CB
object.
- All get_stat_values() functions have a new parameter that tells
where collected data should be stored. get_stat_values() no longer
uses the table_field object to store data.
- All get_stat_values() functions return 1 if they found any
data in the statistics tables.

Other things:
- Fixed INSERT DELAYED to not read statistics tables.
- Removed Statistics_state from TABLE_STATISTICS_CB as this is not
needed anymore, because we are not changing TABLE_SHARE->stats_cb while
calculating or loading statistics.
- Store values used with store_from_statistical_minmax_field() in
TABLE_STATISTICS_CB::mem_root. This allowed me to remove the function
delete_stat_values_for_table_share().
- Field_blob::store_from_statistical_minmax_field() is implemented
but is not normally used as we do not yet support EIS statistics
for blobs. For example Field_blob::update_min() and
Field_blob::update_max() are not implemented.
Note that the function can be called if there is a concurrent
"ALTER TABLE MODIFY field BLOB" running because of a bug in
ALTER TABLE where it deletes entries from column_stats
before it has an exclusive lock on the table.
- Use result of field->val_str(&val) as a pointer to the result
instead of val (safety fix).
- Allocate memory for collected statistics in THD::mem_root, not
in TABLE::mem_root. Allocating in TABLE::mem_root could cause the
TABLE object to grow if ANALYZE TABLE was run many times on the same table.
This was done in allocate_statistics_for_table(),
create_min_max_statistical_fields_for_table() and
create_min_max_statistical_fields_for_table_share().
- Store in TABLE_STATISTICS_CB::stats_available which statistics were
found in the statistics tables.
- Removed index_table from class Index_prefix_calc as it was not used.
- Added TABLE_SHARE::LOCK_statistics to ensure we don't load EITS
in parallel. First thread will load it, others will reuse the
loaded data.
- Eliminate read_histograms_for_table(). The loading happens within
read_statistics_for_tables() if histograms are needed.
One downside is that if we have read statistics without histograms
before and someone requires histograms, we have to read all statistics
again (once) from the statistics tables.
A smaller downside is the need to call alloc_root() for each
individual histogram. Before we could allocate all the space for
histograms with a single alloc_root.
- Fixed a bug in MyISAM and Aria where they did not properly notice
that the table had changed after ANALYZE TABLE. This was not a problem
before this patch, as the MyISAM and Aria tables were then flushed
as part of ANALYZE TABLE, which hid this issue.
- Fixed a bug in ANALYZE table where table->records could be seen as 0
in collect_statistics_for_table(). The effect of this unlikely bug
was that a full table scan could be done even if
analyze_sample_percentage was not set to 1.
- Changed multiple mallocs in a row to use multi_alloc_root().
- Added a mutex protection in update_statistics_for_table() to ensure
that several tables are not updating the statistics at the same time.

Some of the changes in sql_statistics.cc are based on a patch from
Oleg Smirnov <olernov@gmail.com>
Co-authored-by: Oleg Smirnov <olernov@gmail.com>
Co-authored-by: Vicentiu Ciorbaru <cvicentiu@gmail.com>
Reviewer: Sergei Petrunia <sergey@mariadb.com>

a6bf4b58

17 Aug, 2023 4 commits

Merge 10.4 into 10.5 · 5895a362
Marko Mäkelä authored Aug 17, 2023

5895a362
MDEV-31928 Assertion xid ... < 128 failed in trx_undo_write_xid() · 5a8a8fc9
Marko Mäkelä authored Aug 17, 2023
```
trx_undo_write_xid(): Correct an off-by-one error in a debug assertion.
```
5a8a8fc9

MDEV-31254 InnoDB: Trying to read doublewrite buffer page · 518fe519

Marko Mäkelä authored Aug 17, 2023

buf_read_page_low(): Remove an error message that could be triggered
by buf_read_ahead_linear() or buf_read_ahead_random().

This is a backport of commit c9eff1a1
from MariaDB Server 10.5.

518fe519

MDEV-31875 ROW_FORMAT=COMPRESSED table: InnoDB: ... Only 0 bytes read · 44df6f35

Marko Mäkelä authored Aug 17, 2023

buf_read_ahead_random(), buf_read_ahead_linear(): Avoid read-ahead
of the last page(s) of ROW_FORMAT=COMPRESSED tablespaces that use
a page size of 1024 or 2048 bytes. We invoke os_file_set_size() on
integer multiples of 4096 bytes in order to be compatible with
the requirements of innodb_flush_method=O_DIRECT regardless of the
physical block size of the underlying storage.

This change must be null-merged to MariaDB Server 10.5 and later.
There, out-of-bounds read-ahead should be handled gracefully
by simply discarding the buffer page that had been allocated.

Tested by: Matthias Leich

44df6f35

16 Aug, 2023 2 commits

MDEV-29974: Missed kill waiting for worker queues to drain · 34e85854

Kristian Nielsen authored Aug 16, 2023

When the SQL driver thread goes to wait for room in the parallel slave
worker queue, there was a race where a kill at the right moment could
be ignored and the wait proceed uninterrupted by the kill.

Fix by moving the THD::check_killed() to occur _after_ doing ENTER_COND().
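
The race can be illustrated with a generic condition-variable sketch. It is an analogy rather than the server code: `WorkerQueue`, `wait_for_room()`, `kill()` and `make_room()` are hypothetical names, and holding a plain `std::mutex` here stands in for the point after ENTER_COND() where a kill is guaranteed to wake the waiter.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the worker queue state.
struct WorkerQueue {
  std::mutex mtx;
  std::condition_variable cond;
  bool has_room = false;
  bool killed = false;

  // Returns true once there is room, false if the waiter was killed.
  bool wait_for_room() {
    std::unique_lock<std::mutex> lock(mtx);
    while (!has_room) {
      // The kill flag is tested only while holding the mutex that the
      // killer also takes. Testing it before taking the mutex would leave
      // a window in which a kill is neither seen nor able to wake us.
      if (killed)
        return false;
      cond.wait(lock);
    }
    return true;
  }

  void kill() {
    {
      std::lock_guard<std::mutex> guard(mtx);
      killed = true;
    }
    cond.notify_all();
  }

  void make_room() {
    {
      std::lock_guard<std::mutex> guard(mtx);
      has_room = true;
    }
    cond.notify_all();
  }
};
```

In this sketch a kill either happens before the waiter takes the mutex, in which case the flag is seen, or while the waiter is blocked in `cond.wait()`, in which case the notification wakes it. There is no interval in which the kill can be lost.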

This bug was seen as sporadic failure of the testcase rpl.rpl_parallel
(rpl.rpl_parallel_gco_wait_kill since 10.5), with "Slave stopped with
wrong error code".
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>

34e85854

After-merge cleanup for MDEV-27207 + MDEV-31719 · 88dd50b8

Alexander Barkov authored Aug 16, 2023

Something went wrong during a merge (from 10.5 to 10.6)
of 68403eed
(fixing bugs MDEV-27207 and MDEV-31719).

Originally (in 10.5) the fix was done in in_inet6::set() in
plugin/type_inet/sql_type_inet.cc.
In 10.6 this code resides in a different place:
in the method in_fbt::set() of a template class
in sql/sql_type_fixedbin.h.

During the merge:
- the fix did not properly migrate to in_fbt::set()
- the related MTR tests disappeared

This patch fixes in_fbt::set() properly and restores MTR tests.

88dd50b8

15 Aug, 2023 11 commits

MDEV-9938 Prepared statement return wrong result (missing row) · ca5c122a

Monty authored Aug 11, 2023

The problem is that the first execution of the prepared statement makes
a permanent optimization of converting the LEFT JOIN to an INNER JOIN.

This is based on the assumption that all the user parameters (?) are
always constants and that parameters to Item_cond() will not change value
between true and false across different executions.

(The example was using IS NULL, which will change value
depending on whether the parameter is NULL or not).

The fix is to change Item_cond::fix_fields() and
Item_cond::eval_not_null_tables() to not treat user parameters as
constants. This will ensure that we don't do the LEFT JOIN -> INNER
JOIN conversion that causes problems.

There are also some things that need to be improved regarding
calculations of not_null_tables_cache as we get a different value for
WHERE 1 or t1.a=1
compared to
WHERE t1.a=1 or 1

Changes done:
- Mark Item_param with the PARAM flag to be able to quickly check
  in Item_cond::eval_not_null_tables() if an item contains a
  prepared statement parameter (just like we check for stored procedure
  parameters).
- Fixed that Item_cond::not_null_tables_cache is not depending on
  order of arguments.
- Don't call item->eval_const_cond() for items that are NOT on the top
  level of the WHERE clause. This removed a lot of unnecessary
  warnings in the test suite!
- Do not reset not_null_tables_cache for not top level items.
- Simplified Item_cond::fix_fields by calling eval_not_null_tables()
  instead of having duplication of all the code in
  eval_not_null_tables().
- Return an error if Item_cond::fix_fields() generates an error.
  The old code did generate an error in some cases, but not in all
  cases.
  - Fixed all handling of the above error in make_cond_for_tables().
    The error handling by the callers did not exist before, which
    could lead to asserts in many different places in the old code.
  - All changes in sql_select.cc are just checking the return value of
    fix_fields() and make_cond_for_tables() and returning an error
    value if fix_fields() returns true or make_cond_for_tables()
    returns NULL and is_error() is set.
- Mark Item_cond as const_item if all arguments return true for
  can_eval_in_optimize().

Reviewer: Sergei Petrunia <sergey@mariadb.com>

ca5c122a

(Null) Merge 10.5 -> 10.6 · f6dd1308
Kristian Nielsen authored Aug 15, 2023
```
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
```
f6dd1308
Merge 10.4 into 10.5 · 7c9837ce
Kristian Nielsen authored Aug 15, 2023
```
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
```
7c9837ce

MDEV-31482: Lock wait timeout with INSERT-SELECT, autoinc, and statement-based replication · 805e0668

Kristian Nielsen authored Jul 09, 2023

Remove the exception that InnoDB does not report auto-increment lock waits
to parallel replication.

There was an assumption that these waits could not cause conflicts with
in-order parallel replication and thus need not be reported. However, this
assumption is wrong and it is possible to get conflicts that lead to hangs
for the duration of --innodb-lock-wait-timeout. This can be seen with three
transactions:

1. T1 is waiting for T3 on an autoinc lock
2. T2 is waiting for T1 to commit
3. T3 is waiting on a normal row lock held by T2

Here, T3 needs to be deadlock killed on the wait by T1.
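
The three-transaction scenario above forms a cycle in the wait-for graph, and the cycle is only visible if the autoinc wait is reported. A minimal sketch, assuming each transaction waits for at most one other and using hypothetical names (`WaitsFor`, `in_wait_cycle`) that do not exist in the server:

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical model: each transaction waits for at most one other.
using WaitsFor = std::map<std::string, std::string>;

// Returns true if following the wait-for edges from `start`
// leads back to `start`.
bool in_wait_cycle(const WaitsFor &waits, const std::string &start)
{
  std::set<std::string> seen;
  std::string cur = start;
  for (;;) {
    auto it = waits.find(cur);
    if (it == waits.end())
      return false;                    // chain ends: no cycle
    cur = it->second;
    if (cur == start)
      return true;                     // came back to the start
    if (!seen.insert(cur).second)
      return false;                    // a cycle that excludes `start`
  }
}
```

With the edges T1 -> T3 (autoinc lock), T2 -> T1 (commit order) and T3 -> T2 (row lock), `in_wait_cycle(waits, "T3")` returns true. If the T1 -> T3 edge is left out, as when the autoinc wait is not reported, the chain from T3 ends at T1 and no cycle is found, which corresponds to the hang that lasts until the lock wait timeout.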
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>

805e0668

MDEV-31655: Parallel replication deadlock victim preference code erroneously removed · 18acbaf4

Kristian Nielsen authored Aug 05, 2023

Restore code to make InnoDB choose the second transaction as a deadlock
victim if two transactions deadlock that need to commit in-order for
parallel replication. This code was erroneously removed when VATS was
implemented in InnoDB.

Also add a test case for InnoDB choosing the right deadlock victim.
Also fixes this bug, with testcase that reliably reproduces:

MDEV-28776: rpl.rpl_mark_optimize_tbl_ddl fails with timeout on sync_with_master
Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>

18acbaf4

MDEV-31655: Parallel replication deadlock victim preference code erroneously removed · 900c4d69

Kristian Nielsen authored Jul 11, 2023

Restore code to make InnoDB choose the second transaction as a deadlock
victim if two transactions deadlock that need to commit in-order for
parallel replication. This code was erroneously removed when VATS was
implemented in InnoDB.

Also add a test case for InnoDB choosing the right deadlock victim.
Also fixes this bug, with testcase that reliably reproduces:

MDEV-28776: rpl.rpl_mark_optimize_tbl_ddl fails with timeout on sync_with_master

Note: This should be null-merged to 10.6, as a different fix is needed
there due to InnoDB locking code changes.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>

900c4d69

MDEV-31482: Lock wait timeout with INSERT-SELECT, autoinc, and statement-based replication · 920789e9

Kristian Nielsen authored Jul 09, 2023

Remove the exception that InnoDB does not report auto-increment lock waits
to parallel replication.

There was an assumption that these waits could not cause conflicts with
in-order parallel replication and thus need not be reported. However, this
assumption is wrong and it is possible to get conflicts that lead to hangs
for the duration of --innodb-lock-wait-timeout. This can be seen with three
transactions:

1. T1 is waiting for T3 on an autoinc lock
2. T2 is waiting for T1 to commit
3. T3 is waiting on a normal row lock held by T2

Here, T3 needs to be deadlock killed on the wait by T1.

Note: This should be null-merged to 10.6, as a different fix is needed
there due to InnoDB lock code changes.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>

920789e9

Remove the often-hanging test innodb.alter_rename_files · b4ace139

Marko Mäkelä authored Aug 15, 2023

The test innodb.alter_rename_files rather frequently hangs in
checkpoint_set_now. The test was removed in MariaDB Server 10.5
commit 37e7bde1 when the code that
it aimed to cover was simplified. Starting with MariaDB Server 10.5,
page flushing and log checkpointing are much simpler, handled
by the single buf_flush_page_cleaner() thread.

Let us remove the test to avoid occasional failures. We are not going
to fix the cause of the failure in MariaDB Server 10.4.

b4ace139

Merge 10.5 into 10.6 · 3fee1b44
Marko Mäkelä authored Aug 15, 2023

3fee1b44
Merge 10.4 into 10.5 · 599c4d9a
Marko Mäkelä authored Aug 15, 2023

599c4d9a
Merge mariadb-10.4.31 into 10.4 · 6fdc6846
Marko Mäkelä authored Aug 15, 2023

6fdc6846