Commits · 21d03cb08a6a0ebd17fc17b9bf77300a275fb989 · nexedi / MariaDB

26 Oct, 2021 19 commits

MDEV-26654 ROW_NUMBER is wrong upon INSERT into Federated table · 21d03cb0
Sergei Golubchik authored Oct 07, 2021
```
just a test case
```
21d03cb0

MDEV-26693 ROW_NUMBER is wrong upon INSERT or UPDATE on Spider table · 5c0b6345

Sergei Golubchik authored Oct 07, 2021

in case of a bulk insert the server sends all rows to the engine, and
then the engine replies that there was ER_DUP_ENTRY somewhere.

the exact number of the row that caused the error is unknown.

5c0b6345

fix RESIGNAL to save and pass the m_row_count too · 9bbd3282
Sergei Golubchik authored Oct 08, 2021

9bbd3282

refactor THD::raise_condition() family · b73b7365

Sergei Golubchik authored Oct 08, 2021

to remove

Sql_condition*
raise_condition(const Sql_condition *cond)
{
  Sql_condition *raised= raise_condition(cond->get_sql_errno(),
                                         cond->get_sqlstate(),
                                         cond->get_level(),
                                         *cond,
                                         cond->get_message_text());
  return raised;
}

b73b7365

MDEV-26635 ROW_NUMBER is not 0 for errors not caused because of rows · a398fcbf
Sergei Golubchik authored Oct 02, 2021

a398fcbf
the error should be on the second row, not the first · f845a983
Sergei Golubchik authored Oct 02, 2021
```
otherwise how can we know that the row counter is incremented?
```
f845a983

MDEV-26832: ROW_NUMBER in SIGNAL/RESIGNAL causes a syntax error · ff5de38d

Rucha Deodhar authored Oct 15, 2021

Analysis: Parser was missing ROW_NUMBER as syntax for SIGNAL and RESIGNAL.
Fix: Fix the parser and fix how m_row_number is copied, like other attributes,
to prevent ROW_NUMBER from assuming the default value.

ff5de38d

MDEV-26767 Server crashes when rename table and alter storage engine · b15a5f6f

Aleksey Midenkov authored Oct 11, 2021

Wrong assertion leftover removed. m_sql_cmd can be allocated by any
ALTER subcommand and before allocation it is checked for NULL first.

b15a5f6f

MDEV-22165 CONVERT TABLE: move in partition from existing table · 69724805

Aleksey Midenkov authored Sep 28, 2021

Syntax for CONVERT TABLE

ALTER TABLE tbl_name CONVERT TABLE tbl_name TO PARTITION partition_name partition_spec

Examples:

    ALTER TABLE t1 CONVERT TABLE tp2 TO PARTITION p2 VALUES LESS THAN MAX_VALUE();

New ALTER_PARTITION_CONVERT_IN command for
fast_alter_partition_table() is done in alter_partition_convert_in()
function which basically does ha_rename_table().

Table structure and data check is basically the same as in EXCHANGE
PARTITION command. And these are done by
compare_table_with_partition() and check_table_data().

Atomic DDL is done by the scheme from MDEV-22166 (see the
corresponding commit message). The only difference is that it also has
to drop the source table frm, and that is done by WFRM_DROP_CONVERTED_FROM.

Initial patch was done by Dmitry Shulga <dmitry.shulga@mariadb.com>

69724805

Review and crash-safety fix · 7da721be
Aleksey Midenkov authored Sep 27, 2021

7da721be
cleanup: reduce error injection noise in partitioning · 42802452
Sergei Golubchik authored Sep 12, 2021

42802452

MDEV-22166 CONVERT PARTITION: move out partition into a table · b7bba721

Aleksey Midenkov authored Sep 09, 2021

Syntax for CONVERT keyword

ALTER TABLE tbl_name
    [alter_option [, alter_option] ...] |
    [partition_options]

partition_option: {
    ...
    | CONVERT PARTITION partition_name TO TABLE tbl_name
}

Examples:

    ALTER TABLE t1 CONVERT PARTITION p2 TO TABLE tp2;

New ALTER_PARTITION_CONVERT_OUT command for
fast_alter_partition_table() is done in alter_partition_convert_out()
function which basically does ha_rename_table().

Partition to extract is marked with the same flag as dropped
partition: PART_TO_BE_DROPPED. Note that we cannot have multiple
partitioning commands in one ALTER.

For DDL logging, the principle is basically the same as for other
fast_alter_partition_table() commands. The only difference is that it
integrates late Atomic DDL functions and introduces an additional phase,
WFRM_BACKUP_ORIGINAL. That phase is required for binlog consistency,
because otherwise we could not revert after WFRM_INSTALL_SHADOW
is done. If we crashed or failed before the DDL log was complete, the
altered table would already be new but the binlog would miss that ALTER
command. Note that this is different from all other atomic DDL in that
it rolls back until ddl_log_complete() is done, even if everything
was done fully before the crash.

Test cases added to:

  parts.alter_table \
  parts.partition_debug \
  versioning.partition \
  atomic.alter_partition

b7bba721

MDEV-26471 Syntax extension: do not require PARTITION keyword in partition definition · f6b0e34c

Aleksey Midenkov authored Sep 09, 2021

Instead of

  create or replace table t1 (x int)
  partition by range(x) (
    partition p1 values less than (10),
    partition pn values less than maxvalue);

it should be possible to type in shorter form:

  create or replace table t1 (x int)
  partition by range(x) (
    p1 values less than (10),
    pn values less than maxvalue);

As above examples demonstrate, make PARTITION keyword in partition
definition optional.

f6b0e34c

MDEV-22165: Prerequisite patch that adds missing data member initializers in... · 379ddf49

Dmitry Shulga authored Aug 26, 2021

MDEV-22165: Prerequisite patch that adds missing data member initializers in constructors of the class Alter_table_ctx

Static analyzer built in Eclipse CDT complained about missing initializers in
constructors of the class Alter_table_ctx so I've added them in order to
eliminate annoying warnings.

379ddf49

Vanilla cleanups and refactorings · d324c03d

Aleksey Midenkov authored Sep 09, 2021

Dead code cleanup:

part_info->num_parts usage was wrong and working incorrectly in
mysql_drop_partitions() because num_parts is already updated in
prep_alter_part_table(). We don't have to update part_info->partitions
because part_info is destroyed at alter_partition_lock_handling().

Cleanups:

- DBUG_EVALUATE_IF() macro replaced by shorter form DBUG_IF();
- Typo in ER_KEY_COLUMN_DOES_NOT_EXITS.

Refactorings:

- Split write_log_replace_delete_frm() into write_log_delete_frm()
  and write_log_replace_frm();
- partition_info via DDL_LOG_STATE;
- set_part_info_exec_log_entry() removed.

DBUG_EVALUATE removed

DBUG_EVALUATE was only added for consistency together with
DBUG_EVALUATE_IF. It is not used anywhere in the code.

DBUG_SUICIDE() fix on release build

On release builds DBUG_SUICIDE() was a statement. That was wrong, as
DBUG_SUICIDE() is used in expression context.
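
A minimal sketch of the statement-versus-expression problem described above. The macro names below (MY_SUICIDE_STMT, MY_SUICIDE_EXPR, MY_SUICIDE_NOOP) and the helper checked_value() are hypothetical stand-ins for illustration, not the real MariaDB definitions.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-ins, NOT the real MariaDB macros.
// Statement form: fine as `MY_SUICIDE_STMT();` on its own line,
// but it cannot appear inside an expression.
#define MY_SUICIDE_STMT() do { std::abort(); } while (0)

// Expression form: a void expression, usable with the comma and ?: operators.
#define MY_SUICIDE_EXPR() (std::abort())

// No-op variant that is still an expression.
#define MY_SUICIDE_NOOP() ((void) 0)

// Uses the macros in expression context.
inline int checked_value(bool crash, int value)
{
  // With MY_SUICIDE_STMT() in place of either macro, this line would not compile.
  return crash ? (MY_SUICIDE_EXPR(), 0) : (MY_SUICIDE_NOOP(), value);
}
```

Keeping the no-op variant as `((void) 0)` rather than an empty `do {} while (0)` is what lets the same call sites compile in both build types.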

d324c03d

MDEV-25292 Better debug trace · 2dc3c320
Aleksey Midenkov authored Apr 22, 2021
```
Improves readability of DDL log debug traces.
```
2dc3c320

MDEV-24621 In bulk insert, pre-sort and build indexes one page at a time · 045757af

Thirunarayanan Balathandayuthapani authored Oct 22, 2021

When inserting a number of rows into an empty table,
InnoDB will buffer and pre-sort the records for each index, and
build the indexes one page at a time.

For each index, a buffer of innodb_sort_buffer_size will be created.

If the buffer runs out of memory, we will create temporary files
for storing the data.
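
The buffer, pre-sort and spill idea above can be sketched roughly as follows. This is an illustration only: BulkSortBuffer is a hypothetical class, keys are plain integers, and an in-memory vector stands in for each temporary file. It is not InnoDB's row_merge_bulk_t.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical illustration only; not InnoDB's row_merge_bulk_t.
class BulkSortBuffer
{
public:
  explicit BulkSortBuffer(std::size_t capacity) : capacity_(capacity)
  { assert(capacity > 0); }

  // Buffer one key; when the buffer is full, sort it and spill it as a run.
  void add(int key)
  {
    if (buf_.size() == capacity_)
      spill();
    buf_.push_back(key);
  }

  // Sort what is left in memory and merge it with all spilled runs.
  std::vector<int> finish()
  {
    std::sort(buf_.begin(), buf_.end());
    std::vector<int> result;
    result.swap(buf_);
    for (const std::vector<int> &run : runs_)
    {
      std::vector<int> merged;
      merged.reserve(result.size() + run.size());
      std::merge(result.begin(), result.end(), run.begin(), run.end(),
                 std::back_inserter(merged));
      result.swap(merged);
    }
    runs_.clear();
    return result;
  }

  std::size_t spilled_runs() const { return runs_.size(); }

private:
  void spill()
  {
    std::sort(buf_.begin(), buf_.end());
    runs_.push_back(buf_);  // stands in for writing to a temporary file
    buf_.clear();
  }

  std::size_t capacity_;
  std::vector<int> buf_;
  std::vector<std::vector<int>> runs_;
};
```

With a capacity of 2 and the keys 5, 1, 4, 2, 3, two sorted runs are spilled and finish() returns the keys in ascending order, which is the order a page-at-a-time index build would consume them in.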

At the end of the statement, we will sort and apply the buffered
records. Ideally, we would do this at the end of the transaction
or only when starting to execute a non-INSERT statement on the table.
However, it could be awkward if duplicate keys or similar errors
would be reported during the execution of a later statement.
This will be addressed in MDEV-25036.

Any columns longer than 2000 bytes will be buffered in temporary files.

innodb_prepare_commit_versioned(): Apply all buffered bulk insert
operations at the end of each statement.

ha_commit_trans(): Handle errors from innodb_prepare_commit_versioned().

row_merge_buf_write(): This function should accept a blob
file handle too, and it should write the field data that is
longer than 2000 bytes.

row_merge_bulk_t: Data structure to maintain the data during
bulk insert operation.

trx_mod_table_time_t::start_bulk_insert(): Notify the start of a
bulk insert operation and create a new buffer for the given table.

trx_mod_table_time_t::add_tuple(): Buffer a record.

trx_mod_table_time_t::write_bulk(): Do the bulk insert operation
present in the transaction.

trx_mod_table_time_t::bulk_buffer_exist(): Whether the buffer
storage exists for the bulk transaction.

trx_mod_table_time_t::write_bulk(): Write all buffered insert
operations for the transaction and the table.

row_ins_clust_index_entry_low(): Insert the data into the
bulk buffer if it already exists.

row_ins_sec_index_entry(): Insert the secondary tuple
if the bulk buffer already exists.

row_merge_bulk_buf_add(): Insert the tuple into the bulk buffer
for the insert operation.

row_merge_buf_blob(): Write the field data whose length is
more than 2000 bytes into blob temporary file. Write the
file offset and length into the tuple field.

row_merge_copy_blob_from_file(): Copy the blob from blob file
handler based on reference of the given tuple.

row_merge_insert_index_tuples(): Handle blob for bulk insert
operation.

row_merge_bulk_t::row_merge_bulk_t(): Constructor. Initialize
the buffer and file for all the indexes except the FTS index.

row_merge_bulk_t::create_tmp_file(): Create new temporary file
for the given index.

row_merge_bulk_t::write_to_tmp_file(): Write the content from
buffer to disk file for the given index.

row_merge_bulk_t::add_tuple(): Insert the tuple into the merge
buffer for the given index. If memory runs out, InnoDB
should sort the buffer and write it to a file.

row_merge_bulk_t::write_to_index(): Do bulk insert operation
from merge file/merge buffer for the given index

row_merge_bulk_t::write_to_table(): Do bulk insert operation
for all the indexes.

dict_stats_update(): If a bulk insert transaction is in progress,
treat the table as empty. The index creation could hold latches
for extended amounts of time.

045757af

Merge 10.6 into 10.7 · c8e309a6
Marko Mäkelä authored Oct 26, 2021

c8e309a6

MDEV-26903: Assertion ctx->trx->state == TRX_STATE_ACTIVE on DROP INDEX · 58fe6b47

Marko Mäkelä authored Oct 26, 2021

rollback_inplace_alter_table(): Tolerate a case where the transaction
is not in an active state. If ha_innobase::commit_inplace_alter_table()
failed with a deadlock, the transaction would already have been
rolled back. This omission of error handling was introduced in
commit 1bd681c8 (MDEV-25506 part 3).

After commit c3c53926 (MDEV-26554)
it became easier to trigger DB_DEADLOCK during exclusive table lock
acquisition in ha_innobase::commit_inplace_alter_table().

lock_table_low(): Add DBUG injection "innodb_table_deadlock".

58fe6b47

25 Oct, 2021 5 commits

libfmt fix for cmake <3.0 · 2897ef09
Sergei Golubchik authored Oct 25, 2021
```
this is CentOOOOOOS 7
```
2897ef09
MDEV-26890 : Crash on shutdown, with active binlog dump threads · f9339759
Vladislav Vaintroub authored Oct 25, 2021
```
The reason for the crash was a bug in MDEV-19275, after which shutdown
does not wait for binlog threads anymore.
```
f9339759
Fix 32bit build · 30009f29
Vladislav Vaintroub authored Oct 25, 2021

30009f29

MDEV-26674: Set innodb_use_native_aio=OFF when using io_uring on a potentially affected kernel · 1193a793

Marko Mäkelä authored Oct 25, 2021

We have observed hangs of the io_uring subsystem when using a
Linux kernel newer than 5.10. Also 5.15-rc6 is affected by this.

The exact cause of the hangs has not been diagnosed yet.
As a safety measure, we will disable innodb_use_native_aio by default
when the server has been configured with io_uring and the kernel
version is between 5.11 and 5.15.

If the start-up parameter innodb_use_native_aio=ON is set, then
we will issue a warning to the server error log.
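
The version policy described above could be sketched as below. The helper names kernel_in_affected_range() and default_native_aio() are hypothetical, and treating the range as inclusive of both 5.11 and 5.15, and an unparsable release string as unaffected, are assumptions of this sketch rather than statements about the actual patch.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper; the name and exact range handling are assumptions.
// Returns true if the release string denotes a kernel in 5.11 .. 5.15.
inline bool kernel_in_affected_range(const char *release)
{
  unsigned major= 0, minor= 0;
  if (std::sscanf(release, "%u.%u", &major, &minor) != 2)
    return false;  // unparsable: this sketch treats it as not affected
  return major == 5 && minor >= 11 && minor <= 15;
}

// Hypothetical default for innodb_use_native_aio under this policy.
inline bool default_native_aio(const char *release, bool built_with_io_uring)
{
  return !(built_with_io_uring && kernel_in_affected_range(release));
}
```

A release string such as "5.13.0-27-generic" falls inside the range, so a server built with io_uring would default to OFF in this sketch, while the same kernel without io_uring keeps the default ON.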

1193a793

MDEV-19275 : Fix compiler error - calling convention mismatch (32bit Windows) · 35084c5a
Vladislav Vaintroub authored Oct 25, 2021
```
Error C2440	'initializing': cannot convert from 'MYSQL_RES *(__stdcall *)(MYSQL *)' to 'MYSQL_RES *(__cdecl *)(MYSQL *)'
```
35084c5a

22 Oct, 2021 7 commits

Fixed mysqld--help.result if password-reuse-check is compiled in static · 9624bb0f
Monty authored Oct 22, 2021

9624bb0f
MDEV-26882 InnoDB number of trx pools note improvement · 6bfaa68c
Marko Mäkelä authored Oct 22, 2021

6bfaa68c
Merge 10.6 into 10.7 · 71d4ecf1
Marko Mäkelä authored Oct 22, 2021

71d4ecf1

MDEV-26769 InnoDB does not support hardware lock elision · 1f022809

Marko Mäkelä authored Oct 22, 2021

This implements memory transaction support for:

* Intel Restricted Transactional Memory (RTM), also known as TSX-NI
(Transactional Synchronization Extensions New Instructions)
* POWER v2.09 Hardware Transactional Memory (HTM) on GNU/Linux

transactional_lock_guard, transactional_shared_lock_guard:
RAII lock guards that try to elide the lock acquisition
when transactional memory is available.
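
A rough sketch of what an eliding RAII guard can look like. Everything here is hypothetical: FakeTM stands in for the hardware primitives (a real build would use CPU-specific instructions), SimpleLock is a single-threaded toy lock, and ElidingGuard is not the actual transactional_lock_guard.

```cpp
#include <cassert>

// Hypothetical stand-in for hardware transactional memory.
// begin() just reports a configurable result; end() and abort() do nothing.
struct FakeTM
{
  static bool &available() { static bool a= false; return a; }
  static bool begin() { return available(); }
  static void end() {}
  static void abort() {}
};

// Toy, single-threaded lock that can report whether it is held.
class SimpleLock
{
public:
  void lock() { assert(!locked_); locked_= true; }
  void unlock() { assert(locked_); locked_= false; }
  bool is_locked() const { return locked_; }
private:
  bool locked_= false;
};

// RAII guard: try to elide the lock acquisition, else acquire the lock.
template <class TM, class Lock>
class ElidingGuard
{
public:
  explicit ElidingGuard(Lock &l) : lock_(l), elided_(false)
  {
    if (TM::begin())
    {
      if (!lock_.is_locked())
      {
        elided_= true;  // run the critical section as a memory transaction
        return;
      }
      TM::abort();      // the lock is busy: give up the transaction
    }
    lock_.lock();       // fallback: ordinary acquisition
  }
  ~ElidingGuard()
  {
    if (elided_)
      TM::end();        // commit the memory transaction
    else
      lock_.unlock();
  }
  ElidingGuard(const ElidingGuard &)= delete;
  ElidingGuard &operator=(const ElidingGuard &)= delete;
  bool elided() const { return elided_; }
private:
  Lock &lock_;
  bool elided_;
};
```

Reading the lock state inside the transaction is what makes elision safe on real hardware: if another thread acquires the lock meanwhile, the transaction is expected to abort and the fallback path takes the lock normally.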

buf_pool.page_hash: Try to elide latches whenever feasible.
Related to the InnoDB change buffer and ROW_FORMAT=COMPRESSED
tables, this is not always possible.
In buf_page_get_low(), memory transactions only work reasonably
well for validating a guessed block address.

TMLockGuard, TMLockTrxGuard, TMLockMutexGuard: RAII lock guards
that try to elide lock_sys.latch and related latches.

1f022809

MDEV-26826 Duplicated computations of buf_pool.page_hash addresses · c091a0bc

Marko Mäkelä authored Oct 22, 2021

Since commit bd5a6403 (MDEV-26033)
we can actually calculate the buf_pool.page_hash cell and latch
addresses while not holding buf_pool.mutex.

buf_page_alloc_descriptor(): Remove the MEM_UNDEFINED.
We now expect buf_page_t::hash to be zero-initialized.

buf_pool_t::hash_chain: Dedicated data type for buf_pool.page_hash.array.

buf_LRU_free_one_page(): Merged to the only caller
buf_pool_t::corrupted_evict().

c091a0bc

MDEV-26828 Spinning on buf_pool.page_hash is wasting CPU cycles · fdae71f8

Marko Mäkelä authored Oct 22, 2021

page_hash_latch: Only use the spinlock implementation on
SUX_LOCK_GENERIC platforms (those for which we do not implement
a futex-like interface). Use srw_spin_mutex on 32-bit systems
(except Microsoft Windows) to satisfy the size constraints.

rw_lock::is_read_locked(): Remove. We will use the slightly
broader assertion is_locked().

srw_lock_: Implement is_locked(), is_write_locked() in a hacky
way for the Microsoft Windows SRWLOCK. This should be acceptable,
because we are only using these predicates in debug assertions
(or later, in lock elision), and false positives should not matter.

fdae71f8

MDEV-26883 InnoDB hang due to table lock conflict · 5caff202

Marko Mäkelä authored Oct 22, 2021

In a stress test campaign of a 10.6-based branch by Matthias Leich,
a deadlock between two InnoDB threads occurred, involving
lock_sys.wait_mutex and a dict_table_t::lock_mutex.

The cause of the hang is a latching order violation in
lock_sys_t::cancel(). That function and the latching order
violation were originally introduced in
commit 8d16da14 (MDEV-24789).

lock_sys_t::cancel(): Invoke table->lock_mutex_trylock() in order
to avoid a deadlock. If that fails, release lock_sys.wait_mutex,
and acquire both latches. In that way, we will be obeying the
latching order and no hangs will occur.
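
The trylock-then-back-off pattern described above can be sketched with standard mutexes. The struct Latches and the function acquire_table_mutex_holding_wait() are hypothetical stand-ins, and the assumption that the table mutex precedes the wait mutex in the latching order is inferred from the description, not taken from the InnoDB source.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for dict_table_t::lock_mutex and lock_sys.wait_mutex.
struct Latches
{
  std::mutex table_mutex;  // assumed to come first in the latching order
  std::mutex wait_mutex;   // assumed to come second
};

// Precondition: the caller holds l.wait_mutex.
// Postcondition: the caller holds both mutexes.
// Returns true if wait_mutex had to be released and reacquired.
inline bool acquire_table_mutex_holding_wait(Latches &l)
{
  if (l.table_mutex.try_lock())
    return false;         // fast path: try_lock never blocks, so no deadlock
  l.wait_mutex.unlock();  // back off to respect the latching order
  l.table_mutex.lock();
  l.wait_mutex.lock();
  return true;
}
```

When the function returns true, any state that was protected by wait_mutex may have changed while it was released, so a caller would need to revalidate that state before continuing.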

This hang should mostly affect DDL operations. DML operations will
acquire only IX or IS table locks, which are compatible with each other.

5caff202

21 Oct, 2021 9 commits
- Remove trailing space · 059a5f11
  Vladislav Vaintroub authored Oct 21, 2021
  
  059a5f11
- Merge 10.5 into 10.6 · 73f5cbd0
  Marko Mäkelä authored Oct 21, 2021
  
  73f5cbd0
- Fix GCC 11.2.0 -m32 (IA-32) warnings · a0fda162
  Marko Mäkelä authored Oct 21, 2021
```
page_create_low(): Fix -Warray-bounds

log_buffer_extend(): Fix -Wstringop-overflow
```
  a0fda162
- Merge 10.4 into 10.5 · 5f8561a6
  Marko Mäkelä authored Oct 21, 2021
  
  5f8561a6
- Merge 10.3 into 10.4 · 489ef007
  Marko Mäkelä authored Oct 21, 2021
  
  489ef007
- Merge 10.2 into 10.3 · d5bcccda
  Marko Mäkelä authored Oct 21, 2021
  
  d5bcccda
- MDEV-19522 fixup: Integer type mismatch in unit test · fbb1e92e
  Marko Mäkelä authored Oct 21, 2021
  
  fbb1e92e
- Merge 10.2 into 10.3 · e4a7c15d
  Marko Mäkelä authored Oct 21, 2021
  
  e4a7c15d
- MDEV-26865: Add test case and instrumentation · 1a2308d3
  Marko Mäkelä authored Oct 21, 2021
```
Based on mysql/mysql-server@bc9c46bf2894673d0df17cd0ee872d0d99663121
but without sleeps.

The test was verified to hit the debug assertion if the change to
fts_add_doc_by_id() in commit 2d98b967
was reverted.
```
  1a2308d3