Commits · 8bb2170d7420c0d8877c7d78719109290be9ecca · nexedi / MariaDB

31 Jul, 2020 6 commits

Merge 10.2 into 10.3 · 8bb2170d
Marko Mäkelä authored Jul 31, 2020

8bb2170d
Merge 10.2 into 10.3 · 66ec3a77
Marko Mäkelä authored Jul 31, 2020

66ec3a77

MDEV-11799 Doublewrite recovery can corrupt data pages · 879ba197

Marko Mäkelä authored Jul 31, 2020

The purpose of the InnoDB doublewrite buffer is to make InnoDB
tolerant against cases where the server was killed in the middle
of a page write. (In Linux, killing a process may interrupt a
write system call, typically on a 4096-byte boundary.)

There may exist multiple copies of a page number in the doublewrite
buffer. Recovery should choose the latest valid copy of the page.
By design, the FIL_PAGE_LSN must not precede the latest checkpoint LSN
nor be later than the end of the recovered log.
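
The selection rule above can be illustrated with a small sketch. This is not InnoDB's code: `PageCopy`, `is_valid` and `pick_latest_copy` are hypothetical names, and page validation (checksums, compression and encryption checks) is reduced to a boolean flag.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PageCopy:
    """Hypothetical stand-in for one copy of a page in the doublewrite buffer."""
    page_no: int
    lsn: int        # plays the role of FIL_PAGE_LSN
    is_valid: bool  # outcome of whatever page validation applies (assumed given)


def pick_latest_copy(copies: Iterable[PageCopy], page_no: int,
                     checkpoint_lsn: int,
                     recovered_lsn: int) -> Optional[PageCopy]:
    """Return the newest valid copy with checkpoint_lsn <= lsn <= recovered_lsn."""
    best = None
    for copy in copies:
        if copy.page_no != page_no or not copy.is_valid:
            continue
        if copy.lsn < checkpoint_lsn or copy.lsn > recovered_lsn:
            continue  # precedes the checkpoint, or is later than the recovered log
        if best is None or copy.lsn > best.lsn:
            best = copy
    return best
```

A copy that fails validation or falls outside the LSN window is never chosen, even if it is the only copy of that page.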

For page_compressed and encrypted pages, we were missing proper
consistency checks. In the 10.4 data set generated for MDEV-23231,
the data file contained a valid page_compressed page, and an
identical copy of that page was also present in the doublewrite
buffer. But, recovery would incorrectly consider the page invalid
and restore an uncompressed copy of the same page that had been
written before the log checkpoint. (In fact, no redo log was to
be applied to that page.)

buf_dblwr_process(): Validate the FIL_PAGE_LSN in the doublewrite
buffer pages, and always skip page 0, because those pages should
have been recovered by Datafile::restore_from_doublewrite() if
necessary.

Datafile::restore_from_doublewrite(): Choose the latest applicable
page from the doublewrite buffer.

recv_dblwr_t::find_page(): Also validate encrypted or
page_compressed pages.

recv_dblwr_t::validate_page(): New function to validate a page,
either a copy in a data file or in the doublewrite buffer.
Also validate encrypted or page_compressed pages.

This is joint work with Thirunarayanan Balathandayuthapani.

879ba197

MDEV-23198 Crash in REPLACE · f35d1721

Marko Mäkelä authored Jul 31, 2020

row_vers_impl_x_locked_low(): clust_offsets may point to memory
that is allocated by mem_heap_alloc() and may have been freed.
For initializing clust_offsets, try to use the stack-allocated
buffer instead of a pointer that may point to freed memory.

This fixes a regression that was introduced in
commit f0aa073f (MDEV-20950).

f35d1721

MDEV-18042 Server crashes upon adding a non-null date column under... · fd0abc89

Nikita Malyavin authored Jul 29, 2020

MDEV-18042 Server crashes upon adding a non-null date column under NO_ZERO_DATE with ALGORITHM=INPLACE

accept table_name and db_name instead of table_share in make_truncated_value_warning

fd0abc89

MDEV-19338 InnoDB: Failing assertion: !cursor->index->is_committed() · 91ebf184

Nikita Malyavin authored Jul 28, 2020

Call mark_columns_per_binlog_row_image before find_row() to set up table->vcol_set early,
so the virtual column value will be updated after record read (ha_rnd_pos/ha_index_next/etc)
by table->update_virtual_fields() call

91ebf184

30 Jul, 2020 5 commits

MDEV-23334 Crash in rec_get_nth_cfield()/rec_offs_validate() · 6053eb1c

Marko Mäkelä authored Jul 30, 2020

rec_get_nth_cfield(): Remove a bogus debug assertion.
The function may be invoked by innobase_rec_to_mysql()
for reporting a duplicate key error during CREATE UNIQUE INDEX
or ALTER TABLE...ADD UNIQUE KEY, and in that case the record
will be missing the 5-byte or 6-byte fixed header.

It turns out that in every other code path leading to
rec_get_nth_cfield() we either invoked rec_get_offsets()
ourselves or asserted rec_offs_validate(). So, we can
safely remove the assertion and make debug builds
smaller and faster.

6053eb1c

MDEV-21101 skip test for embedded · 0435fcf9
Vladislav Vaintroub authored Jul 30, 2020

0435fcf9

MDEV-23332 Index online status assert failure in btr_search_drop_page_hash_index · 8a612314

Thirunarayanan Balathandayuthapani authored Jul 30, 2020

Problem:
========
In row_merge_drop_indexes(), InnoDB only drops the index from the
dictionary and frees the index pages, but it keeps the index
object if the table is being used by other DML threads. It sets
the online status of the index to ONLINE_INDEX_ABORTED_DROPPED.
Removing the index from the dictionary doesn't remove the
corresponding adaptive hash index (AHI) entries of the index. When
a block is being reused, InnoDB tries to remove the AHI entries for
the block, and it fails if the index online status is
ONLINE_INDEX_ABORTED_DROPPED.

Fix:
====
MDEV-22456 allows the AHI entries of an index to be dropped lazily,
so checking the online status in btr_search_drop_page_hash_index()
is meaningless and should be removed.

8a612314

MDEV-21101 unexpected wait_timeout with pool-of-threads · 71015d84

Vladislav Vaintroub authored Jul 28, 2020

Due to the restricted size of the thread pool, execution of client queries can
be delayed (queued) for a while. This delay was interpreted as client
inactivity, and the connection was closed if client idle time + queue time
exceeded wait_timeout.

But users did not expect queue time to count toward wait_timeout.

This patch changes the behavior. We no longer close the connection
if there is unread data present on it,
even if wait_timeout is exceeded. Unread data means that the client
was not idle: it sent a query that we did not have time to process yet.
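
The changed rule can be sketched as a single predicate. This is only an illustration of the behavior described above; `should_close_connection` and its parameters are hypothetical names, not the server's API.

```python
def should_close_connection(idle_seconds: float, wait_timeout: float,
                            has_unread_data: bool) -> bool:
    """Decide whether a pooled connection may be closed for inactivity.

    Unread data on the socket means the client already sent a query that is
    still waiting in the queue, so the connection is not treated as idle.
    """
    if has_unread_data:
        return False
    return idle_seconds > wait_timeout
```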

71015d84

MDEV-23339 innodb_force_recovery=2 may still abort the rollback of recovered transactions · c5d4dd25

Marko Mäkelä authored Jul 30, 2020

trx_rollback_active(), trx_rollback_resurrected(): Replace
an incorrect condition that we failed to replace in
commit b68f1d84 (MDEV-21217).

c5d4dd25

29 Jul, 2020 3 commits

MDEV-21258: Can't uninstall plugin if the library file doesn't exist · 2107e3bb
Oleksandr Byelkin authored Jul 28, 2020
```
Removing plugin from the mysql.plugin even if the plugin is not loaded
```
2107e3bb

speed up my_timer_init() · 8ec877f4

Eugene Kosov authored Jul 29, 2020

I ran perf top during ./mtr testing and constantly saw the times()
function there. It is so slow that it makes no sense to run it
in a loop too many times.

This patch speeds up -suite=innodb for me from 218s to 208s.
9s of times() function!

8ec877f4

MDEV-16023 Unfortunate error message WARN_VERS_PART_FULL (partition <name> is... · 34f2be3b

Nikita Malyavin authored Jul 24, 2018

MDEV-16023 Unfortunate error message WARN_VERS_PART_FULL (partition <name> is full) when rotation time for the last interval passed

* remove one case of WARN_VERS_PART_FULL

34f2be3b

28 Jul, 2020 7 commits

MDEV-23308 CHECK TABLE attempts to access parent_right_page_no=FIL_NULL · 3c3f172f

Marko Mäkelä authored Jul 28, 2020

mysql/mysql-server@e00ad49edc8b07317b52c9efd0810f2cbc57877a
which introduced WL#6326 to MySQL 5.7.2 added a buffer page
acquisition to CHECK TABLE code (solely for the purpose of
obeying the changed latching order), but failed to check that
a parent page actually exists. It would not necessarily exist in a
corrupted index where a parent page is missing pointer records
to child pages.

3c3f172f

MDEV-20142 encryption.innodb_encrypt_temporary_tables failed in buildbot with wrong result · 6307b17a
Marko Mäkelä authored Jul 28, 2020
```
Let us read both encrypted temporary tables to increase the chances of
page flushing and eviction.
```
6307b17a
MDEV-23137: aarch64, postfix - cmake include · 940668f5
Daniel Black authored Jul 28, 2020

940668f5
MDEV-9911: NTILE must return an error when parameter is not stable · 459b87f6
Dan Solodko authored Jun 18, 2020

459b87f6

rocksdb: FreeBSD disable jemalloc search · cae4b3f8

Daniel Black authored Jun 22, 2020

FreeBSD's built-in default jemalloc means it's pointless
to do a package search for it. The paths are already set
by the system defaults.

cae4b3f8

MDEV-23051: riscv64 fails build (atomics) · 715beee4

Daniel Black authored Jul 02, 2020

riscv64 fails to build because the use
of #include <atomic> needs to link with -latomic.

per https://github.com/riscv/riscv-gnu-toolchain/issues/183#issuecomment-253721765

715beee4

MDEV-23137: RocksDB: undefined reference to crc32c_arm64 · d88ea260

Krunal Bauskar authored Jul 27, 2020

RocksDB fails to build on arm64: undefined reference to
            `crc32c_arm64(unsigned int, unsigned char const*, unsigned int)'

MariaDB uses storage/rocksdb/build_rocksdb.cmake to compile RocksDB.
That cmake file did not add the crc32c_arm64 compilation target, so if
the machine's native architecture supported crc32, the compiler would enable
use of the function defined in crc32c_arm64, causing the listed error.

Added the crc32c_arm64 compilation target.

closes #1642

d88ea260

27 Jul, 2020 4 commits

MDEV-12474: rocksdb: mtr - rocksdb.concurrent_alter use sh · 186d9d0d

Daniel Black authored Jun 22, 2020

FreeBSD doesn't have bash installed by default and sh
has sufficient job control for this test.

$  mysql-test/mtr --mem --max-test-fail=30 --force --parallel=1 rocksdb.concurrent_alter
Logging: /home/dan/mariadb-server-10.5/mysql-test/mysql-test-run.pl  --mem --max-test-fail=30 --force --parallel=1 rocksdb.concurrent_alter
vardir: /usr/home/dan/build-mariadb-server-10.5/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/usr/home/dan/build-mariadb-server-10.5/mysql-test/var'...
 - symlinking 'var' to '/tmp/var_auto_P81m'
Checking supported features...
MariaDB Version 10.5.4-MariaDB
 - SSL connections supported

 - binaries built with wsrep patch
Collecting tests...
Installing system database...

==============================================================================

TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
rocksdb.concurrent_alter 'write_committed' [ pass ]  16348
rocksdb.concurrent_alter 'write_prepared' [ pass ]  16771
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 33.119 of 41 seconds executing testcases

Completed: All 2 tests were successful.

$ uname -a
FreeBSD freebsd 12.1-RELEASE-p6 FreeBSD 12.1-RELEASE-p6 GENERIC  amd64

186d9d0d

MDEV-23233: Race condition for btr_search_drop_page_hash_index() in buf_page_create() · a1f899a8

Thirunarayanan Balathandayuthapani authored Jul 24, 2020

commit ad6171b9 (MDEV-22456)
introduced code to buf_page_create() that would lazily drop
adaptive hash index entries for an index that has been
evicted from the data dictionary cache.

Unfortunately, that call was missing adequate protection.
While the btr_search_drop_page_hash_index(block) was executing,
the block could be reused for something else.

buf_page_create(): If btr_search_drop_page_hash_index() must be
invoked, pin the block before releasing the buf_pool->page_hash lock,
so that the block cannot be grabbed by other threads.
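
The pin-before-unlock pattern described above can be sketched as follows. This is a simplified model under stated assumptions, not InnoDB's implementation: `Block`, `Pool`, `pin_count`, `can_reuse` and the `drop_hash_entries` callback are all hypothetical, and a single lock stands in for the buf_pool->page_hash lock.

```python
import threading
from typing import Callable


class Block:
    """Hypothetical buffer block; pin_count > 0 means 'must not be reused'."""

    def __init__(self) -> None:
        self.pin_count = 0
        self.has_stale_hash_entries = True


class Pool:
    def __init__(self) -> None:
        self.page_hash_lock = threading.Lock()

    def can_reuse(self, block: Block) -> bool:
        with self.page_hash_lock:
            return block.pin_count == 0

    def create_page(self, block: Block,
                    drop_hash_entries: Callable[[Block], None]) -> None:
        with self.page_hash_lock:
            must_drop = block.has_stale_hash_entries
            if must_drop:
                block.pin_count += 1  # pin while the lock is still held
        if not must_drop:
            return
        try:
            # Runs without the lock, but the pin keeps other threads away.
            drop_hash_entries(block)
            block.has_stale_hash_entries = False
        finally:
            with self.page_hash_lock:
                block.pin_count -= 1
```

The point of the ordering is that there is no window between releasing the lock and starting the drop in which the block looks free to another thread.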

a1f899a8

Enable fixed test case. · b4c74210
Jan Lindström authored Jul 27, 2020

b4c74210

MDEV-18916: crash in Window_spec::print_partition() with decimals · a6410deb

Varun Gupta authored Jul 27, 2020

Removed an unnecessary ifndef which was printing the window name for a named
window only in the case of debug build. The print() for the window function
should behave in the same way on both release and debug builds.

a6410deb

24 Jul, 2020 4 commits

MDEV-14711 Assertion `mode == 16 || mode == 12 ||... · 5f1ec5cb

Thirunarayanan Balathandayuthapani authored Jul 23, 2020

MDEV-14711 Assertion `mode == 16 || mode == 12 || !fix_block->page.file_page_was_freed' failed in buf_page_get_gen (rollback requesting a freed undo page)

Problem:
=======
In btr_cur_optimistic_latch_leaves(), the left block is requested with BUF_GET
after the current block has been released. But there is no guarantee that the
left block is still available.

Fix:
====

(1) In btr_cur_optimistic_latch_leaves(), replace BUF_GET with
BUF_GET_POSSIBLY_FREED when fetching the left block.
(2) Once InnoDB acquires the left block, it should check that FIL_PAGE_NEXT
of the left block matches the current block's page number. If it does not,
release cursor->left_block and return false.
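
Step (2) amounts to a sibling-link sanity check, sketched below. `Page` and `left_block_is_usable` are hypothetical names for illustration; `next_page_no` plays the role of FIL_PAGE_NEXT, and a missing (possibly freed) left page is modeled as `None`.

```python
from typing import NamedTuple, Optional


class Page(NamedTuple):
    """Hypothetical leaf page with its forward sibling link."""
    page_no: int
    next_page_no: int  # plays the role of FIL_PAGE_NEXT


def left_block_is_usable(left: Optional[Page], current_page_no: int) -> bool:
    """True only if the left page exists and still points at the current page."""
    return left is not None and left.next_page_no == current_page_no
```

When the check fails, the caller would release the left block and report that the optimistic latching attempt did not succeed.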

5f1ec5cb

MDEV-15236: galera_ist_progress fails when trying to read transfer status · a3a249c7
mkaruza authored Jul 20, 2020
```
Fixed by a new test recording
```
a3a249c7

MDEV-20928 mtr test galera.galera_var_innodb_disallow_writes test failure · c9928cc0

sjaakola authored Jun 23, 2020

The sporadic test hangs happen because of a mutex deadlock between InnoDB
background threads and two test connection executions.
The test sets the variable innodb_disallow_writes, which blocks all writes
to the filesystem. The test logic is to execute an INSERT, which should hang
because filesystem writes are blocked, and through another session
verify by SELECT that this hang happens. The SELECT session will then
release the innodb_disallow_writes blocking.

However, filesystem write blocking affects also innodb background threads
and they may hang while keeping some other resources locked.
As an example, in one test hang situation, buffer pool access was blocked.
And, if buffer pool is blocked, the test connections will be blocked as well,
and the SELECT session will not be able to continue to release the
innodb_disallow_writes.

The fix in this commit is refactoring of the test logic.
The test will now set first innodb_disallow_writes blocking, and then record
a hash of data directory's filesystem contents. This works as checksum of the
state of data on the datadirectory.

Then some SQL load is tried on both nodes, these sessions will be blocking
due to frozen file system state. The test will have a short sleep to allow
innodb background threads to loop and possibly encounter innodb_disallow_writes
blocking as well.

After the sleep, the test will record the file system checksum for the second time,
and then release the innodb_disallow_writes blocking.

Finally, the two checksums are compared, they should be identical to verify that
nothing was written on datadirectory during the test execution.

The checksum is implemented by an md5sum hash over all files found in the data
directory by the find command. All these file hashes are hashed together by one
more md5sum.

The test therefore depends on md5sum and find. find may work differently on some
OS distributions, e.g. FreeBSD may be problematic.
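
The hash-of-hashes idea can be sketched without the external tools. This is an illustrative equivalent, not the test's actual shell pipeline: `datadir_checksum` is a hypothetical helper, its output is not byte-compatible with `find ... | md5sum`, and the sorting of paths is an added assumption to make the result independent of directory traversal order.

```python
import hashlib
from pathlib import Path


def datadir_checksum(datadir: str) -> str:
    """Hash every regular file under datadir, then hash the list of hashes."""
    root = Path(datadir)
    outer = hashlib.md5()
    # Sort so the result does not depend on directory traversal order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        inner = hashlib.md5(path.read_bytes()).hexdigest()
        line = f"{inner}  {path.relative_to(root).as_posix()}\n"
        outer.update(line.encode("utf-8"))
    return outer.hexdigest()
```

Taking the checksum once before the SQL load and once after it, and comparing the two strings, mirrors the comparison the test performs.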

c9928cc0

MDEV-18177 : Galera test failure on galera_autoinc_sst_mariabackup · ba23e6d7
Jan Lindström authored Jul 23, 2020
```
Add wait_condition
```
ba23e6d7

23 Jul, 2020 6 commits

MDEV-23134 SEGV in dict_load_table_one during restart after server crash · b3b1c51e

Thirunarayanan Balathandayuthapani authored Jul 23, 2020

Problem:
========
dict_load_table_one() doesn't handle the scenario where clustered
index page is FIL_NULL when DICT_ERR_IGNORE_INDEX_ROOT mode
is set.

Fix:
====
InnoDB should set the file_unreadable when it can't find the
clustered index root page.

b3b1c51e

MDEV-23244 ALTER TABLE…ADD PRIMARY KEY fails to flag duplicates · 1656ea28

Marko Mäkelä authored Jul 23, 2020

The fix of MDEV-13654 (commit ff81faf6)
wrongly caused ADD PRIMARY KEY to ignore duplicate PRIMARY KEY values
caused by concurrent DML transactions that had been started before the
ALTER TABLE operation (but did not access the table before the ALTER TABLE
started).

row_ins_duplicate_online(): Always report a duplicate key error
if DB_TRX_ID had been reset (it belongs to a transaction that had
started before the ALTER TABLE operation).

1656ea28

MDEV-20638 Remove the deadcode from srv_master_thread() and srv_active_wake_master_thread_low() · fe39d02f

Thirunarayanan Balathandayuthapani authored Jul 23, 2020

- Due to commit fe95cb2e (MDEV-16125),
InnoDB master thread does not need to call srv_resume_thread()
and therefore there is no need to wake up the thread.
Due to the above patch, InnoDB should remove the following dead code.

srv_check_activity(): Makes the parameter as in,out and returns the
recent activity value

innobase_active_small(): Removed

srv_active_wake_master_thread(): Removed

srv_wake_master_thread(): Removed

srv_active_wake_master_thread_low(): Removed

Simplify srv_master_thread() and remove switch cases, added the assert.

Replace srv_wake_master_thread() with srv_inc_activity_count()

INNOBASE_WAKE_INTERVAL: Removed

fe39d02f

A bit more safety · f7adc4a1
Oleksandr Byelkin authored Jul 22, 2020

f7adc4a1

MDEV-22134: handle_fatal_signal (sig=11) in __strlen_avx2 on START SLAVE |... · 0ec641ea

Oleksandr Byelkin authored Jul 21, 2020

MDEV-22134: handle_fatal_signal (sig=11) in __strlen_avx2 on START SLAVE | Assertion `global_system_variables.session_track_system_variables' failed in Session_sysvars_tracker::init | *** buffer overflow detected *** (on optimized builds)

Prohibit assigning NULL as for other system variables.

0ec641ea

MDEV-14203: rpl.rpl_extra_col_master_myisam,... · b3dd95e0

Sujatha authored Jul 23, 2020

MDEV-14203: rpl.rpl_extra_col_master_myisam, rpl.rpl_slave_load_tmpdir_not_exist failed in buildbot with a warning

Problem:
=======
rpl.rpl_slave_load_tmpdir_not_exist 'stmt' w3 [ fail ] Found warnings/errors
in server log file!

Test ended at 2017-09-27 20:34:55
[Warning] Master is configured to log replication events with checksum, but
will not send such events to slaves that cannot process them
^ Found warnings in /mnt/buildbot/build/mariadb-10.2.10/mysql-test/var/3/log/mysqld.1.err
ok

Analysis:
========
When slave tries to connect to master 'get_master_version_and_clock' function
is invoked to perform elaborated slave-master handshake. During this process
slave server queries master server, to know if it is checksum aware and at the
same time master is notified about its CRC-awareness. The master's side
instant value of @@global.binlog_checksum is stored in the dump thread's
uservar area as well as cached locally to become known in consensus by master
and slave.

Post hand-shake slave requests master for binlog dump. It sends
'COM_BINLOG_DUMP'. This command is sent to master by 'cli_advanced_command'
call. If there is some temporary network failure during this request_dump
call, 'end_server' is invoked to close the current connection between master
and slave. Upon connection close the dump thread on the master gets terminated
and it clears the 'uservar' data it got through master-slave handshake.

The 'COM_BINLOG_DUMP' command is sent once again without master-slave
handshake. Since the checksum data is not available with new dump thread a
warning gets reported.

Fix:
===
Upon network write error do not attempt reconnect, proceed to master-slave
handshake. This ensures that master is aware of slave's capability to use
checksums.

b3dd95e0

22 Jul, 2020 4 commits

MDEV-17481 mariadb service won't shutdown when it's running and the OS datetime updated backwards · 3a8943ae

Thirunarayanan Balathandayuthapani authored Jul 21, 2020

__pthread_cond_timedwait() in the page cleaner hangs if the OS time is moved
backwards. A workaround could be waking up the page cleaner thread in
logs_empty_and_mark_files_at_shutdown(). But the server could also hang
while it is running, so InnoDB should wake up the page
cleaner thread periodically in srv_master_do_idle_tasks().

3a8943ae

MDEV-13830 Assertion failed: recv_sys->mlog_checkpoint_lsn <= recv_sys->recovered_lsn · 2a3bc0b9

Thirunarayanan Balathandayuthapani authored Jul 21, 2020

There can be multiple MLOG_CHECKPOINT records for the same checkpoint.
During recovery, InnoDB could encounter a previous MLOG_CHECKPOINT record
for the checkpoint LSN, so the assertion
mlog_checkpoint_lsn <= recovered_lsn is wrong.

2a3bc0b9

MDEV-23108: Point in time recovery of binary log fails when sql_mode=ORACLE · c86accc7

Sujatha authored Jul 20, 2020

Problem:
========
During point-in-time recovery of the binary log, a syntax error is reported for
the BEGIN statement and recovery fails.

Analysis:
=========
In MariaDB 10.3 and later, setting the sql_mode system variable to Oracle
allows the server to understand a subset of Oracle's PL/SQL language. When
sql_mode=ORACLE is set, it switches the parser from the MariaDB parser to
Oracle compatible parser. With this change 'BEGIN' is not considered as
'START TRANSACTION'. Hence the syntax error is reported.

Fix:
===
At present the 'BEGIN' query is generated from 'Gtid_log_event::print'. The
current session-specific 'sql_mode' information is not present as part of
'Gtid_log_event'. If it were available, the mysqlbinlog tool could make use of
'sql_mode == ORACLE' and output "START TRANSACTION" in that particular
mode, and write "BEGIN" for other sql_modes. Since it
is not available, the 'mysqlbinlog' tool will output all 'BEGIN' statements as
'START TRANSACTION' irrespective of 'sql_mode'.

c86accc7

fix assertion · 6898eae7
Nikita Malyavin authored Jul 22, 2020

6898eae7

21 Jul, 2020 1 commit
- fix c++98 build · ebca70ea
  Nikita Malyavin authored Jul 21, 2020
  
  ebca70ea