Commits · 2c226e01a88e18ae8bcc7f584ac6c9c37fb800b0 · nexedi / MariaDB

14 Dec, 2020 3 commits

MDEV-24313 fixup: GCC -Wparentheses · 2c226e01
Marko Mäkelä authored Dec 14, 2020

2c226e01

MDEV-24313 (2 of 2): Silently ignored innodb_use_native_aio=1 · f24b7383

Marko Mäkelä authored Dec 14, 2020

In commit 5e62b6a5 (MDEV-16264)
the logic of os_aio_init() was changed so that it will never fail,
but instead automatically disable innodb_use_native_aio (which is
enabled by default) if the io_setup() system call would fail due
to resource limits being exceeded. This is questionable, especially
because falling back to simulated AIO may lead to significantly
reduced performance.

srv_n_file_io_threads, srv_n_read_io_threads, srv_n_write_io_threads:
Change the data type from ulong to uint.

os_aio_init(): Remove the parameters, and actually return an error code.

thread_pool::configure_aio(): Do not silently fall back to simulated AIO.

Reviewed by: Vladislav Vaintroub

f24b7383

MDEV-24313 (1 of 2): Hang with innodb_write_io_threads=1 · 17d3f856

Marko Mäkelä authored Dec 14, 2020

After commit a5a2ef07 (part of MDEV-23855)
implemented asynchronous doublewrite, it is possible that the server will
hang when the following parameters are in effect:

    innodb_doublewrite=1 (default)
    innodb_write_io_threads=1
    innodb_use_native_aio=0

Note: In commit 5e62b6a5 (MDEV-16264)
the logic of os_aio_init() was changed so that it will never fail,
but instead automatically disable innodb_use_native_aio (which is
enabled by default) if the io_setup() system call would fail due
to resource limits being exceeded.

Before commit a5a2ef07, we used
a synchronous write for the doublewrite buffer batches, always at
most 64 pages at a time. So, upon completing a doublewrite batch,
a single thread would submit at most 64 page writes (for the
individual pages that were first written to the doublewrite buffer).
With that commit, we may submit up to 128 page writes at a time.

The maximum number of outstanding requests per thread is 256.
Because the maximum number of asynchronous write submissions per
thread was roughly doubled, it is now possible that
buf_dblwr_t::flush_buffered_writes_completed() will hang in
io_slots::acquire(), called via os_aio() and fil_space_t::io(),
when submitting writes of the individual blocks.

We will prevent this type of hang by increasing the minimum number
of innodb_write_io_threads from 1 to 2, so that this type of hang
would only become possible when 512 outstanding write requests
are exceeded.
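
The arithmetic behind this change can be sketched as follows. This is only an illustration: the constant and function names are hypothetical and are not InnoDB identifiers; only the figures 64, 128 and 256 come from the message above.

```cpp
#include <cassert>

// Hypothetical names; only the numbers come from the commit message.
constexpr unsigned kSlotsPerThread = 256; // max outstanding requests per thread
constexpr unsigned kBatchBefore = 64;     // page writes per batch before a5a2ef07
constexpr unsigned kBatchAfter = 128;     // page writes per batch after a5a2ef07

// Total number of write requests that may be outstanding at once
// for a given number of write I/O threads.
constexpr unsigned write_slot_capacity(unsigned write_io_threads)
{
  return write_io_threads * kSlotsPerThread;
}

// The batch size was doubled while the per-thread slot count stayed the same.
static_assert(kBatchAfter == 2 * kBatchBefore, "batch size was doubled");
```

With one write thread the capacity is 256 slots; raising the minimum to two threads gives 512 slots, which is the figure quoted above.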

17d3f856

11 Dec, 2020 1 commit

MDEV-24391 heap-use-after-free in fil_space_t::flush_low() · 8677c14e

Marko Mäkelä authored Dec 11, 2020

We observed a race condition that involved two threads
executing fil_flush_file_spaces() and one thread
executing fil_delete_tablespace(). After one of the
fil_flush_file_spaces() observed that
space.needs_flush_not_stopping() is set and was
releasing the fil_system.mutex, the other fil_flush_file_spaces()
would complete the execution of fil_space_t::flush_low() on
the same tablespace. Then, fil_delete_tablespace() would
destroy the object, because the value of fil_space_t::n_pending
did not prevent that. Finally, the fil_flush_file_spaces() would
resume execution and invoke fil_space_t::flush_low() on the freed
object.

This race condition was introduced in
commit 118e258a of MDEV-23855.

fil_space_t::flush(): Add a template parameter that indicates
whether the caller is holding a reference to prevent the
tablespace from being freed.

buf_dblwr_t::flush_buffered_writes_completed(),
row_quiesce_table_start(): Acquire a reference for the duration
of the fil_space_t::flush_low() operation. It should be impossible
for the object to be freed in these code paths, but we want to
satisfy the debug assertions.

fil_space_t::flush_low(): Do not increment or decrement the
reference count, but instead assert that the caller is holding
a reference.

fil_space_extend_must_retry(), fil_flush_file_spaces():
Acquire a reference before releasing fil_system.mutex.
This is what will fix the race condition.
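
The pattern described above (pin the object while still holding the mutex, then work on it without the mutex) might look roughly like this. It is a simplified sketch, not the actual InnoDB code: `Space`, `system_mutex`, `flush_space()` and `can_destroy()` are hypothetical stand-ins for fil_space_t, fil_system.mutex and their callers.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Simplified stand-in for a tablespace object; not the real fil_space_t.
struct Space
{
  std::atomic<unsigned> refs{0};
  bool flushed = false;

  void acquire() { refs.fetch_add(1, std::memory_order_relaxed); }
  void release() { refs.fetch_sub(1, std::memory_order_release); }

  // Like the flush_low() described above: it does not change the
  // reference count, it only asserts that the caller holds a reference.
  void flush_low()
  {
    assert(refs.load(std::memory_order_relaxed) > 0);
    flushed = true;
  }
};

std::mutex system_mutex; // stands in for fil_system.mutex

// Acquire the reference before releasing the mutex, then flush.
void flush_space(Space &space)
{
  {
    std::lock_guard<std::mutex> lock(system_mutex);
    space.acquire(); // from here on, the object must not be destroyed
  }
  space.flush_low(); // safe: the reference keeps the object alive
  space.release();
}

// A deleting thread may destroy the object only when nobody holds a reference.
bool can_destroy(const Space &space)
{
  std::lock_guard<std::mutex> lock(system_mutex);
  return space.refs.load(std::memory_order_acquire) == 0;
}
```

The point of the sketch is the ordering in `flush_space()`: if the reference were taken only after the mutex was released, a concurrent deleter could observe a zero count in between and free the object.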

8677c14e

09 Dec, 2020 2 commits

Remove unused DBUG_EXECUTE_IF "ignore_punch_hole" · 0c7c4492

Marko Mäkelä authored Dec 09, 2020

Since commit ea21d630 we
conditionally define a variable that only plays a role on
systems that support hole-punching (explicit creation of sparse files).
However, that broke debug builds on such systems.

It turns out that the debug_dbug label "ignore_punch_hole" is
not at all used in MariaDB server. It would be covered by
the MySQL 5.7 test innodb.table_compress. (Note: MariaDB 10.1
implemented page_compressed tables before something comparable
appeared in MySQL 5.7.)

0c7c4492

MDEV-12227 Defer writes to the InnoDB temporary tablespace · 5eb53955

Marko Mäkelä authored Dec 09, 2020

The flushing of the InnoDB temporary tablespace is unnecessarily
tied to the write-ahead redo logging and redo log checkpoints,
which must be tied to the page writes of persistent tablespaces.

Let us simply omit any pages of temporary tables from buf_pool.flush_list.
In this way, log checkpoints will never incur any 'collateral damage' of
writing out unmodified changes for temporary tables.

After this change, pages of the temporary tablespace can only be written
out by buf_flush_lists(n_pages,0) as part of LRU eviction. Hopefully,
most of the time, that code will never be executed, and instead, the
temporary pages will be evicted by buf_release_freed_page() without
ever being written back to the temporary tablespace file.

This should improve the efficiency of the checkpoint flushing and
the buf_flush_page_cleaner thread.

Reviewed by: Vladislav Vaintroub

5eb53955

08 Dec, 2020 3 commits

Fix -Wunused-but-set-variable · ea21d630
Marko Mäkelä authored Dec 08, 2020

ea21d630

MDEV-24369 Page cleaner sleeps despite innodb_max_dirty_pages_pct_lwm being exceeded · f0c295e2

Marko Mäkelä authored Dec 08, 2020

MDEV-24278 improved the page cleaner so that it will no longer wake up
once per second on an idle server. However, with innodb_adaptive_flushing
(the default) the function page_cleaner_flush_pages_recommendation()
could initially return 0 even if there is work to do.

af_get_pct_for_dirty(): Remove. Based on a comment here, it appears
that an initial intention of innodb_max_dirty_pages_pct_lwm=0.0
(the default value) was to disable something. That ceased to hold in
MDEV-23855: the value is a pure threshold; the page cleaner will not
perform any work unless the threshold is exceeded.

page_cleaner_flush_pages_recommendation(): Add the parameter dirty_blocks
to ensure that buf_pool.flush_list will eventually be emptied.

f0c295e2

MDEV-24351: S3, same-backend replication: Dropping a table on master... · 6859e80d

Sergei Petrunia authored Dec 08, 2020

..causes error on slave.
Cause: if the master doesn't have the frm file for the table,
DROP TABLE code will call ha_delete_table_force() to drop the table
in all available storage engines.
The issue was that this code path didn't check for
HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE flag for the storage engine,
and so did not add "... IF EXISTS" to the statement that's written
to the binary log.  This can cause error on the slave when it tries to
drop a table that's already gone.
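
As a rough illustration of the intended behaviour, the statement written to the binary log could be chosen like this. The helper below is hypothetical and is not the server's binlogging code; the boolean parameter stands for the storage engine having the HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE flag set.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: build the DROP statement to be logged.
// may_not_exist_on_slave models the HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE flag.
std::string drop_statement_for_binlog(const std::string &table,
                                      bool may_not_exist_on_slave)
{
  return may_not_exist_on_slave ? "DROP TABLE IF EXISTS " + table
                                : "DROP TABLE " + table;
}
```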

6859e80d

07 Dec, 2020 1 commit

Simplify clang workarounds. · 3ee24b23
Vladislav Vaintroub authored Dec 07, 2020

3ee24b23

04 Dec, 2020 2 commits

MDEV-24350 buf_dblwr unnecessarily uses memory-intensive srv_stats counters · 83591a23

Marko Mäkelä authored Dec 04, 2020

The counters in srv_stats use std::atomic and multiple cache lines per
counter. This is overkill in a case where a critical section already
exists in the code. A regular variable will work just fine, with much
smaller memory bus impact.
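
A minimal sketch of the idea, assuming a counter that is only ever updated inside an existing critical section. The class and member names are hypothetical and do not correspond to buf_dblwr_t.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical batch buffer whose state is already protected by a mutex.
class WriteBatch
{
  std::mutex mutex_;
  std::size_t pages_written_ = 0; // plain variable, protected by mutex_

public:
  void add_pages(std::size_t n)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    // Other work on the batch would happen here, in the same critical
    // section, so the counter needs no atomic operations of its own.
    pages_written_ += n;
  }

  std::size_t pages_written()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return pages_written_;
  }
};
```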

83591a23

MDEV-24348 InnoDB shutdown hang with innodb_flush_sync=0 · aa0e3805

Marko Mäkelä authored Dec 04, 2020

This hang was caused by MDEV-23855, and we failed to fix it in
MDEV-24109 (commit 4cbfdeca).

When buf_flush_ahead() is invoked soon before server shutdown
and the non-default setting innodb_flush_sync=OFF is in effect
and the buffer pool contains dirty pages of temporary tables,
the page cleaner thread may remain in an infinite loop
without completing its work, thus causing the shutdown to hang.

buf_flush_page_cleaner(): If the buffer pool contains no
unmodified persistent pages, ensure that buf_flush_sync_lsn=0
will be assigned, so that shutdown will proceed.

The test case is not deterministic. On my system, it reproduced
the hang with 95% probability when running multiple instances
of the test in parallel, and 4% when running single-threaded.

Thanks to Eugene Kosov for debugging and testing this.

aa0e3805

03 Dec, 2020 2 commits

Fixed usage of not initialized memory in LIKE ... ESCAPE · 6033cc85

Monty authored Dec 03, 2020

This was noticed when running "mtr --valgrind main.precedence".

The problem was that Item_func_like::escape could be left uninitialized
when used with views combined with UNIONs, as in:

create or replace view v1 as select 2 LIKE 1 ESCAPE 3 IN (SELECT 0 UNION SELECT 1), 2 LIKE 1 ESCAPE (3 IN (SELECT 0 UNION SELECT 1)), (2 LIKE 1 ESCAPE 3) IN (SELECT 0 UNION SELECT 1);

In fix_escape_item(), the above query causes
escape_item->const_during_execution() to be true
and
escape_item->const_item() to be false,

in which case 'escape' is never calculated.

The fix is to move the main logic of fix_escape_item() out to a
separate function and call that function once in Item.

Other things:
- Reorganized fields in Item_func_like class to make it more compact

6033cc85

MDEV-22929 fixup: root_name() clash with clang++ <fstream> · f146969f

Marko Mäkelä authored Dec 03, 2020

The clang++ -stdlib=libc++ header file <fstream> depends on
<filesystem> that defines a member function path::root_name(),
which conflicts with the rather unused #define root_name()
that had been introduced in
commit 7c58e97b.

Because an instrumented -stdlib=libc++ (rather than the default
-stdlib=libstdc++) is easier to build for a working -fsanitize=memory
(cmake -DWITH_MSAN=ON), let us remove the conflicting #define for now.
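
The kind of clash described above can be reproduced in miniature. This sketch uses a hypothetical `Path` type and a hypothetical macro body; it is not the removed MariaDB macro or the libc++ header. It shows that a function-like macro rewrites every later call of the same name, including member function calls.

```cpp
#include <cassert>

// Hypothetical type with a member function named root_name().
struct Path
{
  int root_name() const { return 1; }
  int other_name() const { return 2; }
};

// A function-like macro with the same name (illustrative body only).
#define root_name() other_name()

// The preprocessor rewrites this call to p.other_name().
int call_with_macro(const Path &p) { return p.root_name(); }

#undef root_name

// Without the macro, the member function is called as written.
int call_without_macro(const Path &p) { return p.root_name(); }
```

When such a macro is defined before a library header is included, the declaration in the header is rewritten as well, which is why removing the macro avoids the conflict.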

f146969f

02 Dec, 2020 5 commits

MDEV-24295: Fix the non-clang build · f3a58ed8

Marko Mäkelä authored Dec 02, 2020

Sorry, only tested commit 4174fc1a
on clang. Other compilers do not define __has_feature().

f3a58ed8

MDEV-24295: Fix the WITH_MSAN build · 4174fc1a

Marko Mäkelä authored Dec 02, 2020

For some reason, commit 5bb5d4ad
made clang++-11 unhappy about a constexpr declaration.

4174fc1a

MDEV-20051 fixup: Correct galera.galera_defaults result · 9b725f9a

Marko Mäkelä authored Dec 02, 2020

For some reason, the test was never adjusted for
commit e6a50e41.

9b725f9a

Merge 10.4 into 10.5 · 6a1e655c
Marko Mäkelä authored Dec 02, 2020

6a1e655c

MDEV-15532 after-merge fixes from Monty · 24ec8eaf

Marko Mäkelä authored Dec 02, 2020

The Galera tests were massively failing with debug assertions.

24ec8eaf
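
Regarding the non-clang build fix f3a58ed8 above: the commit message does not say how the build was fixed. One common way to write portable code around __has_feature() is to provide a fallback definition, as in the sketch below. The variable name `built_with_msan` is an assumption made for illustration.

```cpp
#include <cassert>

// __has_feature() originated as a clang extension. Give compilers
// that do not define it a fallback that always evaluates to 0.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(memory_sanitizer)
constexpr bool built_with_msan = true;  // MemorySanitizer instrumentation is on
#else
constexpr bool built_with_msan = false; // no MSAN, or a compiler without it
#endif
```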
01 Dec, 2020 7 commits

Merge 10.3 into 10.4 · 589cf8db
Marko Mäkelä authored Dec 01, 2020

589cf8db

MDEV-22929 MariaBackup option to report and/or continue when corruption is encountered · e30a05f4

Vlad Lesin authored Dec 01, 2020

Post-push Windows compilation errors fix.

e30a05f4

After merge fixes · 7edfed63

Monty authored Dec 01, 2020

Change thd->mdl_context.release_transactional_locks() to
thd->mdl_release_transactional_locks()

7edfed63

MDEV-24323 Crash on recovery after kill during instant ADD COLUMN · 73f34336

Marko Mäkelä authored Dec 01, 2020

row_undo_ins_parse_undo_rec(): Do not try to read non-existing
virtual column information for the metadata record.

73f34336

Merge 10.2 into 10.3 · 81ab9ea6
Marko Mäkelä authored Dec 01, 2020

81ab9ea6

MDEV-21962 fixup: Remove buf_pool_contains_zip() · e76e1288

Marko Mäkelä authored Dec 01, 2020

The replacement is buf_pool.contains_zip().

e76e1288

MDEV-22929 MariaBackup option to report and/or continue when corruption is encountered · e6b3e38d

Vlad Lesin authored Aug 20, 2020

The new option --log-innodb-page-corruption is introduced.

When this option is set, the backup is not interrupted if a corrupted
InnoDB page is detected. Instead, it logs all corrupted pages found in the
innodb_corrupted_pages file in the backup directory and finishes with an error.

For an incremental backup, corrupted pages are also copied to the .delta file,
because we can't do an LSN check for such pages during backup.
The innodb_corrupted_pages file will also be created in the incremental backup
directory.

During --prepare, the list of corrupted pages is read from the file just after
the redo log is applied, and each page in the list is checked for whether it is
allocated in its tablespace. If it is not allocated, it is zeroed out,
flushed to the tablespace and removed from the list. If all pages are removed
from the list, --prepare finishes successfully and the
innodb_corrupted_pages file is removed from the backup directory. Otherwise,
--prepare finishes with an error message and innodb_corrupted_pages contains
the list of pages that were detected as corrupted during backup and are
allocated in their tablespaces, which means that the backup directory contains
corrupted InnoDB pages and the backup cannot be considered consistent.
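
The --prepare handling of the corrupted pages list can be sketched as follows. This is not MariaBackup code: `PageId`, `process_corrupted_pages()` and the two callbacks are hypothetical names, and the callbacks stand in for the real allocation check and the zero-and-flush step.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical page identifier: tablespace id and page number.
struct PageId
{
  unsigned space;
  unsigned page_no;
};

// Process the list read from innodb_corrupted_pages.
// Pages that are not allocated are zeroed out and dropped from the list.
// Returns true if the list ends up empty, that is, --prepare may succeed.
bool process_corrupted_pages(
    std::vector<PageId> &corrupted,
    const std::function<bool(const PageId &)> &is_allocated,
    const std::function<void(const PageId &)> &zero_out_and_flush)
{
  std::vector<PageId> remaining;
  for (const PageId &page : corrupted)
  {
    if (is_allocated(page))
      remaining.push_back(page); // a really corrupted, still used page
    else
      zero_out_and_flush(page);  // harmless: the page is not in use
  }
  corrupted.swap(remaining);
  return corrupted.empty();
}
```

If the function returns false, the remaining entries are the pages that make the backup inconsistent.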

For incremental --prepare, corrupted pages from .delta files are applied
to the base backup, innodb_corrupted_pages is read from both the base and
incremental directories, and the same processing is performed on the corrupted
pages list as for a full --prepare. The innodb_corrupted_pages file is
modified or removed only in the base directory.

If DDL happens during backup, it is also processed at the end of backup
to have correct tablespace names in innodb_corrupted_pages.

e6b3e38d

30 Nov, 2020 11 commits

MDEV-15532 Assertion `!log->same_pk' failed in row_log_table_apply_delete · 828471cb

Monty authored Nov 30, 2020

The reason for the failure is that
thd->mdl_context.release_transactional_locks()
was called after commit & rollback even in cases where the current
transaction is still active.

For 10.2, 10.3 and 10.4 the fix is simple:
- Replace all calls to thd->mdl_context.release_transactional_locks() with
  thd->release_transactional_locks(). The thd function will only call
  the mdl_context function if there are no active transactional locks.
  In 10.6 we will make a better fix, changing the return value of
  some trans_xxx() functions to indicate whether the call closed the
  transaction or not. This will avoid the need for the indirect call.

Other things:
- trans_xa_commit() and trans_xa_rollback() will automatically
  call release_transactional_locks() if the transaction is closed.
- We can't do that for the other functions as the caller of many of these
  are doing additional work (like close_thread_tables) before calling
  release_transactional_locks().
- Added missing abort_result_set() and missing DBUG_RETURN in
  select_create::send_eof()
- Fixed wrong indentation in injector::transaction::commit()
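
The wrapper described in the first fix above might look roughly like this. The types and the `transaction_active` flag are simplified, hypothetical stand-ins for THD and its MDL context; the actual condition checked by the server may differ.

```cpp
#include <cassert>

// Hypothetical stand-in for the metadata lock context.
struct MdlContext
{
  bool released = false;
  void release_transactional_locks() { released = true; }
};

// Hypothetical stand-in for THD.
struct Thd
{
  MdlContext mdl_context;
  bool transaction_active = false; // assumed flag, for illustration only

  // Release the transactional locks only if no transaction is still active.
  void release_transactional_locks()
  {
    if (!transaction_active)
      mdl_context.release_transactional_locks();
  }
};
```

Callers use the wrapper instead of calling the MDL context directly, so the locks survive as long as the transaction is still open.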

828471cb

Fixed maria.create test · c5375764
Monty authored Nov 30, 2020

c5375764

MDEV-15532 Assertion `!log->same_pk' failed in row_log_table_apply_delete · a3531775

Monty authored Nov 30, 2020

The real fix for MDEV-15532 will be pushed into 10.2 and 10.6.
This is an additional fix for 10.4.

In 10.4 trans_xa_detach was introduced.  However THD::cleanup() assumes
that after trans_xa_detach() is done, there are no registered transactions
anymore. In the 10.2 patch there will be an assert to ensure this, which
will cause 10.4 to fail.

The fix used is to reset the transaction flags in trans_xa_detach().

a3531775

Fixed maria.create test · 6261b1f4
Monty authored Nov 30, 2020

6261b1f4

Clarify some comments. · 1435f35b

Vladislav Vaintroub authored Nov 27, 2020

- The intention of the my_getevents syscall is now better explained:
why we are using it (to be able to interrupt the io_getevents syscall via
io_destroy()).

- Fix comment for MAX_EVENTS in getevent_thread_routine.
MAX_EVENTS is a more or less arbitrary constant, chosen such that the events
array is big enough to get multiple simultaneous I/O completions, but small
enough that it does not blow the thread's stack.

1435f35b

MDEV-24295 Reduce wakeups by tpool maintenance timer, when server is idle · 5bb5d4ad

Vladislav Vaintroub authored Nov 26, 2020

If the maintenance timer does not do much for a prolonged time, it will
wake up less frequently, once every 4 seconds instead of once every 0.4
seconds.

It will wake up more often if thread creation is throttled, to avoid stalls.
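
A minimal sketch of such an adaptive timer period follows. Only the two periods (0.4 and 4 seconds) come from the message above; the class, the idle threshold of 10 ticks and the `did_work` input are assumptions made for illustration and do not reflect the tpool implementation.

```cpp
#include <cassert>
#include <chrono>

constexpr std::chrono::milliseconds kNormalPeriod{400}; // once every 0.4 s
constexpr std::chrono::milliseconds kIdlePeriod{4000};  // once every 4 s
constexpr unsigned kIdleTicksThreshold = 10;            // assumed threshold

// Hypothetical maintenance timer that slows down when there is little to do.
class MaintenanceTimer
{
  unsigned idle_ticks_ = 0;

public:
  // Called on every timer tick; returns the period until the next tick.
  std::chrono::milliseconds on_tick(bool did_work,
                                    bool thread_creation_throttled)
  {
    if (did_work || thread_creation_throttled)
      idle_ticks_ = 0; // stay responsive, to avoid stalls
    else if (idle_ticks_ < kIdleTicksThreshold)
      ++idle_ticks_;
    return idle_ticks_ >= kIdleTicksThreshold ? kIdlePeriod : kNormalPeriod;
  }
};
```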

5bb5d4ad

Disable mysqldump-system.test if auth socket plugin is not dynamic · 37352c4b
Monty authored Nov 27, 2020

37352c4b

Make LEX::print support single-table DELETE. · 11196347
Sergei Petrunia authored Nov 30, 2020

11196347

MDEV-24308: Revert for Windows · e34e53b5

Marko Mäkelä authored Nov 30, 2020

For some reason, InnoDB debug tests on Windows fail due to rw_lock_t
if the function call overhead for some os_thread_ code is removed.

This change worked fine on Windows in combination with MDEV-24142.

e34e53b5

MDEV-21265: IN predicate conversion to IN subquery should be allowed for a... · b4379df5

Varun Gupta authored Nov 27, 2020

MDEV-21265: IN predicate conversion to IN subquery should be allowed for a broader set of datatype comparison

Allow materialization strategy when collations on the
inner and outer sides of an IN subquery are the same and the
character set of the inner side is a proper subset of the character
set on the outer side.
This allows conversion from utf8mb3 to utf8mb4,
as the former is a subset of the latter.
This is only allowed when an IN predicate is converted to an IN subquery.

Backported part of the patch (d6a00d9b) of MDEV-17905.

b4379df5

MDEV-24308: Remove some os_thread_ functions · 8fa6e363

Marko Mäkelä authored Nov 30, 2020

os_thread_pf(): Remove.

os_thread_eq(), os_thread_yield(), os_thread_get_curr_id():
Define as macros.

ut_print_timestamp(), ut_sprintf_timestamp(): Simplify.

8fa6e363

27 Nov, 2020 1 commit

MDEV-24242 Query returns wrong result while using big_tables=1 · b92391d5

Igor Babaev authored Nov 24, 2020

When executing set operations in a pipeline using only one temporary table,
additional scans of intermediate results may be needed. The scans are
performed using the rnd_next() handler function, which might
leave the record buffers used for the temporary table in a state that
is not suitable for subsequent writes into the table. For example, this
happens for the Aria engine when the last call of rnd_next() encounters only
deleted records. Thus a cleanup of record buffers is needed after each such
scan of the temporary table.

Approved by Oleksandr Byelkin <sanja@mariadb.com>

b92391d5

26 Nov, 2020 2 commits

Fixed compiler warnings from crc32c.cc · 1555c6d1
Monty authored Nov 26, 2020

1555c6d1

Avoid some DBUG prints from idle server in thread pool · 279b5f87
Monty authored Nov 24, 2020

279b5f87