Commits · 2534b5cb995aeca18cd7cf4db63cddff5f2f6788 · nexedi / MariaDB

An error occurred fetching the project authors.

20 Dec, 2017 1 commit

MDEV-14717 RENAME TABLE in InnoDB is not crash-safe · 0bc36758

Marko Mäkelä authored 7 years ago

InnoDB in MariaDB 10.2 appears to only write MLOG_FILE_RENAME2
redo log records during table-rebuilding ALGORITHM=INPLACE operations.
We must write the records for any .ibd file renames, so that the
operations are crash-safe.

If InnoDB is killed during a RENAME TABLE operation, it can happen that
the transaction for updating the data dictionary will be rolled back.
But, nothing will roll back the renaming of the .ibd file
(the MLOG_FILE_RENAME2 only guarantees roll-forward), or for that matter,
the renaming of the dict_table_t::name in the dict_sys cache. We introduce
the undo log record TRX_UNDO_RENAME_TABLE to fix this.

fil_space_for_table_exists_in_mem(): Remove the parameters
adjust_space, table_id and some code that was trying to work around
these deficiencies.

fil_name_write_rename(): Write a MLOG_FILE_RENAME2 record.

dict_table_rename_in_cache(): Invoke fil_name_write_rename().

trx_undo_rec_copy(): Set the first 2 bytes to the length of the
copied undo log record.

trx_undo_page_report_rename(), trx_undo_report_rename():
Write a TRX_UNDO_RENAME_TABLE record with the old table name.

row_rename_table_for_mysql(): Invoke trx_undo_report_rename()
before modifying any data dictionary tables.

row_undo_ins_parse_undo_rec(): Roll back TRX_UNDO_RENAME_TABLE
by invoking dict_table_rename_in_cache(), which will take care
of both renaming the table and the file.

0bc36758

08 Dec, 2017 1 commit

Remove space_name_list_t · 39450498

Marko Mäkelä authored 7 years ago

fil_get_space_names(): Remove.

fts_drop_orphaned_tables(): Iterate fil_system->space_list directly.

39450498

20 Nov, 2017 1 commit

MDEV-14310 Possible corruption by table-rebuilding or index-creating ALTER TABLE…ALGORITHM=INPLACE · ce64a65f

Marko Mäkelä authored 7 years ago

Also, MDEV-14317 When ALTER TABLE is aborted, do not write garbage pages to data files

As pointed out by Shaohua Wang, the merge of MDEV-13328 from
MariaDB 10.1 (based on MySQL 5.6) to 10.2 (based on 5.7)
was performed incorrectly.

Let us always pass a non-NULL FlushObserver* when writing
to data files is desired.

FlushObserver::is_partial_flush(): Check if this is a bulk-load
(partial flush of the tablespace).

FlushObserver::is_interrupted(): Check for interrupt status.

buf_LRU_flush_or_remove_pages(): Instead of trx_t*, take
FlushObserver* as a parameter.

buf_flush_or_remove_pages(): Remove the parameters flush, trx.
If observer!=NULL, write out the data pages. Use the new predicate
observer->is_partial() to distinguish a partial tablespace flush
(after bulk-loading) from a full tablespace flush (export).
Return a bool (whether all pages were removed from the flush_list).

buf_flush_dirty_pages(): Remove the parameter trx.

ce64a65f

06 Nov, 2017 1 commit

MDEV-13328 ALTER TABLE…DISCARD TABLESPACE takes a lot of time · 51b4366b

Marko Mäkelä authored 7 years ago

With a big buffer pool that contains many data pages,
DISCARD TABLESPACE took a long time, because it would scan the
entire buffer pool to remove any pages that belong to the tablespace.
With a large buffer pool, this would take a lot of time, especially
when the table-to-discard is empty.

The minimum amount of work that DISCARD TABLESPACE must do is to
remove the pages of the to-be-discarded table from the
buf_pool->flush_list because any writes to the data file must be
prevented before the file is deleted.

If DISCARD TABLESPACE does not evict the pages from the buffer pool,
then IMPORT TABLESPACE must do it, because we must prevent pre-DISCARD,
not-yet-evicted pages from being mistaken for pages of the imported
tablespace.

It would not be a useful fix to simply move the buffer pool scan to
the IMPORT TABLESPACE step. What we can do is to actively evict those
pages that could be mistaken for imported pages. In this way, when
importing a small table into a big buffer pool, the import should
still run relatively fast.

Import is bypassing the buffer pool when reading pages for the
adjustment phase. In the adjustment phase, if a page exists in
the buffer pool, we could replace it with the page from the imported
file. Unfortunately I did not get this to work properly, so instead
we will simply evict any matching page from the buffer pool.

buf_page_get_gen(): Implement BUF_EVICT_IF_IN_POOL, a new mode
where the requested page will be evicted if it is found. There
must be no unwritten changes for the page.

buf_remove_t: Remove. Instead, use trx!=NULL to signify that a write
to file is desired, and use a separate parameter bool drop_ahi.

buf_LRU_flush_or_remove_pages(), fil_delete_tablespace():
Replace buf_remove_t.

buf_LRU_remove_pages(), buf_LRU_remove_all_pages(): Remove.

PageConverter::m_mtr: A dummy mini-transaction buffer

PageConverter::PageConverter(): Complete the member initialization list.

PageConverter::operator()(): Evict any 'shadow' pages from the
buffer pool so that pre-existing (garbage) pages cannot be mistaken
for pages that exist in the being-imported file.

row_discard_tablespace(): Remove a bogus comment that seems to
refer to IMPORT TABLESPACE, not DISCARD TABLESPACE.

51b4366b

17 Oct, 2017 1 commit
- MDEV-14082 Enforcing innodb_open_files leads to fil_system->mutex problem · a6310222
  Marko Mäkelä authored 7 years ago
```
fil_mutex_enter_and_prepare_for_io(): Reacquire fil_system->mutex after
failing to close a file and before retrying.
```
  a6310222
11 Oct, 2017 1 commit

Fix a warning about unused variable · 18a979df

Marko Mäkelä authored 7 years ago

fil_ibd_create(): commit b731a5bc
introduced a variable that was unused outside Windows. Use it
on all platforms.

18a979df

10 Oct, 2017 1 commit

Innodb : Refactor os_file_set_size() to be compatible 10.1 · b731a5bc

Vladislav Vaintroub authored 7 years ago

The last parameter to this function is now,"bool is_sparse", like in 10.1
rather than the  unused/useless "bool is_readonly", merged from MySQL 5.7

Like in 10.1, this function now supports sparse files, and efficient
platform specific mechanisms for file extension

os_file_set_size() is now consistenly used in all places where
innodb files are extended.

b731a5bc

07 Oct, 2017 1 commit
- Refactor os_file_set_size to extend already existing files. · 420798a8
  Vladislav Vaintroub authored 7 years ago
```
Change fil_space_extend_must_retry() to use this function.
```
  420798a8
03 Oct, 2017 1 commit

MDEV-13901 Assertion `!space->stop_new_ops' failed in TRUNCATE TABLE with many indexes · b7162312

Marko Mäkelä authored 7 years ago

fil_space_extend_must_retry(): If the table is being truncated,
do not call fil_flush_low(). The operation is covered by the
truncate log. File extension during TRUNCATE only occurs
if there are many indexes on the table. With smaller innodb_page_size,
the file extension occurs already with fewer indexes on the table.

b7162312

29 Sep, 2017 1 commit

MDEV-13941 Fix high NTFS fragmentation on 10.2 · 96b9c617

Vladislav Vaintroub authored 7 years ago

Prior to this patch, creating or even opening any innodb file in 10.2
would set a sparse flag on file. The file extension was done by setting
end of file, without writing zeros. This technique is fine, however
due to sparsedness, it created a hole at the end of the file, which
lead to much higher fragmentation subsequently.

The fix is only to use sparse flag for compressed tables, where holes
are actually wanted, but not for normal tables.

96b9c617

31 Aug, 2017 3 commits

MDEV-12741: innodb.ibuf_not_empty failed in buildbot with "InnoDB: Trying to... · aa22981d

Jan Lindström authored 7 years ago

MDEV-12741: innodb.ibuf_not_empty failed in buildbot with "InnoDB: Trying to do I/O to a tablespace which does not exist"

Background thread is doing ibuf merge, in buf0rea.cc buf_read_ibuf_merge_pages().
It first tries to get page_size and if space is not found it deletes them, but
as we do not hold any mutexes, space can be marked as stopped between that
and buf_read_page_low() for same space. This naturally leads seen error
message on log.

buf_read_page_low(): Add parameter ignore_missing_space = false that
is passed to fil_io()

buf_read_ibuf_merge_pages(): call buf_read_page_low with
ignore_missing_space = true, this function will handle missing
space error code after buf_read_page_low returns.

fil_io(): if ignore_missing_space = true do not print error
message about trying to do I/0 for missing space, just return
correct error code that is handled later.

aa22981d

Add ATTRIBUTE_NORETURN and ATTRIBUTE_COLD · 4386ee8c

Marko Mäkelä authored 7 years ago

ATTRIBUTE_NORETURN is supported on all platforms (MSVS and GCC-like).
It declares that a function will not return; instead, the thread or
the whole process will terminate.

ATTRIBUTE_COLD is supported starting with GCC 4.3. It declares that
a function is supposed to be executed rarely. Rarely used error-handling
functions and functions that emit messages to the error log should be
tagged such.

4386ee8c

MDEV-13557: Startup failure, unable to decrypt ibdata1 · eca238ae

Jan Lindström authored 7 years ago

Fixes also MDEV-13488: InnoDB writes CRYPT_INFO even though
encryption is not enabled.

Fixes also MDEV-13093: Leak of Datafile::m_crypt_info on
shutdown after failed startup.

Problem was that we created encryption metadata (crypt_data) for
system tablespace even when no encryption was enabled and too early.
System tablespace can be encrypted only using key rotation.

Test innodb-key-rotation-disable, innodb_encryption, innodb_lotoftables
require adjustment because INFORMATION_SCHEMA INNODB_TABLESPACES_ENCRYPTION
contain row only if tablespace really has encryption metadata.

xb_load_single_table_tablespace(): Do not call
fil_space_destroy_crypt_data() any more, because Datafile::m_crypt_data
has been removed.

fil_crypt_realloc_iops(): Avoid divide by zero.

fil_crypt_set_thread_cnt(): Set fil_crypt_threads_event if
encryption threads exist. This is required to find tablespaces
requiring key rotation if no other changes happen.

fil_crypt_find_space_to_rotate(): Decrease the amount of time waiting
when nothing happens to better enable key rotation on startup.

fil_ibd_open(), fil_ibd_load(): Load possible crypt_data from first
page.

class Datafile, class SysTablespace : remove m_crypt_info field.

Datafile::get_first_page(): Return a pointer to first page buffer.

fsp_header_init(): Write encryption metadata to page 0 only if
tablespace is encrypted or encryption is disabled by table option.

i_s_dict_fill_tablespaces_encryption(): Skip tablespaces that do not
contain encryption metadata. This is required to avoid too early
wait condition trigger in encrypted -> unencrypted state transfer.

eca238ae

29 Aug, 2017 1 commit

MDEV-13557: Startup failure, unable to decrypt ibdata1 · 352d27ce

Jan Lindström authored 7 years ago

Fixes also MDEV-13488: InnoDB writes CRYPT_INFO even though
encryption is not enabled.

Problem was that we created encryption metadata (crypt_data) for
system tablespace even when no encryption was enabled and too early.
System tablespace can be encrypted only using key rotation.

Test innodb-key-rotation-disable, innodb_encryption, innodb_lotoftables
require adjustment because INFORMATION_SCHEMA INNODB_TABLESPACES_ENCRYPTION
contain row only if tablespace really has encryption metadata.

fil_crypt_set_thread_cnt: Send message to background encryption threads
if they exits when they are ready. This is required to find tablespaces
requiring key rotation if no other changes happen.

fil_crypt_find_space_to_rotate: Decrease the amount of time waiting
when nothing happens to better enable key rotation on startup.

fsp_header_init: Write encryption metadata to page 0 only if tablespace is
encrypted or encryption is disabled by table option.

i_s_dict_fill_tablespaces_encryption : Skip tablespaces that do not
contain encryption metadata. This is required to avoid too early
wait condition trigger in encrypted -> unencrypted state transfer.

open_or_create_data_files: Do not create encryption metadata
by default to system tablespace.

352d27ce

28 Aug, 2017 1 commit

MDEV-13591: InnoDB: Database page corruption on disk or a failed file read and assertion failure · 61096ff2

Jan Lindström authored 7 years ago

Problem is that page 0 and its possible enrryption information
is not read for undo tablespaces.

fil_crypt_get_latest_key_version(): Do not send event to
encryption threads if event does not yet exists. Seen
on regression testing.

fil_read_first_page: Add new parameter does page belong to
undo tablespace and if it does, we do not read FSP_HEADER.

srv_undo_tablespace_open : Read first page of the tablespace
to get crypt_data if it exists and pass it to fil_space_create.

Tested using innodb_encryption with combinations with
innodb-undo-tablespaces.

61096ff2

10 Aug, 2017 1 commit

Fix some GCC 7 warnings for InnoDB · bfffe571

Marko Mäkelä authored 7 years ago

buf_page_io_complete(): Do not test bpage for NULL, because
it is declared (and always passed) as nonnull.

buf_flush_batch(): Remove the constant local variable count=0.

fil_ibd_load(): Use magic comment to suppress -Wimplicit-fallthrough.

ut_stage_alter_t::inc(ulint): Disable references to an unused parameter.

lock_queue_validate(), sync_array_find_thread(), rbt_check_ordering():
Define only in debug builds.

bfffe571

09 Aug, 2017 1 commit

Bug #25357789 INNODB: LATCH ORDER VIOLATION DURING TRUNCATE TABLE IF INNODB_SYNC_DEBUG ENABLED · 9d57468d

Thirunarayanan Balathandayuthapani authored 7 years ago

Analysis:
========

(1) During TRUNCATE of file_per_table tablespace, dict_operation_lock is
released before eviction of dirty pages of a tablespace from the buffer
pool. After eviction, we try to re-acquire
dict_operation_lock (higher level latch) but we already hold lower
level latch (index->lock). This causes latch order violation

(2) Deadlock issue is present if child table is being truncated and it
holds index lock. At the same time, cascade dml happens and it took
dict_operation_lock and waiting for index lock.

Fix:
====
1) Release the indexes lock before releasing the dict operation lock.

2) Ignore the cascading dml operation on the parent table, for the
cascading foreign key, if the child table is truncated or if it is
in the process of being truncated.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Reviewed-by: Kevin Lewis <kevin.lewis@oracle.com>
RB: 16122

9d57468d

05 Jul, 2017 3 commits

MDEV-13105 InnoDB fails to load a table with PAGE_COMPRESSION_LEVEL after upgrade from 10.1.20 · e555540a

Marko Mäkelä authored 7 years ago

When using innodb_page_size=16k, InnoDB tables
that were created in MariaDB 10.1.0 to 10.1.20 with
PAGE_COMPRESSED=1 and
PAGE_COMPRESSION_LEVEL=2 or PAGE_COMPRESSION_LEVEL=3
would fail to load.

fsp_flags_is_valid(): When using innodb_page_size=16k, use a
more strict check for .ibd files, with the assumption that
nobody would try to use different-page-size files.

e555540a

MDEV-13105 InnoDB fails to load a table with PAGE_COMPRESSION_LEVEL after upgrade from 10.1.20 · e3d31477

Marko Mäkelä authored 7 years ago

When using innodb_page_size=16k, InnoDB tables
that were created in MariaDB 10.1.0 to 10.1.20 with
PAGE_COMPRESSED=1 and
PAGE_COMPRESSION_LEVEL=2 or PAGE_COMPRESSION_LEVEL=3
would fail to load.

fsp_flags_is_valid(): When using innodb_page_size=16k, use a
more strict check for .ibd files, with the assumption that
nobody would try to use different-page-size files.

e3d31477

MDEV-12548 Initial implementation of Mariabackup for MariaDB 10.2 · 8c71c6aa

Marko Mäkelä authored 7 years ago

InnoDB I/O and buffer pool interfaces and the redo log format
have been changed between MariaDB 10.1 and 10.2, and the backup
code has to be adjusted accordingly.

The code has been simplified, and many memory leaks have been fixed.
Instead of the file name xtrabackup_logfile, the file name ib_logfile0
is being used for the copy of the redo log. Unnecessary InnoDB startup and
shutdown and some unnecessary threads have been removed.

Some help was provided by Vladislav Vaintroub.

Parameters have been cleaned up and aligned with those of MariaDB 10.2.

The --dbug option has been added, so that in debug builds,
--dbug=d,ib_log can be specified to enable diagnostic messages
for processing redo log entries.

By default, innodb_doublewrite=OFF, so that --prepare works faster.
If more crash-safety for --prepare is needed, double buffering
can be enabled.

The parameter innodb_log_checksums=OFF can be used to ignore redo log
checksums in --backup.

Some messages have been cleaned up.
Unless --export is specified, Mariabackup will not deal with undo log.
The InnoDB mini-transaction redo log is not only about user-level
transactions; it is actually about mini-transactions. To avoid confusion,
call it the redo log, not transaction log.

We disable any undo log processing in --prepare.

Because MariaDB 10.2 supports indexed virtual columns, the
undo log processing would need to be able to evaluate virtual column
expressions. To reduce the amount of code dependencies, we will not
process any undo log in prepare.

This means that the --export option must be disabled for now.

This also means that the following options are redundant
and have been removed:
	xtrabackup --apply-log-only
	innobackupex --redo-only

In addition to disabling any undo log processing, we will disable any
further changes to data pages during --prepare, including the change
buffer merge. This means that restoring incremental backups should
reliably work even when change buffering is being used on the server.
Because of this, preparing a backup will not generate any further
redo log, and the redo log file can be safely deleted. (If the
--export option is enabled in the future, it must generate redo log
when processing undo logs and buffered changes.)

In --prepare, we cannot easily know if a partial backup was used,
especially when restoring a series of incremental backups. So, we
simply warn about any missing files, and ignore the redo log for them.

FIXME: Enable the --export option.

FIXME: Improve the handling of the MLOG_INDEX_LOAD record, and write
a test that initiates a backup while an ALGORITHM=INPLACE operation
is creating indexes or rebuilding a table. An error should be detected
when preparing the backup.

FIXME: In --incremental --prepare, xtrabackup_apply_delta() should
ensure that if FSP_SIZE is modified, the file size will be adjusted
accordingly.

8c71c6aa

15 Jun, 2017 2 commits

MDEV-12873 InnoDB SYS_TABLES.TYPE incompatibility for PAGE_COMPRESSED=YES in... · 72378a25

Marko Mäkelä authored 7 years ago

MDEV-12873 InnoDB SYS_TABLES.TYPE incompatibility for PAGE_COMPRESSED=YES in MariaDB 10.2.2 to 10.2.6

Remove the SHARED_SPACE flag that was erroneously introduced in
MariaDB 10.2.2, and shift the SYS_TABLES.TYPE flags back to where
they were before MariaDB 10.2.2. While doing this, ensure that
tables created with affected MariaDB versions can be loaded,
and also ensure that tables created with MySQL 5.7 using the
TABLESPACE attribute cannot be loaded.

MariaDB 10.2.2 picked the SHARED_SPACE flag from MySQL 5.7,
shifting the MariaDB 10.1 flags PAGE_COMPRESSION, PAGE_COMPRESSION_LEVEL,
ATOMIC_WRITES by one bit. The SHARED_SPACE flag would always
be written as 0 by MariaDB, because MariaDB does not support
CREATE TABLESPACE or CREATE TABLE...TABLESPACE for InnoDB.

So, instead of the bits AALLLLCxxxxxxx we would have
AALLLLC0xxxxxxx if the table was created with MariaDB 10.2.2
to 10.2.6. (AA=ATOMIC_WRITES, LLLL=PAGE_COMPRESSION_LEVEL,
C=PAGE_COMPRESSED, xxxxxxx=7 bits that were not moved.)

PAGE_COMPRESSED=NO implies LLLLC=00000. That is not a problem.

If someone created a table in MariaDB 10.2.2 or 10.2.3 with
the attribute ATOMIC_WRITES=OFF (value 2; AA=10) and without
PAGE_COMPRESSED=YES or PAGE_COMPRESSION_LEVEL, the table should be
rejected. We ignore this problem, because it should be unlikely
for anyone to specify ATOMIC_WRITES=OFF, and because 10.2.2 and
10.2.2 were not mature releases. The value ATOMIC_WRITES=ON (1)
would be interpreted as ATOMIC_WRITES=OFF, but starting with
MariaDB 10.2.4 the ATOMIC_WRITES attribute is ignored.

PAGE_COMPRESSED=YES implies that PAGE_COMPRESSION_LEVEL be between
1 and 9 and that ROW_FORMAT be COMPACT or DYNAMIC. Thus, the affected
wrong bit pattern in SYS_TABLES.TYPE is of the form AALLLL10DB00001
where D signals the presence of a DATA DIRECTORY attribute and B is 1
for ROW_FORMAT=DYNAMIC and 0 for ROW_FORMAT=COMPACT. We must interpret
this bit pattern as AALLLL1DB00001 (discarding the extraneous 0 bit).

dict_sys_tables_rec_read(): Adjust the affected bit pattern when
reading the SYS_TABLES.TYPE column. In case of invalid flags,
report both SYS_TABLES.TYPE (after possible adjustment) and
SYS_TABLES.MIX_LEN.

dict_load_table_one(): Replace an unreachable condition on
!dict_tf2_is_valid() with a debug assertion. The flags will already
have been validated by dict_sys_tables_rec_read(); if that validation
fails, dict_load_table_low() will have failed.

fil_ibd_create(): Shorten an error message about a file pre-existing.

Datafile::validate_to_dd(): Clarify an error message about tablespace
flags mismatch.

ha_innobase::open(): Remove an unnecessary warning message.

dict_tf_is_valid(): Simplify and stricten the logic. Validate the
values of PAGE_COMPRESSION. Remove error log output; let the callers
handle that.

DICT_TF_BITS: Remove ATOMIC_WRITES, PAGE_ENCRYPTION, PAGE_ENCRYPTION_KEY.
The ATOMIC_WRITES is ignored once the SYS_TABLES.TYPE has been validated;
there is no need to store it in dict_table_t::flags. The PAGE_ENCRYPTION
and PAGE_ENCRYPTION_KEY are unused since MariaDB 10.1.4 (the GA release
was 10.1.8).

DICT_TF_BIT_MASK: Remove (unused).

FSP_FLAGS_MEM_ATOMIC_WRITES: Remove (the flags are never read).

row_import_read_v1(): Display an error if dict_tf_is_valid() fails.

72378a25

Remove some fields from dict_table_t · 58f87a41

Marko Mäkelä authored 7 years ago

dict_table_t::thd: Remove. This was only used by btr_root_block_get()
for reporting decryption failures, and it was only assigned by
ha_innobase::open(), and never cleared. This could mean that if a
connection is closed, the pointer would become stale, and the server
could crash while trying to report the error. It could also mean
that an error is being reported to the wrong client. It is better
to use current_thd in this case, even though it could mean that if
the code is invoked from an InnoDB background operation, there would
be no connection to which to send the error message.

Remove dict_table_t::crypt_data and dict_table_t::page_0_read.
These fields were never read.

fil_open_single_table_tablespace(): Remove the parameter "table".

58f87a41

09 Jun, 2017 1 commit

MDEV-12610: MariaDB start is slow · 58c56dd7

Jan Lindström authored 7 years ago

Problem appears to be that the function fsp_flags_try_adjust()
is being unconditionally invoked on every .ibd file on startup.
Based on performance investigation also the top function
fsp_header_get_crypt_offset() needs to addressed.

Ported implementation of fsp_header_get_encryption_offset()
function from 10.2 to fsp_header_get_crypt_offset().

Introduced a new function fil_crypt_read_crypt_data()
to read page 0 if it is not yet read.

fil_crypt_find_space_to_rotate(): Now that page 0 for every .ibd
file is not read on startup we need to check has page 0 read
from space that we investigate for key rotation, if it is not read
we read it.

fil_space_crypt_get_status(): Now that page 0 for every .ibd
file is not read on startup here also we need to read page 0
if it is not yet read it. This is needed
as tests use IS query to wait until background encryption
or decryption has finished and this function is used to
produce results.

fil_crypt_thread(): Add is_stopping condition for tablespace
so that we do not rotate pages if usage of tablespace should
be stopped. This was needed for failure seen on regression
testing.

fil_space_create: Remove page_0_crypt_read and extra
unnecessary info output.

fil_open_single_table_tablespace(): We call fsp_flags_try_adjust
only when when no errors has happened and server was not started
on read only mode and tablespace validation was requested or
flags contain other table options except low order bits to
FSP_FLAGS_POS_PAGE_SSIZE position.

fil_space_t::page_0_crypt_read removed.

Added test case innodb-first-page-read to test startup when
encryption is on and when encryption is off to check that not
for all tables page 0 is read on startup.

58c56dd7

08 Jun, 2017 1 commit

Cleanup of MDEV-12600: crash during install_db with innodb_page_size=32K and ibdata1=3M · fbeb9489

Marko Mäkelä authored 7 years ago

The doublewrite buffer pages must fit in the first InnoDB system
tablespace data file. The checks that were added in the initial patch
(commit 112b21da)
were at too high level and did not cover all cases.

innodb.log_data_file_size: Test all innodb_page_size combinations.

fsp_header_init(): Never return an error. Move the change buffer creation
to the only caller that needs to do it.

btr_create(): Clean up the logic. Remove the error log messages.

buf_dblwr_create(): Try to return an error on non-fatal failure.
Check that the first data file is big enough for creating the
doublewrite buffers.

buf_dblwr_process(): Check if the doublewrite buffer is available.
Display the message only if it is available.

recv_recovery_from_checkpoint_start_func(): Remove a redundant message
about FIL_PAGE_FILE_FLUSH_LSN mismatch when crash recovery has already
been initiated.

fil_report_invalid_page_access(): Simplify the message.

fseg_create_general(): Do not emit messages to the error log.

innobase_init(): Revert the changes.

trx_rseg_create(): Refactor (no functional change).

fbeb9489

01 Jun, 2017 2 commits

MDEV-12600: crash during install_db with innodb_page_size=32K and ibdata1=3M; · 112b21da

Jan Lindström authored 7 years ago

Problem was that all doublewrite buffer pages must fit to first
system datafile.

Ported commit 27a34df7882b1f8ed283f22bf83e8bfc523cbfde
Author: Shaohua Wang <shaohua.wang@oracle.com>
Date:   Wed Aug 12 15:55:19 2015 +0800

    BUG#21551464 - SEGFAULT WHILE INITIALIZING DATABASE WHEN
    INNODB_DATA_FILE SIZE IS SMALL

To 10.1 (with extended error printout).

btr_create(): If ibuf header page allocation fails report error and
return FIL_NULL. Similarly if root page allocation fails return a error.

dict_build_table_def_step: If fsp_header_init fails return
error code.

fsp_header_init: returns true if header initialization succeeds
and false if not.

fseg_create_general: report error if segment or page allocation fails.

innobase_init: If first datafile is smaller than 3M and could not
contain all doublewrite buffer pages report error and fail to
initialize InnoDB plugin.

row_truncate_table_for_mysql: report error if fsp header init
fails.

srv_init_abort: New function to report database initialization errors.

srv_undo_tablespaces_init, innobase_start_or_create_for_mysql: If
database initialization fails report error and abort.

trx_rseg_create: If segment header creation fails return.

112b21da

MDEV-12113: install_db shows corruption for rest encryption with innodb_data_file_path=ibdata1:3M; · 1af8bf39

Jan Lindström authored 7 years ago

Problem was that FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION field that for
encrypted pages even in system datafiles should contain key_version
except very first page (0:0) is after encryption overwritten with
flush lsn.

Ported WL#7990 Repurpose FIL_PAGE_FLUSH_LSN to 10.1
The field FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION is consulted during
InnoDB startup.

At startup, InnoDB reads the FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION
from the first page of each file in the InnoDB system tablespace.
If there are multiple files, the minimum and maximum LSN can differ.
These numbers are passed to InnoDB startup.

Having the number in other files than the first file of the InnoDB
system tablespace is not providing much additional value. It is
conflicting with other use of the field, such as on InnoDB R-tree
index pages and encryption key_version.

This worklog will stop writing FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION to
other files than the first file of the InnoDB system tablespace
(page number 0:0) when system tablespace is encrypted. If tablespace
is not encrypted we continue writing FIL_PAGE_FLUSH_LSN_OR_KEY_VERSION
to all first pages of system tablespace to avoid unnecessary
warnings on downgrade.

open_or_create_data_files(): pass only one flushed_lsn parameter

xb_load_tablespaces(): pass only one flushed_lsn parameter.

buf_page_create(): Improve comment about where
FIL_PAGE_FIL_FLUSH_LSN_OR_KEY_VERSION is set.

fil_write_flushed_lsn(): A new function, merged from
fil_write_lsn_and_arch_no_to_file() and
fil_write_flushed_lsn_to_data_files().
Only write to the first page of the system tablespace (page 0:0)
if tablespace is encrypted, or write all first pages of system
tablespace and invoke fil_flush_file_spaces(FIL_TYPE_TABLESPACE)
afterwards.

fil_read_first_page(): read flush_lsn and crypt_data only from
first datafile.

fil_open_single_table_tablespace(): Remove output of LSN, because it
was only valid for the system tablespace and the undo tablespaces, not
user tablespaces.

fil_validate_single_table_tablespace(): Remove output of LSN.

checkpoint_now_set(): Use fil_write_flushed_lsn and output
a error if operation fails.

Remove lsn variable from fsp_open_info.

recv_recovery_from_checkpoint_start(): Remove unnecessary second
flush_lsn parameter.

log_empty_and_mark_files_at_shutdown(): Use fil_writte_flushed_lsn
and output error if it fails.

open_or_create_data_files(): Pass only one flushed_lsn variable.

1af8bf39

29 May, 2017 1 commit
- MDEV-11623 merge fix: Use the correct flags in an error message · 22e5e64c
  Marko Mäkelä authored 7 years ago
  
  22e5e64c
20 May, 2017 1 commit

MDEV-12615: InnoDB page compression method snappy mostly does not compress pages · 90c52e52

Jan Lindström authored 7 years ago

Snappy compression method require that output buffer
used for compression is bigger than input buffer.
Similarly lzo require additional work memory buffer.
Increase the allocated buffer accordingly.

buf_tmp_buffer_t: removed unnecessary lzo_mem, crypt_buf_free and
comp_buf_free.

buf_pool_reserve_tmp_slot: use alligned_alloc and if snappy
available allocate size based on snappy_max_compressed_length and
if lzo is available increase buffer by LZO1X_1_15_MEM_COMPRESS.

fil_compress_page: Remove unneeded lzo mem (we use same buffer)
and if output buffer is not yet allocated allocate based similarly
as above.

Decompression does not require additional work area.

    Modify test to use same test as other compression method tests.

90c52e52

17 May, 2017 3 commits

fil_create_new_single_table_tablespace(): Correct a bogus nonnull attribute · e22d86a3
Marko Mäkelä authored 7 years ago
```
The parameter path can be passed as NULL.
This error was reported by GCC 7.1.0 when compiling
CMAKE_BUILD_TYPE=Debug with -O3.
```
e22d86a3

Remove redundant UT_LIST_INIT() calls · 956d2540

Marko Mäkelä authored 7 years ago

The macro UT_LIST_INIT() zero-initializes the UT_LIST_NODE.
There is no need to call this macro on a buffer that has
already been zero-initialized by mem_zalloc() or mem_heap_zalloc()
or similar.

For some reason, the statement UT_LIST_INIT(srv_sys->tasks) in
srv_init() caused a SIGSEGV on server startup when compiling with
GCC 7.1.0 for AMD64 using -O3. The zero-initialization was attempted
by the instruction movaps %xmm0,0x50(%rax), while the proper offset
of srv_sys->tasks would seem to have been 0x48.

956d2540

Make some variables const in fil_iterate() · febe8819

Marko Mäkelä authored 7 years ago

This is a non-functional change to make it slightly easier
to read the code. We seem to have some bugs in this
IMPORT TABLESPACE code; see MDEV-12396.

febe8819

15 May, 2017 1 commit
- 5.6.36 · 0af98182
  Vicențiu Ciorbaru authored 7 years ago
  
  0af98182
10 May, 2017 1 commit

Fix some integer type mismatch. · 021d6365

Marko Mäkelä authored 7 years ago

Use uint32_t for the encryption key_id.

When filling unsigned integer values into INFORMATION_SCHEMA tables,
use the method Field::store(longlong, bool unsigned)
instead of using Field::store(double).

Fix also some miscellanous type mismatch related to ulint (size_t).

021d6365

09 May, 2017 1 commit

MDEV-12750 Fix crash recovery of key rotation · 588a6a18

Marko Mäkelä authored 7 years ago

When MySQL 5.7.9 was merged to MariaDB 10.2.2, an important
debug assertion was omitted from mlog_write_initial_log_record_low().

mlog_write_initial_log_record_low(): Put back the assertion
mtr_t::is_named_space().

fil_crypt_start_encrypting_space(), fil_crypt_rotate_page():
Call mtr_t::set_named_space() before modifying any pages.

fsp_flags_try_adjust(): Call mtr_t::set_named_space(). This additional
breakage was introduced in the merge of MDEV-11623 from 10.1. It was
not caught because of the missing debug assertion in
mlog_write_initial_log_record_low().

Remove some suppressions from the encryption.innodb-redo-badkey test.

588a6a18

06 May, 2017 2 commits

MDEV-11520 Properly retry posix_fallocate() on EINTR · 6c91be54

Marko Mäkelä authored 7 years ago

We only want to retry posix_fallocate() on EINTR as long as the system
is not being shut down. We do not want to retry on any other (hard) error.

Thanks to Jocelyn Fournier for quickly noticing the mistake in my
previous commit.

6c91be54

MDEV-11520 post-fix: Retry posix_fallocate() on EINTR in fil_ibd_create() · baad0f34
Marko Mäkelä authored 7 years ago
```
Earlier versions of MariaDB only use posix_fallocate() when extending
data files, not when initially creating the files,
```
baad0f34

28 Apr, 2017 2 commits

MDEV-12602 InnoDB: Failing assertion: space->n_pending_ops == 0 · b82c602d

Marko Mäkelä authored 7 years ago

This fixes a regression caused by MDEV-12428.
When we introduced a variant of fil_space_acquire() that could
increment space->n_pending_ops after space->stop_new_ops was set,
the logic of fil_check_pending_operations() was broken.

fil_space_t::n_pending_ios: A new field to track read or write
access from the buffer pool routines immediately before a block
write or after a block read in the file system.

fil_space_acquire_for_io(), fil_space_release_for_io(): Similar
to fil_space_acquire_silent() and fil_space_release(), but
modify fil_space_t::n_pending_ios instead of fil_space_t::n_pending_ops.

Adjust a number of places accordingly, and remove some redundant
tablespace lookups.

The following parts of this fix differ from the 10.2 version of this fix:

buf_page_get_corrupt(): Add a tablespace parameter.

In 10.2, we already had a two-phase process of freeing fil_space objects
(first, fil_space_detach(), then release fil_system->mutex, and finally
free the fil_space and fil_node objects).

fil_space_free_and_mutex_exit(): Renamed from fil_space_free().
Detach the tablespace from the fil_system cache, release the
fil_system->mutex, and then wait for space->n_pending_ios to reach 0,
to avoid accessing freed data in a concurrent thread.
During the wait, future calls to fil_space_acquire_for_io() will
not find this tablespace, and the count can only be decremented to 0,
at which point it is safe to free the objects.

fil_node_free_part1(), fil_node_free_part2(): Refactored from
fil_node_free().

b82c602d

MDEV-12602 InnoDB: Failing assertion: space->n_pending_ops == 0 · 4b24467f

Marko Mäkelä authored 7 years ago

This fixes a regression caused by MDEV-12428.
When we introduced a variant of fil_space_acquire() that could
increment space->n_pending_ops after space->stop_new_ops was set,
the logic of fil_check_pending_operations() was broken.

fil_space_t::n_pending_ios: A new field to track read or write
access from the buffer pool routines immediately before a block
write or after a block read in the file system.

fil_space_acquire_for_io(), fil_space_release_for_io(): Similar
to fil_space_acquire_silent() and fil_space_release(), but
modify fil_space_t::n_pending_ios instead of fil_space_t::n_pending_ops.

fil_space_free_low(): Wait for space->n_pending_ios to reach 0,
to avoid accessing freed data in a concurrent thread. Future
calls to fil_space_acquire_for_io() will not find this tablespace,
because it will already have been detached from fil_system.

Adjust a number of places accordingly, and remove some redundant
tablespace lookups.

FIXME: buf_page_check_corrupt() should take a tablespace from
fil_space_acquire_for_io() as a parameter. This will be done
in the 10.1 version of this patch and merged from there.
That depends on MDEV-12253, which has not been merged from 10.1 yet.

4b24467f

26 Apr, 2017 2 commits

Bug #24793413 LOG PARSING BUFFER OVERFLOW · 2ef1baa7

Thirunarayanan Balathandayuthapani authored 8 years ago

Problem:
========
During checkpoint, we are writing all MLOG_FILE_NAME records in one mtr
and parse buffer can't be processed till MLOG_MULTI_REC_END. Eventually parse
buffer exceeds the RECV_PARSING_BUF_SIZE and eventually it overflows.

Fix:
===
1) Break the large mtr if it exceeds LOG_CHECKPOINT_FREE_PER_THREAD into multiple mtr during checkpoint.
2) Move the parsing buffer if we are encountering only MLOG_FILE_NAME
records. So that it will never exceed the RECV_PARSING_BUF_SIZE.
Reviewed-by: Debarun Bannerjee <debarun.bannerjee@oracle.com>
Reviewed-by: Rahul M Malik <rahul.m.malik@oracle.com>
RB: 14743

2ef1baa7

MariaDB adjustments for Oracle Bug#23070734 fix · 849af74a

Marko Mäkelä authored 7 years ago

Split the test case so that a server restart is not needed.
Reduce the test cases and use a simpler mechanism for triggering
and waiting for purge.

fil_table_accessible(): Check if a table can be accessed without
enjoying MDL protection.

849af74a