Commits · ffb38c9771bc7248482dc6a3d2f26ca10fe09a9b · nexedi / MariaDB

04 Jan, 2017 4 commits

MDEV-8139 Fix scrubbing tests · ffb38c97

Marko Mäkelä authored Jan 04, 2017

encryption.innodb_scrub: Clean up. Make it also cover ROW_FORMAT=COMPRESSED,
removing the need for encryption.innodb_scrub_compressed.
Add a FIXME comment saying that we should create a secondary index, to
demonstrate that undo log pages also get scrubbed. Currently that is
not working!

Also clean up encryption.innodb_scrub_background, but keep it disabled,
because the background scrubbing does not work reliably.

Fix both tests so that if something is not scrubbed, the test will be
aborted, so that the data files will be preserved. Allow the tests to
run on Windows as well.

MDEV-11638 Encryption causes race conditions in InnoDB shutdown · 719321e7

Marko Mäkelä authored Jan 04, 2017

InnoDB shutdown failed to properly take fil_crypt_thread() into account.
The encryption threads were signalled to shut down together with other
non-critical tasks. This could be much too early in case of slow shutdown,
which could need minutes to complete the purge. Furthermore, InnoDB
failed to wait for the fil_crypt_thread() to actually exit before
proceeding to the final steps of shutdown, causing the race conditions.

Furthermore, log_scrub_thread() was shut down far too early.
It, too, should remain alive until SRV_SHUTDOWN_FLUSH_PHASE.

fil_crypt_threads_end(): Remove. This would cause the threads to
be terminated way too early.

srv_buf_dump_thread_active, srv_dict_stats_thread_active,
lock_sys->timeout_thread_active, log_scrub_thread_active,
srv_monitor_active, srv_error_monitor_active: Remove a race condition
between startup and shutdown, by setting these in the startup thread
that creates threads, not in each created thread. In this way, once the
flag is cleared, it will remain cleared during shutdown.
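
A minimal sketch of the flag handling described above, assuming a single worker thread. The names worker_active, shutdown_requested, start_worker() and stop_worker() are hypothetical stand-ins, not InnoDB identifiers. The point is that the creating thread sets the flag before the worker exists, and the worker only ever clears it:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for a flag such as srv_monitor_active.
std::atomic<bool> worker_active{false};
std::atomic<bool> shutdown_requested{false};

// The worker never sets the flag; it only clears it when it exits.
void worker() {
  while (!shutdown_requested.load()) {
    std::this_thread::yield();
  }
  worker_active.store(false);
}

// Startup: mark the worker active *before* creating the thread, so that
// a concurrent shutdown can never observe "inactive" for a thread that
// is about to start running.
std::thread start_worker() {
  worker_active.store(true);
  return std::thread(worker);
}

// Shutdown: request the exit and wait until the worker has terminated.
void stop_worker(std::thread& t) {
  shutdown_requested.store(true);
  t.join();
}
```

If the worker set the flag itself, a shutdown that started right after thread creation could read the flag as cleared, proceed, and then race with the worker once it got scheduled.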

srv_n_fil_crypt_threads_started, fil_crypt_threads_event: Declare in
global rather than static scope.

log_scrub_event, srv_log_scrub_thread_active, log_scrub_thread():
Declare in static rather than global scope. Let these be created by
log_init() and freed by log_shutdown().

rotate_thread_t::should_shutdown(): Do not shut down before the
SRV_SHUTDOWN_FLUSH_PHASE.

srv_any_background_threads_are_active(): Remove. These checks now
exist in logs_empty_and_mark_files_at_shutdown().

logs_empty_and_mark_files_at_shutdown(): Shut down the threads in
the proper order. Keep fil_crypt_thread() and log_scrub_thread() alive
until SRV_SHUTDOWN_FLUSH_PHASE, and check that they actually terminate.

Part 1 of MDEV-8139 Fix scrubbing tests · 0f8e17af

Marko Mäkelä authored Jan 04, 2017

Port a bug fix from MySQL 5.7, so that all undo log pages will be freed
during a slow shutdown. We cannot scrub pages that are left allocated.

commit 173e171c6fb55f064eea278c76fbb28e2b1c757b
Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
Date:   Fri Sep 9 18:01:27 2016 +0530

    Bug #24450908   UNDO LOG EXISTS AFTER SLOW SHUTDOWN

    Problem:
    ========

    1) A cached undo segment is not removed from the rollback segment
    history (RSEG_HISTORY) during slow shutdown. In other words, if the
    segment is not completely free, we fail to remove its entry from the
    history list. While starting the server, we traverse the history lists
    of all rollback segment slots and add them to the purge queue as undo
    logs to be purged.
    In that case, the purge queue will never be empty after slow shutdown.

    2) Freeing of undo log segment is linked with removing undo log header
    from history.

    Fix:
    ====
    1) Use separate logic for removing the undo log header from the
    history list of the rollback segment slots, and remove it from the
    rollback segment history even if the segment is not completely free.

    Reviewed-by: Debarun Banerjee <debarun.banerjee@oracle.com>
    Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
    RB:13672

Merge 10.0 into 10.1 · 0c1de94d
Marko Mäkelä authored Jan 04, 2017

03 Jan, 2017 5 commits

MDEV-11694 InnoDB tries to create unused table SYS_ZIP_DICT · 80d5d145

Marko Mäkelä authored Jan 03, 2017

MariaDB Server 10.0.28 and 10.1.19 merged code from Percona XtraDB
that introduced support for compressed columns. Much but not all
of this code was disabled by placing #ifdef HAVE_PERCONA_COMPRESSED_COLUMNS
around it.

Among the unused but not disabled code is code to access
some new system tables related to compressed columns.

The creation of these system tables SYS_ZIP_DICT and SYS_ZIP_DICT_COLS
would cause a crash in --innodb-read-only mode when upgrading
from an earlier version to 10.0.28 or 10.1.19.

Let us remove all the dead code related to compressed columns.
Users who already upgraded to 10.0.28 and 10.1.19 will have the two
above mentioned empty tables in their InnoDB system tablespace.
Subsequent versions of MariaDB Server will completely ignore those tables.

Post-fix for MDEV-11688 fil_crypt_threads_end() tries to create threads · ba8198a3
Marko Mäkelä authored Jan 03, 2017
```
fil_crypt_threads_cleanup(): Do nothing if nothing was initialized.
```

MDEV-11688 fil_crypt_threads_end() tries to create threads · fc779252

Marko Mäkelä authored Jan 03, 2017

after aborted InnoDB startup

This bug was repeatable by starting MariaDB 10.2 with an
invalid option, such as --innodb-flush-method=foo.
It is not repeatable in MariaDB 10.1 in the same way, but the
problem already exists there.

MDEV-7955 WSREP() appears on radar in OLTP RO · b4616c40

Sachin Setiya authored Jan 03, 2017

This commit optimizes the WSREP(thd) macro.

#define WSREP(thd) \
  (WSREP_ON && wsrep && (thd && thd->variables.wsrep_on))

Here we can safely remove the checks of wsrep and thd. We are not removing
WSREP_ON, because that would change the behaviour of WSREP(thd).
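
A sketch of what the simplification could look like, under stated assumptions. The types below are heavily simplified stand-ins for the server's THD and system variables, and the names WSREP_OLD and WSREP_NEW are hypothetical; the commit message does not quote the resulting macro, so the second form is an assumption derived from the description:

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for the server types.
struct system_variables { bool wsrep_on = false; };
struct THD { system_variables variables; };

bool WSREP_ON = true;      // stand-in for the global switch
void* wsrep = &WSREP_ON;   // stand-in for the provider handle (non-null)

// Original form, as quoted in the commit message.
#define WSREP_OLD(thd) \
  (WSREP_ON && wsrep && ((thd) && (thd)->variables.wsrep_on))

// Assumed simplified form: the wsrep and thd checks are dropped.
// It relies on callers always passing a non-null thd.
#define WSREP_NEW(thd) \
  (WSREP_ON && (thd)->variables.wsrep_on)
```

Both forms agree whenever thd is non-null and the provider handle is set, which is the precondition this sketch assumes.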

Patch Credit:- Nirbhay Choubey, Sergey Vojtovich

MDEV-11016 wsrep_node_is_ready() check is too strict · d9a1a201

Sachin Setiya authored Jan 03, 2017

Problem:-
  The condition that checks for node readiness is too strict, as it does
  not allow SELECTs even if they do not access any tables.
    For example, if we run
       SELECT 1;
    or
       SELECT @@max_allowed_packet;
Solution:-
  We need not report this error when all_tables (lex->query_tables)
  is NULL.
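
A minimal sketch of the relaxed rule, not the server code. TABLE_LIST is reduced to an empty stand-in, and Command and must_reject() are hypothetical names introduced only for illustration; all_tables models lex->query_tables:

```cpp
#include <cassert>

// Hypothetical, simplified model of the decision described above.
struct TABLE_LIST {};  // stand-in for the server's table list node

enum class Command { Select, Other };

// Returns true when the statement must be rejected with a
// "node not ready" error.
bool must_reject(bool node_ready, Command cmd, const TABLE_LIST* all_tables) {
  if (node_ready) return false;
  // Relaxed rule: a SELECT that touches no tables is allowed.
  if (cmd == Command::Select && all_tables == nullptr) return false;
  return true;
}
```

With this rule, SELECT 1 passes on a node that is not ready, while a SELECT that reads a table is still rejected.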

01 Jan, 2017 3 commits

MDEV-10100 main.pool_of_threads fails sporadically in buildbot · 3871477c

Elena Stepanova authored Jan 01, 2017

The patch fixes two test failures:
- on slow builders, sometimes a connection attempt which should
  fail because thread_pool_max_threads has been exceeded
  actually succeeds;
- on even slower builders, MTR sometimes cannot establish the
  initial connection, and check-testcase fails before the
  test starts.

The problem with check-testcase was caused by connect-timeout=2,
which was set for all clients in the test config file. On slow
builders it might not be enough.
There is no way to override it for the pre-test check, so it needed
to be substantially increased or removed.

The other problem was caused by a race condition between sleeps
that the test performs in existing connections and the connect
timeout for the connection attempt which was expected to fail.
If sleeps finished before the connect-timeout was exceeded, it
would allow the connection to succeed.

To solve each problem without making the other one worse,
connect-timeout should be configured dynamically during the test.
Due to the nature of the test (all connections must be busy
at the moment when we need to change the timeout, and cannot execute
SET GLOBAL ...), it needs to be done independently from the server.

The solution:
- recognize 'connect_timeout' as a connection option in mysqltest's
  "connect" command;
- remove connect-timeout from the test configuration file;
- use the new connect_timeout option for those connections which
  are expected to fail;
- re-arrange the test flow to allow running a huge SLEEP
  without affecting the test execution time (because it would be
  interrupted after the main test flow is finished).

The test is still subject to false negatives, e.g. if the connection
fails due to a timeout rather than due to the exceeded number of
allowed threads, or if the connection on the extra port succeeds due
to a race condition and not because of the special logic for the extra
port. But those false negatives have always been possible there
on slow builders. They should not be critical, because faster builders
should catch such failures if they appear.

MDEV-11636 Extra persistent columns on slave always gets NULL in RBR · d02a77bc

Sachin Setiya authored Dec 27, 2016

Problem:- In replication, if the slave has extra persistent columns, then
these columns are not computed while applying the write-set from the master.

Solution:- While applying row events from the server, we will generate values
for the extra persistent columns.

MDEV-11636 Extra persistent columns on slave always gets NULL in RBR · 2f5670dc

Sachin Setiya authored Dec 27, 2016

Problem:- In replication, if the slave has extra persistent columns, then
these columns are not computed while applying the write-set from the master.

Solution:- While applying row events from the server, we will generate values
for the extra persistent columns.

30 Dec, 2016 2 commits

MDEV-11556 InnoDB redo log apply fails to adjust data file sizes · 8451e090

Marko Mäkelä authored Dec 28, 2016

fil_space_t::recv_size: New member: recovered tablespace size in pages;
0 if no size change was read from the redo log,
or if the size change was implemented.

fil_space_set_recv_size(): New function for setting space->recv_size.

innodb_data_file_size_debug: A debug parameter for setting the system
tablespace size in recovery even when the redo log does not contain
any size changes. It is hard to write a small test case that would
cause the system tablespace to be extended at the critical moment.

recv_parse_log_rec(): Note those tablespaces whose size is being changed
by the redo log, by invoking fil_space_set_recv_size().

innobase_init(): Correct an error message, and do not require a larger
innodb_buffer_pool_size when starting up with a smaller innodb_page_size.

innobase_start_or_create_for_mysql(): Allow startup with any initial
size of the ibdata1 file if the autoextend attribute is set. Require
the minimum size of fixed-size system tablespaces to be 640 pages,
not 10 megabytes. Implement innodb_data_file_size_debug.

open_or_create_data_files(): Round the system tablespace size down
to pages, not to full megabytes. (Our test truncates the system
tablespace to more than 800 pages with innodb_page_size=4k.
InnoDB should not imagine that it was truncated to 768 pages
and then overwrite good pages in the tablespace.)
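
The arithmetic behind the 800 versus 768 pages can be checked with a small sketch. The two helper functions are hypothetical and only model the two rounding rules; they are not taken from the InnoDB source:

```cpp
#include <cassert>
#include <cstdint>

// File size rounded down to whole pages, expressed in pages.
std::uint64_t pages_rounded_to_pages(std::uint64_t bytes,
                                     std::uint64_t page_size) {
  return bytes / page_size;
}

// File size rounded down to whole MiB first, then expressed in pages.
std::uint64_t pages_rounded_to_mib(std::uint64_t bytes,
                                   std::uint64_t page_size) {
  const std::uint64_t mib = 1024 * 1024;
  return (bytes / mib) * mib / page_size;
}
```

For a file of 800 pages of 4 KiB (3,276,800 bytes), rounding to pages keeps all 800 pages, while rounding to full megabytes yields 3 MiB, that is 768 pages, so 32 valid pages would be treated as lying beyond the end of the file. Any size from 768 to 1023 such pages collapses to 768 under the old rule. The same arithmetic shows why a minimum of 640 pages is not the same as 10 megabytes: 640 pages equal 10 MiB only with innodb_page_size=16k, and 2.5 MiB with 4k.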

fil_flush_low(): Refactored from fil_flush().

fil_space_extend_must_retry(): Refactored from
fil_extend_space_to_desired_size().

fil_mutex_enter_and_prepare_for_io(): Extend the tablespace if
fil_space_set_recv_size() was called.
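
How recv_size might tie recovery and I/O together, as a rough sketch only. Space, set_recv_size() and prepare_for_io() are hypothetical simplifications of fil_space_t, fil_space_set_recv_size() and fil_mutex_enter_and_prepare_for_io(); the real functions also deal with mutexes and file handles, which are omitted here:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a tablespace.
struct Space {
  std::uint32_t size = 0;       // current size in pages
  std::uint32_t recv_size = 0;  // recovered size in pages; 0 = nothing pending
};

// Remember a tablespace size that was seen in the redo log.
void set_recv_size(Space& s, std::uint32_t pages) { s.recv_size = pages; }

// Before any I/O, apply a pending size change exactly once.
void prepare_for_io(Space& s) {
  if (s.recv_size != 0) {
    if (s.recv_size > s.size) s.size = s.recv_size;  // extend only
    s.recv_size = 0;  // the size change has been implemented
  }
}
```

Resetting recv_size to 0 after the extension matches the description of the new member: 0 means either that no size change was read or that it has already been applied.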

The test case has been successfully run with all the
innodb_page_size values 4k, 8k, 16k, 32k, 64k.

Make the test work with any innodb_page_size. · f493e395
Marko Mäkelä authored Dec 29, 2016

28 Dec, 2016 3 commits

MDEV-11584: GRANT inside an SP does not work well on 2nd execution · 23cc1be2
Oleksandr Byelkin authored Dec 21, 2016
```
Allocate password hash in statement memory
```

MDEV-11656: 'Data structure corruption' IMPORT TABLESPACE doesn't work for... · 283e9cf4

Jan Lindström authored Dec 28, 2016

MDEV-11656: 'Data structure corruption' IMPORT TABLESPACE doesn't work for encrypted InnoDB tables if space_id changed

The problem was that for encryption we use a temporary scratch area for
reading and writing tablespace pages. But if a page was not really
decrypted, the correctly updated page was not moved to the scratch area
that was then written. This can happen e.g. for page 0, as it is
never encrypted even if encryption is enabled. As we wrote
the contents of the old page 0 to the tablespace, it naturally contained
an incorrect space_id, which was later noticed, and an error message
was written. The updated page with the correct space_id was lost.

If the tablespace is encrypted, we use an additional
temporary scratch area where pages are read
for decrypting: readptr == crypt_io_buffer != io_buffer.

The destination for decryption is a buffer pool block,
block->frame == dst == io_buffer, which is updated.
Pages that did not require decryption even when the
tablespace is marked as encrypted are not copied;
instead, block->frame is set to src == readptr.

If the tablespace was encrypted, we copy the updated page to
writeptr != io_buffer. This fixes the above bug.

For encryption we again use a temporary scratch area,
writeptr != io_buffer == dst,
which is then written to the tablespace.

(1) For normal tables src == dst == writeptr
ut_ad(!encrypted && !page_compressed ?
	src == dst && dst == writeptr + (i * size) : 1);
(2) For page compressed tables src == dst == writeptr
ut_ad(page_compressed && !encrypted ?
	src == dst && dst == writeptr + (i * size) : 1);
(3) For encrypted tables src != dst != writeptr
ut_ad(encrypted ?
	src != dst && dst != writeptr + (i * size) : 1);

MDEV-9282 Debian: the Lintian complains about "shlib-calls-exit" in ha_innodb.so · d50cf42b

Marko Mäkelä authored Dec 28, 2016

Replace all exit() calls in InnoDB with abort() [possibly via ut_a()].
Calling exit() in a multi-threaded program is also problematic
because other threads could see corrupted data structures
while some data structures are being cleaned up by atexit() handlers
or similar.

In the long term, all these calls should be replaced with something
that returns an error all the way up the call stack.
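
The long-term direction could look roughly like the following sketch, in which a low-level failure is returned to the caller instead of terminating the process. Err, read_page() and open_table() are hypothetical names and do not correspond to InnoDB functions:

```cpp
#include <cassert>

// Hypothetical error code type for illustration.
enum class Err { Ok, Corrupt };

// A low-level routine reports the problem instead of calling exit().
Err read_page(bool page_is_valid) {
  return page_is_valid ? Err::Ok : Err::Corrupt;
}

// Each caller passes the error further up the call stack.
Err open_table(bool page_is_valid) {
  const Err e = read_page(page_is_valid);
  if (e != Err::Ok) return e;  // propagate instead of exit() or abort()
  return Err::Ok;
}
```

The outermost caller can then decide how to report the failure, and no thread is torn down while others still use shared data structures.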

27 Dec, 2016 2 commits

Replication tests fail on valgrind due to waiting-related timeouts · dc9f5dfc

Elena Stepanova authored Dec 27, 2016

MTR raises the default wait_for_pos_timeout from 300 to 1500 when tests
are run with valgrind. The same needs to be done for other
replication-related waits.

Disable the test for valgrind builds · 37f294fe

Elena Stepanova authored Dec 27, 2016

The test is very slow with valgrind, and pointless because it was
originally about a race condition that is hardly achievable
with valgrind.

22 Dec, 2016 3 commits

Remove an unnecessary comparison. · 545c9126
Marko Mäkelä authored Dec 22, 2016

MDEV-11630 Call mutex_free() before freeing the mutex list · 7e02fd1f

Marko Mäkelä authored Dec 22, 2016

Make some global fil_crypt_ variables static.

fil_close(): Call mutex_free(&fil_system->mutex) also in InnoDB, not
only in XtraDB. In InnoDB, sync_close() was called before fil_close().

innobase_shutdown_for_mysql(): Call fil_close() before sync_close(),
similar to XtraDB shutdown.

fil_space_crypt_cleanup(): Call mutex_free() to pair with
fil_space_crypt_init().

fil_crypt_threads_cleanup(): Call mutex_free() to pair with
fil_crypt_threads_init().

MDEV-11218: encryption.innodb_encryption_discard_import failed in buildbot · 55eb7120
Jan Lindström authored Dec 22, 2016
```
Try to stabilize test cases. These tests behave badly when run in a certain order.
```

21 Dec, 2016 3 commits

Fixed compiler warning · c51c885d
Monty authored Dec 21, 2016

MDEV-7558 analyze_stmt_slow_query_log fails sporadically in buildbot · c33c638f

Monty authored Dec 21, 2016

The reason was that the test was reusing the same log file without deleting it between tests.
Fixed by creating a new log file as part of the test.

MDEV-11490 Galera_3nodes test suite does not suppress Warnings. · 9e032d61

Sachin Setiya authored Dec 21, 2016

Problem:- While running individual tests of Galera_3nodes,
we get warnings like '[Warning] WSREP: Could not open state file
for reading: '. Because of this, individual tests fail.

Solution:- We change suite.pm of Galera_3nodes to suppress these warnings.

20 Dec, 2016 2 commits

Fix failing galera tests. · 75ab65ae
Nirbhay Choubey authored Dec 20, 2016

Port the test innodb.doublewrite from MySQL 5.7. · 195241e1
Marko Mäkelä authored Dec 20, 2016

19 Dec, 2016 2 commits

Merge branch '10.0' into 10.1 · 44da95e5
Marko Mäkelä authored Dec 19, 2016

MDEV-11602 InnoDB leaks foreign key metadata on DDL operations · 9f863a15

Marko Mäkelä authored Dec 19, 2016

Essentially revert MDEV-6759, which addressed a double free of memory
by removing the freeing altogether, introducing the memory leaks.
No double free was observed when running the test suite -DWITH_ASAN.

Replace some mem_heap_free(foreign->heap) with dict_foreign_free(foreign)
so that the calls can be located and instrumented more easily when needed.

15 Dec, 2016 1 commit

bump the VERSION · 8e198336
Daniel Bartholomew authored Dec 15, 2016

14 Dec, 2016 5 commits

Fix broken cmake -DBUILD_CONFIG=mysql_release on Windows. · c13b5011

Vladislav Vaintroub authored Dec 14, 2016

mysql_release.cmake set WITH_JEMALLOC=static, which makes Windows
builds fail, since there is no jemalloc, either static or shared, there.

MDEV-11479 Improved wsrep_dirty_reads · d93bbcad
Sachin Setiya authored Dec 14, 2016
```
Updated sysvars_wsrep.result file.
```

MDEV-11060 sql/protocol.cc:532: void Protocol::end_statement(): Assertion `0' failed · f41bd7e5

Varun Gupta authored Dec 13, 2016

In file sql/opt_range.cc, when calculate_cond_selectivity_for_table() is called with optimizer_use_condition_selectivity=4:
	- thd->no_errors is set to 1
	- the original value of thd->no_errors is not restored
	- this causes the assertion to fail in subsequent queries

Fixed by restoring the original value of thd->no_errors.
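
One way to make such a restore hard to forget is a scope guard. This is a sketch under assumptions, not the actual patch, which may simply save and restore the value by hand. THD is reduced to a one-field stand-in, and No_errors_guard and flag_during_estimate() are hypothetical names:

```cpp
#include <cassert>

struct THD { bool no_errors = false; };  // hypothetical, minimal stand-in

// Sets thd->no_errors for the current scope and restores the
// previous value on every exit path.
class No_errors_guard {
 public:
  explicit No_errors_guard(THD* thd) : thd_(thd), saved_(thd->no_errors) {
    thd_->no_errors = true;
  }
  ~No_errors_guard() { thd_->no_errors = saved_; }
  No_errors_guard(const No_errors_guard&) = delete;
  No_errors_guard& operator=(const No_errors_guard&) = delete;

 private:
  THD* thd_;
  bool saved_;
};

// Stand-in for the selectivity estimate: reports the flag seen inside.
bool flag_during_estimate(THD* thd) {
  No_errors_guard guard(thd);
  return thd->no_errors;  // errors are suppressed only inside this scope
}
```

After flag_during_estimate() returns, thd->no_errors holds whatever value it had before the call, so later statements report errors normally.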

MDEV-11479 Improved wsrep_dirty_reads · 0c79de24

Sachin Setiya authored Dec 14, 2016

     Tasks:-
         Changes in wsrep_dirty_reads variable
         1.) Global + Session scope (Current: session-only)
         2.) Can be set using command line.
         3.) Allow all commands that do not change data (besides SELECT)
         4.) Allow prepared Statements that do not change data
         5.) Works with wsrep_sync_wait enabled

Revert "MDEV-11016 wsrep_node_is_ready() check is too strict" · 25a9a3da
Sachin Setiya authored Dec 14, 2016
```
This reverts commit 7ed5563b.
```

13 Dec, 2016 1 commit

MDEV-10368: get_latest_version() called too often · 72cc73ce

Jan Lindström authored Dec 13, 2016

Reduce the number of calls to encryption_get_key_get_latest_version
when doing key rotation, using two different methods:

(1) We need to fetch key information when the tablespace does not yet
have encryption information. Invalid keys are now handled
differently (see below). There was an extra call to detect
whether key_id is not found on key rotation.

(2) If key_id is not found from the encryption plugin, do not
try fetching a new key_version for it, as it will fail anyway.
We store the return value from the encryption_get_key_get_latest_version
call, and if it returns ENCRYPTION_KEY_VERSION_INVALID, there
is no need to call it again.
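
Method (2) can be sketched as remembering which key ids the plugin has already reported as invalid. Everything here is an assumption made for illustration: plugin_latest_version() is a fake plugin in which only key 1 exists, KEY_VERSION_INVALID is a local sentinel rather than the server's definition, and the real code may store the result per tablespace instead of in a set:

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical sentinel; the real ENCRYPTION_KEY_VERSION_INVALID is
// defined by the server's encryption service headers.
constexpr unsigned KEY_VERSION_INVALID = ~0U;

int plugin_calls = 0;  // counts how often the fake plugin is asked

// Hypothetical stand-in for the encryption plugin: only key 1 exists.
unsigned plugin_latest_version(unsigned key_id) {
  ++plugin_calls;
  return key_id == 1 ? 5U : KEY_VERSION_INVALID;
}

class Key_version_lookup {
 public:
  unsigned latest(unsigned key_id) {
    // A key already known to be missing is not looked up again.
    if (known_invalid_.count(key_id)) return KEY_VERSION_INVALID;
    const unsigned v = plugin_latest_version(key_id);
    if (v == KEY_VERSION_INVALID) known_invalid_.insert(key_id);
    return v;
  }

 private:
  std::unordered_set<unsigned> known_invalid_;
};
```

Asking twice for a missing key reaches the plugin only once, while valid keys are still fetched each time so that a rotated version is noticed.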

12 Dec, 2016 3 commits

MDEV-10545: Server crashed in my_copy_fix_mb on querying I_S and P_S tables · 67b570af
Nirbhay Choubey authored Dec 05, 2016
```
After applying/replaying the transaction, the memory that
stored the query string was also wrongly freed.
```

MDEV-11179: WSREP transaction exceeded size limit in Galera cluster · 9c88a54c

Nirbhay Choubey authored Dec 05, 2016

... causes MariaDB to crash

On error, the wsrep replication buffer (binlog) is dumped to a file
to aid investigations. In order to also include the binlog header,
an FDLE object is needed as well. This object is only available for
wsrep threads.
Fix: Instantiate an FDLE object for non-wsrep threads.

MDEV-10954: MariaDB Galera: wsrep_sst_common: line 120: which: command not found · dbb06d2e
Nirbhay Choubey authored Nov 21, 2016
```
Add 'which' to REQUIRES list.
```

11 Dec, 2016 1 commit

Updated the list of unstable tests after the merge · 5d9ca522
Elena Stepanova authored Dec 12, 2016