- 14 Dec, 2020 1 commit
-
-
Marko Mäkelä authored
After commit a5a2ef07 (part of MDEV-23855) implemented asynchronous doublewrite, it is possible that the server will hang when the following parameters are in effect:

    innodb_doublewrite=1 (default)
    innodb_write_io_threads=1
    innodb_use_native_aio=0

Note: In commit 5e62b6a5 (MDEV-16264) the logic of os_aio_init() was changed so that it will never fail, but instead automatically disable innodb_use_native_aio (which is enabled by default) if the io_setup() system call would fail due to resource limits being exceeded.

Before commit a5a2ef07, we used a synchronous write for the doublewrite buffer batches, always at most 64 pages at a time. So, upon completing a doublewrite batch, a single thread would submit at most 64 page writes (for the individual pages that were first written to the doublewrite buffer). With that commit, we may submit up to 128 page writes at a time.

The maximum number of outstanding requests per thread is 256. Because the maximum number of asynchronous write submissions per thread was roughly doubled, it is now possible that buf_dblwr_t::flush_buffered_writes_completed() will hang in io_slots::acquire(), called via os_aio() and fil_space_t::io(), when submitting writes of the individual blocks.

We will prevent this type of hang by increasing the minimum number of innodb_write_io_threads from 1 to 2, so that this type of hang would only become possible when 512 outstanding write requests are exceeded.
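The slot arithmetic in this message (256 outstanding requests per write thread, so 256 with one thread and 512 with two) can be illustrated with a toy model. This is only a sketch, not InnoDB code: `SlotPool`, `try_acquire()` and `SLOTS_PER_THREAD` are hypothetical names, and the non-blocking `try_acquire()` stands in for a blocking acquire that would have to wait once the pool is exhausted.

```cpp
#include <cassert>
#include <cstddef>

// Toy model, not InnoDB code. The figure of 256 slots per thread comes
// from the commit message; all names here are hypothetical.
constexpr std::size_t SLOTS_PER_THREAD = 256;

class SlotPool {
public:
  explicit SlotPool(std::size_t n_threads)
    : capacity_(n_threads * SLOTS_PER_THREAD), in_use_(0) {}

  // Non-blocking stand-in for a blocking acquire(): returns false in
  // the situation where a blocking call would have to wait.
  bool try_acquire() {
    if (in_use_ == capacity_) return false;
    ++in_use_;
    return true;
  }

  // Give one slot back to the pool.
  void release() {
    assert(in_use_ > 0);
    --in_use_;
  }

  std::size_t capacity() const { return capacity_; }

private:
  std::size_t capacity_;
  std::size_t in_use_;
};
```

With `SlotPool one(1)`, the 257th call to `try_acquire()` fails until some request completes and calls `release()`; with `SlotPool two(2)` the limit moves to 512.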
-
- 11 Dec, 2020 1 commit
-
-
Marko Mäkelä authored
We observed a race condition that involved two threads executing fil_flush_file_spaces() and one thread executing fil_delete_tablespace(). After one of the fil_flush_file_spaces() threads observed that space.needs_flush_not_stopping() is set and was releasing the fil_system.mutex, the other fil_flush_file_spaces() thread would complete the execution of fil_space_t::flush_low() on the same tablespace. Then, fil_delete_tablespace() would destroy the object, because the value of fil_space_t::n_pending did not prevent that. Finally, the first fil_flush_file_spaces() thread would resume execution and invoke fil_space_t::flush_low() on the freed object.

This race condition was introduced in commit 118e258a of MDEV-23855.

fil_space_t::flush(): Add a template parameter that indicates whether the caller is holding a reference to prevent the tablespace from being freed.

buf_dblwr_t::flush_buffered_writes_completed(), row_quiesce_table_start(): Acquire a reference for the duration of the fil_space_t::flush_low() operation. It should be impossible for the object to be freed in these code paths, but we want to satisfy the debug assertions.

fil_space_t::flush_low(): Do not increment or decrement the reference count, but instead assert that the caller is holding a reference.

fil_space_extend_must_retry(), fil_flush_file_spaces(): Acquire a reference before releasing fil_system.mutex. This is what will fix the race condition.
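The pattern behind this fix (take a reference while still holding the mutex, and let the low-level flush only assert that a reference is held) can be sketched as follows. This is a simplified model, not the InnoDB implementation: `Space`, `system_mutex`, `flush_space()` and `can_delete()` are hypothetical stand-ins for fil_space_t, fil_system.mutex and the real callers.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Toy model, not the InnoDB implementation; all names are hypothetical.
struct Space {
  std::atomic<int> n_pending{0};       // references held by operations
  std::atomic<bool> needs_flush{true};
  std::atomic<int> flushes{0};

  // Like the revised flush_low(): does not change the reference count,
  // only asserts that the caller is holding a reference.
  void flush_low() {
    assert(n_pending > 0);
    needs_flush = false;
    ++flushes;
  }
};

std::mutex system_mutex;

// Acquire the reference before releasing the mutex, so that a
// concurrent delete cannot free the object in between.
void flush_space(Space &space) {
  {
    std::lock_guard<std::mutex> guard(system_mutex);
    if (!space.needs_flush) return;
    ++space.n_pending;
  }
  space.flush_low();  // runs without the mutex, protected by the reference
  std::lock_guard<std::mutex> guard(system_mutex);
  --space.n_pending;
}

// In this model, a delete may proceed only when no reference is held.
bool can_delete(const Space &space) {
  std::lock_guard<std::mutex> guard(system_mutex);
  return space.n_pending == 0;
}
```

In the buggy ordering, the check of the flag and the later flush were separated by a window in which nothing pinned the object; here the increment of `n_pending` closes that window because it happens under the same mutex hold as the check.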
-
- 09 Dec, 2020 2 commits
-
-
Marko Mäkelä authored
Since commit ea21d630 we conditionally define a variable that only plays a role on systems that support hole-punching (explicit creation of sparse files). However, that broke debug builds on such systems. It turns out that the debug_dbug label "ignore_punch_hole" is not at all used in MariaDB server. It would be covered by the MySQL 5.7 test innodb.table_compress. (Note: MariaDB 10.1 implemented page_compressed tables before something comparable appeared in MySQL 5.7.)
-
Marko Mäkelä authored
The flushing of the InnoDB temporary tablespace is unnecessarily tied to the write-ahead redo logging and redo log checkpoints, which must be tied to the page writes of persistent tablespaces. Let us simply omit any pages of temporary tables from buf_pool.flush_list. In this way, log checkpoints will never incur any 'collateral damage' of writing out unmodified changes for temporary tables. After this change, pages of the temporary tablespace can only be written out by buf_flush_lists(n_pages,0) as part of LRU eviction. Hopefully, most of the time, that code will never be executed, and instead, the temporary pages will be evicted by buf_release_freed_page() without ever being written back to the temporary tablespace file. This should improve the efficiency of the checkpoint flushing and the buf_flush_page_cleaner thread. Reviewed by: Vladislav Vaintroub
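The idea of keeping temporary pages out of the checkpoint-driven list can be sketched with a toy buffer pool. This is an illustration under stated assumptions, not the actual buf_pool code: `Page`, `Pool` and `mark_dirty()` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <list>

// Toy model; Page, Pool and mark_dirty() are hypothetical names.
struct Page {
  std::uint32_t id;
  bool temporary;  // belongs to the temporary tablespace
  bool dirty;
};

struct Pool {
  // Dirty persistent pages that a log checkpoint must write out.
  std::list<Page*> flush_list;

  void mark_dirty(Page &page) {
    if (page.dirty) return;
    page.dirty = true;
    // Pages of temporary tables are omitted from flush_list, so
    // checkpoint flushing never has to consider them.
    if (!page.temporary) flush_list.push_back(&page);
  }
};
```

In this model a dirty temporary page is still marked dirty, but only an eviction path (not shown) would ever need to write it.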
-
- 08 Dec, 2020 3 commits
-
-
Marko Mäkelä authored
-
Marko Mäkelä authored
MDEV-24278 improved the page cleaner so that it will no longer wake up once per second on an idle server. However, with innodb_adaptive_flushing (the default) the function page_cleaner_flush_pages_recommendation() could initially return 0 even if there is work to do. af_get_pct_for_dirty(): Remove. Based on a comment here, it appears that an initial intention of innodb_max_dirty_pages_pct_lwm=0.0 (the default value) was to disable something. That ceased to hold in MDEV-23855: the value is a pure threshold; the page cleaner will not perform any work unless the threshold is exceeded. page_cleaner_flush_pages_recommendation(): Add the parameter dirty_blocks to ensure that buf_pool.flush_list will eventually be emptied.
-
Sergei Petrunia authored
..causes error on slave. Cause: if the master doesn't have the frm file for the table, the DROP TABLE code will call ha_delete_table_force() to drop the table in all available storage engines. The issue was that this code path didn't check the HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE flag of the storage engine, and so did not add "... IF EXISTS" to the statement that is written to the binary log. This can cause an error on the slave when it tries to drop a table that is already gone.
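The effect of the missing flag check can be sketched as follows. This is an illustration only: the flag value and `drop_statement_for_binlog()` are hypothetical, and the real server builds the binary log statement differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical flag value and helper, for illustration only; the real
// flag is HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE in the server source.
constexpr unsigned TABLE_MAY_NOT_EXIST_ON_SLAVE = 1U << 0;

// Build the statement text that would be written to the binary log.
std::string drop_statement_for_binlog(const std::string &table,
                                      unsigned engine_flags) {
  std::string stmt = "DROP TABLE ";
  // If the engine says the table may be missing on the slave, make
  // the replicated statement tolerant of that.
  if (engine_flags & TABLE_MAY_NOT_EXIST_ON_SLAVE)
    stmt += "IF EXISTS ";
  return stmt + table;
}
```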
-
- 07 Dec, 2020 1 commit
-
-
Vladislav Vaintroub authored
-
- 04 Dec, 2020 2 commits
-
-
Marko Mäkelä authored
The counters in srv_stats use std::atomic and multiple cache lines per counter. This is overkill in a case where a critical section already exists in the code. A regular variable will work just fine, with much smaller memory bus impact.
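A minimal sketch of the idea, assuming a counter that is only ever touched inside a critical section that the code needs anyway. `Stats`, `do_work()` and `read()` are hypothetical names, not the srv_stats interface.

```cpp
#include <cassert>
#include <mutex>

// Toy model; the names are hypothetical.
struct Stats {
  std::mutex mutex;         // critical section that already exists
  unsigned long n_ops = 0;  // plain variable, protected by mutex

  void do_work() {
    std::lock_guard<std::mutex> guard(mutex);
    // ... the work that already required the mutex ...
    ++n_ops;  // no atomic read-modify-write needed here
  }

  unsigned long read() {
    std::lock_guard<std::mutex> guard(mutex);
    return n_ops;
  }
};
```

Because every access happens while the mutex is held, the mutex already provides the ordering that std::atomic would otherwise have to supply.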
-
Marko Mäkelä authored
This hang was caused by MDEV-23855, and we failed to fix it in MDEV-24109 (commit 4cbfdeca).

When buf_flush_ahead() is invoked shortly before server shutdown, the non-default setting innodb_flush_sync=OFF is in effect, and the buffer pool contains dirty pages of temporary tables, the page cleaner thread may remain in an infinite loop without completing its work, thus causing the shutdown to hang.

buf_flush_page_cleaner(): If the buffer pool contains no unmodified persistent pages, ensure that buf_flush_sync_lsn=0 will be assigned, so that shutdown will proceed.

The test case is not deterministic. On my system, it reproduced the hang with 95% probability when running multiple instances of the test in parallel, and 4% when running single-threaded.

Thanks to Eugene Kosov for debugging and testing this.
-
- 03 Dec, 2020 2 commits
-
-
Monty authored
This was noticed when running "mtr --valgrind main.precedence".

The problem was that Item_func_like::escape could be left uninitialized when used with views combined with UNIONs, as in:

    create or replace view v1 as select
      2 LIKE 1 ESCAPE 3 IN (SELECT 0 UNION SELECT 1),
      2 LIKE 1 ESCAPE (3 IN (SELECT 0 UNION SELECT 1)),
      (2 LIKE 1 ESCAPE 3) IN (SELECT 0 UNION SELECT 1);

In fix_escape_item(), the above query causes escape_item->const_during_execution() to be true and escape_item->const_item() to be false, in which case 'escape' is never calculated.

The fix is to move the main logic of fix_escape_item() out to a separate function and call that function once in Item.

Other things:
- Reorganized fields in Item_func_like class to make it more compact
-
Marko Mäkelä authored
The clang++ -stdlib=libc++ header file <fstream> depends on <filesystem> that defines a member function path::root_name(), which conflicts with the rather unused #define root_name() that had been introduced in commit 7c58e97b. Because an instrumented -stdlib=libc++ (rather than the default -stdlib=libstdc++) is easier to build for a working -fsanitize=memory (cmake -DWITH_MSAN=ON), let us remove the conflicting #define for now.
-
- 02 Dec, 2020 5 commits
-
-
Marko Mäkelä authored
Sorry, only tested commit 4174fc1a on clang. Other compilers do not define __has_feature().
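A common way to write such a check portably is to supply a fallback definition before using the macro. The sketch below assumes the check concerns MemorySanitizer, which the message does not state; `with_msan` is a hypothetical name.

```cpp
#include <cassert>

// Compilers that lack __has_feature() get a fallback that reports
// every feature as absent.
#ifndef __has_feature
# define __has_feature(x) 0
#endif

// Assumption for illustration: the feature being tested is
// memory_sanitizer (a clang feature name).
#if __has_feature(memory_sanitizer)
constexpr bool with_msan = true;
#else
constexpr bool with_msan = false;
#endif
```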
-
Marko Mäkelä authored
For some reason, commit 5bb5d4ad made clang++-11 unhappy about a constexpr declaration.
-
Marko Mäkelä authored
For some reason, the test was never adjusted for commit e6a50e41.
-
Marko Mäkelä authored
-
Marko Mäkelä authored
The Galera tests were massively failing with debug assertions.
-
- 01 Dec, 2020 7 commits
-
-
Marko Mäkelä authored
-
Vlad Lesin authored
Post-push Windows compilation errors fix.
-
Monty authored
Change thd->mdl_context.release_transactional_locks() to thd->mdl_release_transactional_locks()
-
Marko Mäkelä authored
row_undo_ins_parse_undo_rec(): Do not try to read non-existing virtual column information for the metadata record.
-
Marko Mäkelä authored
-
Marko Mäkelä authored
The replacement is buf_pool.contains_zip().
-
Vlad Lesin authored
The new option --log-innodb-page-corruption is introduced. When this option is set, the backup is not interrupted if a corrupted InnoDB page is detected. Instead, it logs all corrupted pages that were found in the innodb_corrupted_pages file in the backup directory and finishes with an error.

For incremental backup, corrupted pages are also copied to the .delta file, because we can't do an LSN check for such pages during backup. innodb_corrupted_pages will also be created in the incremental backup directory.

During --prepare, the list of corrupted pages is read from the file just after the redo log is applied, and each page from the list is checked for whether it is allocated in its tablespace. If it is not allocated, then it is zeroed out, flushed to the tablespace and removed from the list. If all pages are removed from the list, then --prepare finishes successfully and the innodb_corrupted_pages file is removed from the backup directory. Otherwise, --prepare finishes with an error message, and innodb_corrupted_pages contains the list of the pages that were detected as corrupted during backup and are allocated in their tablespaces, which means that the backup directory contains corrupted InnoDB pages and the backup cannot be considered consistent.

For incremental --prepare, corrupted pages from .delta files are applied to the base backup, innodb_corrupted_pages is read from both the base and the incremental directories, and the same action is performed on the corrupted pages list as for full --prepare. The innodb_corrupted_pages file is modified or removed only in the base directory.

If DDL happens during backup, it is also processed at the end of the backup to have correct tablespace names in innodb_corrupted_pages.
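The --prepare step described above can be sketched as a filter over the corrupted pages list. This is a simplified model, not the mariabackup code: `PageId`, `process_corrupted_pages()` and the two callbacks are hypothetical, standing in for the real allocation check and the zero-and-flush operation.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical page identifier: tablespace id and page number.
struct PageId {
  std::uint32_t space;
  std::uint32_t page_no;
};

// Processes the list in place. Unallocated pages are zeroed out and
// dropped from the list; allocated pages stay on it. Returns true when
// the list ends up empty, which is the case where --prepare succeeds.
bool process_corrupted_pages(
    std::vector<PageId> &pages,
    const std::function<bool(const PageId&)> &is_allocated,
    const std::function<void(const PageId&)> &zero_out_and_flush) {
  std::vector<PageId> remaining;
  for (const PageId &page : pages) {
    if (is_allocated(page))
      remaining.push_back(page);  // real corruption: keep reporting it
    else
      zero_out_and_flush(page);   // unused page: safe to reset
  }
  pages.swap(remaining);
  return pages.empty();
}
```

A `false` return corresponds to the case where innodb_corrupted_pages still lists allocated pages and the backup cannot be considered consistent.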
-
- 30 Nov, 2020 11 commits
-
-
Monty authored
The reason for the failure is that thd->mdl_context.release_transactional_locks() was called after commit & rollback even in cases where the current transaction is still active.

For 10.2, 10.3 and 10.4 the fix is simple:
- Replace all calls to thd->mdl_context.release_transactional_locks() with thd->release_transactional_locks(). The thd function will only call the mdl_context function if there are no active transactional locks.

In 10.6 we will implement a better fix, in which the return value of some trans_xxx() functions is changed to indicate whether the call closed the transaction. This will avoid the need for the indirect call.

Other things:
- trans_xa_commit() and trans_xa_rollback() will automatically call release_transactional_locks() if the transaction is closed.
- We can't do that for the other functions, as the callers of many of these do additional work (like close_thread_tables) before calling release_transactional_locks().
- Added missing abort_result_set() and missing DBUG_RETURN in select_create::send_eof()
- Fixed wrong indentation in injector::transaction::commit()
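The indirection described for 10.2 to 10.4 can be sketched with a toy wrapper. This is not the server code: `MdlContext`, `Thd` and the `transaction_active` flag are hypothetical, and the flag is an assumption about how "still active" is decided.

```cpp
#include <cassert>

// Toy stand-in for the metadata lock context.
struct MdlContext {
  int released = 0;  // counts how often locks were actually released
  void release_transactional_locks() { ++released; }
};

// Toy stand-in for THD; transaction_active is an assumed condition.
struct Thd {
  MdlContext mdl_context;
  bool transaction_active = false;

  // Callers use this wrapper instead of calling mdl_context directly,
  // so that locks are kept while the transaction is still active.
  void release_transactional_locks() {
    if (!transaction_active)
      mdl_context.release_transactional_locks();
  }
};
```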
-
Monty authored
-
Monty authored
The real fix for MDEV-15532 will be pushed into 10.2 and 10.6. This is an additional fix for 10.4. In 10.4, trans_xa_detach() was introduced. However, THD::cleanup() assumes that after trans_xa_detach() is done, there are no registered transactions anymore. In the 10.2 patch there will be an assert to ensure this, which will cause 10.4 to fail. The fix used is to reset the transaction flags in trans_xa_detach().
-
Monty authored
-
Vladislav Vaintroub authored
- Better explain the intention of the my_getevents syscall and why we are using it (to be able to interrupt the io_getevents syscall via io_destroy()). - Fix the comment for MAX_EVENTS in getevent_thread_routine. MAX_EVENTS is a more or less arbitrary constant, chosen such that the events array is big enough to get multiple simultaneous io completions, but small enough that it does not blow the thread's stack.
-
Vladislav Vaintroub authored
If the maintenance timer does not do much for a prolonged time, it will wake up less frequently, once every 4 seconds instead of once every 0.4 seconds. It will wake up more often if thread creation is throttled, to avoid stalls.
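A sketch of such a period selection, under stated assumptions: the two periods (0.4 and 4 seconds) come from the message, while the idle threshold, the function name and the choice to use the short period when throttled are hypothetical.

```cpp
#include <cassert>
#include <chrono>

using std::chrono::milliseconds;

// Periods taken from the commit message.
constexpr milliseconds ACTIVE_PERIOD{400};
constexpr milliseconds IDLE_PERIOD{4000};

// Hypothetical threshold: how many consecutive idle ticks count as
// "not doing much for a prolonged time".
constexpr int IDLE_TICKS_BEFORE_SLOWDOWN = 10;

// Choose the next wake-up period for the maintenance timer.
milliseconds next_timer_period(int idle_ticks, bool creation_throttled) {
  // Assumption: while thread creation is throttled, keep the short
  // period so that stalls are noticed quickly.
  if (creation_throttled) return ACTIVE_PERIOD;
  return idle_ticks >= IDLE_TICKS_BEFORE_SLOWDOWN ? IDLE_PERIOD
                                                  : ACTIVE_PERIOD;
}
```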
-
Monty authored
-
Sergei Petrunia authored
-
Marko Mäkelä authored
For some reason, InnoDB debug tests on Windows fail due to rw_lock_t if the function call overhead for some os_thread_ code is removed. This change worked fine on Windows in combination with MDEV-24142.
-
Varun Gupta authored
MDEV-21265: IN predicate conversion to IN subquery should be allowed for a broader set of datatype comparison

Allow the materialization strategy when the collations on the inner and outer sides of an IN subquery are the same and the character set of the inner side is a proper subset of the character set on the outer side. This allows conversion from utf8mb3 to utf8mb4, as the former is a subset of the latter. This is only allowed when an IN predicate is converted to an IN subquery.

Backported part of the patch (d6a00d9b) of MDEV-17905.
-
Marko Mäkelä authored
os_thread_pf(): Remove. os_thread_eq(), os_thread_yield(), os_thread_get_curr_id(): Define as macros. ut_print_timestamp(), ut_sprintf_timestamp(): Simplify.
-
- 27 Nov, 2020 1 commit
-
-
Igor Babaev authored
When executing set operations in a pipeline using only one temporary table, additional scans of intermediate results may be needed. The scans are performed using the rnd_next() handler function, which might leave the record buffers used for the temporary table in a state that is not suitable for subsequent writes into the table. For example, this happens for the Aria engine when the last call of rnd_next() encounters only deleted records. Thus a cleanup of record buffers is needed after each such scan of the temporary table. Approved by Oleksandr Byelkin <sanja@mariadb.com>
-
- 26 Nov, 2020 4 commits
-
-
Monty authored
-
Monty authored
-
Monty authored
This change is needed in 10.5 to avoid extra malloc calls in val_str(). In 10.6 it's not needed anymore but the extra +1 byte doesn't harm that much.
-
Monty authored
- Fold long comment rows and updated comments - Moved one private function in class Item_func_rand among other private functions
-