- 01 Aug, 2023 3 commits
-
-
Marko Mäkelä authored
buf_page_t::write_complete(), buf_page_write_complete(), IORequest::write_complete(): Add a parameter for passing an error code. If an error occurred, we will release the io-fix, buffer-fix and page latch but not reset the oldest_modification field. The block would remain in buf_pool.LRU and possibly buf_pool.flush_list, to be written again later by buf_flush_page_cleaner(). If all page writes start consistently failing, all write threads should eventually hang in log_free_check() because the log checkpoint cannot be advanced to make room in the circular write-ahead log ib_logfile0.

IORequest::read_complete(): Add a parameter for passing an error code. If a read operation fails, we report the error and discard the page, just like we would do if the page checksum was not validated or the page could not be decrypted. This only affects asynchronous reads, due to linear or random read-ahead or crash recovery. When buf_page_get_low() invokes buf_read_page(), that will be a synchronous read, not involving this code.

This was tested by randomly injecting errors in write_io_callback() and read_io_callback(), like this:

    if (!ut_rnd_interval(100)) cb->m_err= 42;
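
A minimal sketch of the retry-on-failure idea described above, using invented names (Page, complete_write) rather than the actual InnoDB functions: on error the io-fix and latch are released, but the page stays dirty so the page cleaner can retry the write later.

    #include <cstdint>

    struct Page {
      uint64_t oldest_modification = 0;  // nonzero: page is dirty (in the flush list)
      bool io_fixed = false;
      bool latched = false;
    };

    // Called when an asynchronous write finishes; err == 0 means success.
    void complete_write(Page &page, int err) {
      page.io_fixed = false;   // always release the io-fix ...
      page.latched = false;    // ... and the page latch
      if (err == 0)
        page.oldest_modification = 0;  // clean: may leave the flush list
      // On error, oldest_modification stays set, so the page remains in the
      // flush list and the page cleaner will attempt the write again.
    }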
-
Marko Mäkelä authored
buf_LRU_free_page(): The block may also be in the IBUF_EXIST state when executing the test innodb.innodb_bulk_create_index_debug.
-
Marko Mäkelä authored
buf_LRU_free_page(): When we are discarding the uncompressed copy of a ROW_FORMAT=COMPRESSED page, buf_page_t::can_relocate() must have ensured that the block descriptor state is one of FREED, UNFIXED, REINIT. Do not overwrite the state with UNFIXED. We do not want to write back pages that were actually freed, and we want to avoid doublewrite for pages that were (re)initialized by log records written since the latest checkpoint. Last but not least, we do not want crashes like those that commit dc1bd180 (MDEV-31386) was supposed to fix. The test innodb_zip.wl5522_zip should typically cover all 3 states. This bug is a regression due to commit aaef2e1d (MDEV-27058).
-
- 28 Jul, 2023 1 commit
-
-
Marko Mäkelä authored
log_sort_flush_list(): Wait for any pending page writes to cease before sorting the buf_pool.flush_list. Starting with commit 22b62eda (MDEV-25113), it is possible that some buf_page_t::oldest_modification_ that we will be comparing in std::sort() will be updated from some value >2 to 1 while we are holding buf_pool.flush_list_mutex. To catch this type of trouble better in the future, we will clean garbage (pages that have been written out) from buf_pool.flush_list while constructing the array for sorting, and check with debug assertions that all blocks that we are copying from the array to the list will be dirty (requiring a writeback) while we sort and copy the array back to buf_pool.flush_list.

This failure was observed by chance exactly once when running the test innodb.recovery_memory. It was never reproduced in the same form afterwards. Unrelated to this change, the test does occasionally reproduce a failure to start up InnoDB due to a corrupted page being read in recovery. The ticket MDEV-31791 was filed for that.

Tested by: Matthias Leich
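
A rough sketch of the clean-while-building-the-array approach, assuming a simplified Page type and treating oldest_modification <= 1 as "already written out"; this is not the actual log_sort_flush_list() code.

    #include <algorithm>
    #include <cassert>
    #include <list>
    #include <vector>

    struct Page { uint64_t oldest_modification; };

    void sort_flush_list(std::list<Page*> &flush_list) {
      std::vector<Page*> dirty;
      dirty.reserve(flush_list.size());
      for (Page *p : flush_list)
        if (p->oldest_modification > 1)   // drop garbage: pages already written out
          dirty.push_back(p);
      std::sort(dirty.begin(), dirty.end(),
                [](const Page *a, const Page *b) {
                  return a->oldest_modification > b->oldest_modification;
                });
      flush_list.clear();
      for (Page *p : dirty) {
        assert(p->oldest_modification > 1);  // must still require a writeback
        flush_list.push_back(p);
      }
    }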
-
- 25 Jul, 2023 1 commit
-
-
Marko Mäkelä authored
The main problem is that ever since commit aaef2e1d removed the function buf_wait_for_read(), it is not safe to invoke buf_page_get_low() with RW_NO_LATCH, that is, only buffer-fixing the page. If a page read (or decryption or decompression) is in progress, there would be a race condition when executing consistency checks, and a page would wrongly be flagged as corrupted. Furthermore, if the page is actually corrupted and the initial access to it was with RW_NO_LATCH (only buffer-fixing), the page read handler would likely end up in an infinite loop in buf_pool_t::corrupted_evict(). It is not safe to invoke mtr_t::upgrade_buffer_fix() on a block on which a page latch was not initially acquired in buf_page_get_low().

btr_block_reget(): Remove the constant parameter rw_latch=RW_X_LATCH.

btr_block_get(): Assert that RW_NO_LATCH is not being used, and change the parameter type of rw_latch.

btr_pcur_move_to_next_page(), innobase_table_is_empty(): Adjust for the parameter type change of btr_block_get().

btr_root_block_get(): If mode==RW_NO_LATCH, do not check the integrity of the page, because it is not safe to do so.

btr_page_alloc_low(), btr_page_free(): If the root page latch is not previously held by the mini-transaction, invoke btr_root_block_get() again with the proper latching mode.

btr_latch_prev(): Helper function to safely acquire a latch on a preceding sibling page while holding a latch on a B-tree page. To avoid deadlocks, we must not wait for the latch while holding a latch on the current page, because another thread may be waiting for our page latch when moving to the next page from our preceding sibling page. If s_lock_try() or x_lock_try() on the preceding page fails, we must release the current page latch, and wait for the latch on the preceding page as well as the current page, in that order. Page splits or merges will be prevented by the parent page latch that we are holding.

btr_cur_t::search_leaf(): Make use of btr_latch_prev().

btr_cur_t::open_leaf(): Make use of btr_latch_prev(). Do not invoke mtr_t::upgrade_buffer_fix() (when latch_mode == BTR_MODIFY_TREE), because we will already have acquired all page latches upfront.

btr_cur_t::pessimistic_search_leaf(): Do acquire an exclusive index latch before accessing the page. Make use of btr_latch_prev().
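
A minimal sketch of the btr_latch_prev() latching order described above, with std::shared_mutex standing in for InnoDB page latches; latch_prev() is an invented name for illustration only.

    #include <shared_mutex>

    // Precondition: the caller holds a shared latch on "cur".
    // Returns true if "prev" was latched without releasing "cur";
    // false if "cur" had to be released and both latches re-acquired.
    bool latch_prev(std::shared_mutex &prev, std::shared_mutex &cur) {
      if (prev.try_lock_shared())
        return true;             // fast path: no waiting while holding cur
      cur.unlock_shared();       // never wait while holding cur, to avoid deadlock
      prev.lock_shared();        // wait for the preceding page first ...
      cur.lock_shared();         // ... then re-latch the current page
      return false;              // caller must revalidate: cur was briefly released
    }

Page splits or merges during the brief release are prevented by the parent page latch that the caller keeps holding, as the commit message notes.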
-
- 24 Jul, 2023 1 commit
-
-
Marko Mäkelä authored
row_ins_sec_index_entry_low(): Correct a condition that was inadvertently inverted in commit 89ec4b53 (MDEV-29603). We are not supposed to buffer INSERT operations into unique indexes, because duplicate key values would not be checked for. It is only allowed when using unique_checks=0, and in that case the user is supposed to guarantee that there are no duplicates.
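
The allowed combination can be sketched as a single condition (hypothetical helper name, not the server code): buffering an INSERT into a secondary index is permitted either when the index is not unique, or when unique_checks=0 and the user guarantees uniqueness.

    // Invented helper for illustration of the corrected logic.
    bool may_buffer_insert(bool index_is_unique, bool unique_checks_disabled) {
      return !index_is_unique || unique_checks_disabled;
    }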
-
- 21 Jul, 2023 1 commit
-
-
Sergei Petrunia authored
ANALYZE FORMAT=JSON output now includes table.r_engine_stats, which has the engine statistics. Only non-zero members are printed.

Internally: the EXPLAIN data structures Explain_table_access and Explain_update now have a handler* handler_for_stats pointer. It is used to read statistics from handler_for_stats->handler_stats.

The following applies only to 10.9+; the backport doesn't use it: Explain data structures exist after the tables are closed. We avoid walking invalid pointers using this:
- The SQL layer calls Explain_query::notify_tables_are_closed() before closing tables.
- After that call, printing of JSON output is disabled. Non-JSON output can be printed, but we don't access handler_for_stats when doing that.
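
A small sketch of the "invalidate before close" idea, with invented names (ExplainNode, HandlerStats) rather than the real Explain_* classes.

    struct HandlerStats { unsigned long long pages_read = 0; };

    struct ExplainNode {
      const HandlerStats *handler_for_stats = nullptr;  // borrowed from the handler
      bool tables_closed = false;

      void notify_tables_are_closed() { tables_closed = true; }

      void print_json_stats() const {
        if (tables_closed || !handler_for_stats)
          return;                 // JSON output disabled once tables are closed
        // ... print handler_for_stats->pages_read etc. ...
      }
    };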
-
- 13 Jul, 2023 2 commits
-
-
Alexey Botchkov authored
'defaults' made into 'constexpr' variables.
-
Monty authored
This issue could only be noticed in very extraordinary circumstances when trying to rename a temporary table that is not in the file system. The issue was found when I temporarily disabled check_if_frm_exists() while searching for another bug.
-
- 12 Jul, 2023 1 commit
-
-
Sergei Petrunia authored
Item_func_tochar::check_arguments() didn't check if its arguments each had one column. Failing to make this check and proceeding would eventually cause either an assertion failure or the execution would reach "MY_ASSERT_UNREACHABLE();", which would produce a crash with a misleading stack trace.

* Fixed Item_func_tochar::check_arguments() to do the required check.
* Also fixed MY_ASSERT_UNREACHABLE() to terminate the program. Just "executing" __builtin_unreachable() used to cause "undefined results", which in my experience was a crash with a corrupted stack trace.
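
A sketch of the difference, assuming GCC/Clang where __builtin_unreachable() is available; this is not the exact MariaDB macro definition.

    #include <cstdio>
    #include <cstdlib>

    // Old style: only tells the compiler the point is unreachable.
    // Actually reaching it is undefined behaviour (typically a crash with
    // a corrupted stack trace).
    #define OLD_ASSERT_UNREACHABLE()  __builtin_unreachable()

    // Fixed style: terminate deterministically with a usable stack trace.
    #define NEW_ASSERT_UNREACHABLE()                                   \
      do {                                                             \
        fprintf(stderr, "reached unreachable code at %s:%d\n",         \
                __FILE__, __LINE__);                                   \
        abort();                                                       \
      } while (0)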
-
- 10 Jul, 2023 1 commit
-
-
Vlad Lesin authored
Before MDEV-24671, the wait time was derived from my_interval_timer() / 1000 (nanoseconds converted to microseconds, and not microseconds to milliseconds like I must have assumed). The lock_sys.wait_time and lock_sys.wait_time_max are already in milliseconds; we should not divide them by 1000.

In MDEV-24738 the millisecond counts lock_sys.wait_time and lock_sys.wait_time_max were changed to a 32-bit type. That would overflow in 49.7 days. Keep using a 64-bit type for those millisecond counters.

Reviewed by: Marko Mäkelä
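
For reference, the overflow arithmetic behind the 49.7-day figure, written as a compile-time check (lock_wait_time_ms is an invented name, not the actual lock_sys member):

    // 2^32 ms = 4,294,967,296 ms ≈ 4,294,967 s ≈ 1,193 h ≈ 49.7 days
    #include <cstdint>
    static_assert(UINT32_MAX / (1000ULL * 60 * 60 * 24) == 49,
                  "a 32-bit millisecond counter wraps during its 50th day");
    // Hence the counter stays 64-bit:
    uint64_t lock_wait_time_ms = 0;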
-
- 07 Jul, 2023 2 commits
-
-
Monty authored
This is because some query plans change because of views.
-
Monty authored
The new statistics are enabled by adding the "engine", "innodb" or "full" option to --log-slow-verbosity.

Example output:

    # Pages_accessed: 184  Pages_read: 95  Pages_updated: 0  Old_rows_read: 1
    # Pages_read_time: 17.0204  Engine_time: 248.1297

Pages_read_time is the time spent doing physical reads inside a storage engine. (Writes cannot be tracked as these are usually done in the background.) Engine_time is the time spent inside the storage engine for the full duration of the read/write/update calls. It uses the same code as 'analyze statement' for calculating the time spent.

The engine statistics are collected through a generic interface that should be easy for any engine to use. It can also easily be extended to provide even more statistics. Currently only InnoDB has counters for the Pages_% and Undo_% status; Engine_time works for all engines.

Implementation details:

The class ha_handler_stats holds all engine stats. This class is included in the handler and THD classes. While a query is running, all statistics are updated in the handler. In close_thread_tables() the statistics are added to the THD.

handler::handler_stats is a pointer to where statistics should be collected. This is set to point to handler::active_handler_stats if stats are requested. If not, it is set to 0. handler_stats also has an element, 'active', that is 1 if stats are requested. This is to allow engines to avoid doing any 'if's while updating the statistics.

Cloned or partition tables have the pointer set to the base table if stats are requested.

There is a small performance impact when using --log-slow-verbosity=engine:
- All engine calls in 'select' will be timed.
- IO calls for InnoDB reads will be timed.
- Incrementing of counters is done on local variables, and accesses are inline, so these should have very little impact.
- Statistics have to be reset for each statement for the THD and each used handler. This is only 40 bytes, which should be negligible.
- For partition tables we have to loop over all partitions to update the handler_status as part of table_init(). This can be optimized in the future to only do it when log-slow-verbosity changes. For this to work we have to update handler_status for all opened partitions and also for all partitions opened in the future.

Other things:
- Added options 'engine' and 'full' to log-slow-verbosity.
- Some of the new files in the test suite come from Percona Server, which has similar status information.
- buf_page_optimistic_get(): Do not increment any counter, since we are only validating a pointer, not performing any buf_pool.page_hash lookup.
- Added a THD argument to save_explain_data_intern().
- Switched arguments for save_explain_.*_data() to always have THD first (generates better code as other functions also have THD first).
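
A condensed sketch of the pointer-based collection pattern described under "Implementation details", using simplified stand-in types rather than the real handler and ha_handler_stats classes. When statistics are not requested the pointer is null, so the hot path pays only a single test.

    #include <cstdint>

    struct HandlerStatsSketch {
      bool active = false;              // 1 if stats are requested
      uint64_t pages_accessed = 0;
      uint64_t pages_read = 0;
      uint64_t engine_time_ns = 0;
    };

    struct HandlerSketch {
      HandlerStatsSketch owned_stats;        // plays the role of active_handler_stats
      HandlerStatsSketch *stats = nullptr;   // plays the role of handler::handler_stats

      void enable_stats(bool on) {
        owned_stats = HandlerStatsSketch{};  // reset per statement (small, cheap)
        owned_stats.active = on;
        stats = on ? &owned_stats : nullptr;
      }

      void on_page_access() {
        if (stats)                           // cheap test on the hot path
          ++stats->pages_accessed;
      }
    };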
-
- 05 Jul, 2023 3 commits
-
-
Marko Mäkelä authored
-
Marko Mäkelä authored
fil_node_open_file_low(): Always acquire an advisory lock on the system tablespace. Originally, we already did this in SysTablespace::open_file(), but SysTablespace::open_or_create() would release those locks when it is closing the file handles. This is a 10.5+ specific follow up to commit 0ee1082b (MDEV-28495). Thanks to Daniel Black for verifying this bug.
-
Marko Mäkelä authored
When InnoDB attempts to buffer a change operation of a secondary index leaf page (to insert, delete-mark or remove a record) and the change buffer is too large, InnoDB used to trigger a change buffer merge that could affect any tables. This could lead to huge variance in system throughput and potentially unpredictable crashes, in case the change buffer was corrupted and a crash occurred while attempting to merge changes to a table that is not being accessed by the current SQL statement.

ibuf_insert_low(): Simply return DB_STRONG_FAIL when the maximum size of the change buffer is exceeded.

ibuf_contract_after_insert(): Remove.

ibuf_get_merge_page_nos_func(): Remove a constant parameter. The function ibuf_contract() will be our only caller, during shutdown with innodb_fast_shutdown=0.
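
The behavioural change can be sketched roughly as follows (invented names; the real ibuf_insert_low() is far more involved): instead of triggering a synchronous change buffer merge that may touch unrelated tables, the buffering attempt reports failure and the caller falls back to modifying the secondary index page directly.

    enum status { OK, STRONG_FAIL };

    status buffer_change(unsigned long long ibuf_size,
                         unsigned long long ibuf_max_size) {
      if (ibuf_size >= ibuf_max_size)
        return STRONG_FAIL;   // before the fix: contract the change buffer here
      // ... buffer the insert/delete-mark/delete operation ...
      return OK;
    }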
-
- 04 Jul, 2023 10 commits
-
-
Sergei Golubchik authored
Debug-only issue. The test was doing

    set debug_sync='now SIGNAL go3';
    ...
    set debug_sync='reset';

which translated into

    add "go3" to the hash of active signals
    pthread_broadcast to wake up waiting threads
    ...
    clear the hash of active signals

As a result, a waiting thread was awoken, but the hash was emptied before the thread checked if its signal was in the hash. So the thread didn't find its signal and went back to sleep.

Let's wait until the awoken thread has completely finished disconnecting and was added to the thread cache.
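
A self-contained sketch of the race, with std::condition_variable and a std::set standing in for the debug_sync signal hash: if the signal set is cleared between the broadcast and the waiter re-evaluating its predicate, the waiter misses its signal and sleeps forever.

    #include <condition_variable>
    #include <mutex>
    #include <set>
    #include <string>

    std::mutex m;
    std::condition_variable cv;
    std::set<std::string> active_signals;

    void signal_then_reset(const std::string &name) {
      { std::lock_guard<std::mutex> g(m); active_signals.insert(name); }
      cv.notify_all();                 // wake up waiting threads
      { std::lock_guard<std::mutex> g(m); active_signals.clear(); }  // too early!
    }

    void wait_for(const std::string &name) {
      std::unique_lock<std::mutex> g(m);
      // If the clear() above runs before this predicate is re-evaluated,
      // the signal is gone and the thread goes back to sleep.
      cv.wait(g, [&] { return active_signals.count(name) != 0; });
    }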
-
Sergei Golubchik authored
The debug_sync refactoring introduced a statically instantiated object debug_sync_global of the structure st_debug_sync_globals. st_debug_sync_globals includes Hash_set<>, which allocates memory in the constructor. sf_malloc() calls _my_thread_var()->dbug_id, which is pthread_getspecific(THR_KEY_mysys), and THR_KEY_mysys is 0 before pthread_key_create(). pthread_getspecific(0) returns a valid pointer, not EINVAL, and safemalloc dereferences it.

Let's statically initialize THR_KEY_mysys to -1; this makes pthread_getspecific(THR_KEY_mysys) fail before pthread_key_create() is called.

Follow-up for 8885225d.
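
A sketch of the idea using plain POSIX threads (not the mysys source); the cast of -1 mirrors the commit's intent of making the key invalid until pthread_key_create() has actually run, so an early lookup cannot alias whatever key 0 belongs to.

    #include <pthread.h>

    // before: pthread_key_t THR_KEY_mysys;              // static zero-init: key 0
    static pthread_key_t THR_KEY_mysys = (pthread_key_t) -1;  // invalid until created

    void *early_lookup() {
      // Called before the key is created: with the -1 value this lookup fails
      // instead of returning another subsystem's thread-specific data.
      return pthread_getspecific(THR_KEY_mysys);
    }

    void init_keys() {
      pthread_key_create(&THR_KEY_mysys, nullptr);
    }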
-
Sergei Golubchik authored
-
Sergei Golubchik authored
Now the main class is Type_handler_fbt. It contains everything and generates objects of all other classes as needed.
-
Sergei Golubchik authored
It was calling Fbt::to_string(char*, size_t), which didn't exist.
-
Sergei Golubchik authored
-
Sergei Golubchik authored
It's a hard-coded geometry-specific method; move it into a geometry-specific function. It will go away in the future.
-
Sergei Golubchik authored
-
Sergei Golubchik authored
Add the test case.
-
Marko Mäkelä authored
dict_stats_persistent_storage_check(): Do not output errmsg if opt_bootstrap holds, because the message buffer would likely be uninitialized.
-
- 03 Jul, 2023 3 commits
-
-
Daniel Black authored
file /usr/lib64/libmariadb.so.3 from install of MariaDB-shared-10.11.3-1.fc38.x86_64 conflicts with file from package mariadb-connector-c-3.3.5-1.fc38.x86_64
-
Marko Mäkelä authored
ha_innobase::delete_table(): Also on DROP SEQUENCE, do try to drop any persistent statistics. They should really not be created for SEQUENCE objects (which internally are 1-row no-rollback tables), but that is how it has always happened to work.
-
Marko Mäkelä authored
i_s_innodb_buffer_page_get_info(): Correct a condition. After crash recovery, there may be some buffer pool pages in FREED state, containing garbage (invalid data page contents). Let us ignore such pages in the INFORMATION_SCHEMA output.

The test innodb.innodb_defragment_fill_factor will be removed, because the queries that it is invoking on information_schema.innodb_buffer_page would start to fail. The defragmentation feature was removed in commit 7ca89af6 in MariaDB Server 11.1.

Tested by: Matthias Leich
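
The corrected filtering can be sketched as follows (hypothetical enum and helper, not the actual I_S code): pages in the FREED state contain garbage after crash recovery and are simply skipped when filling the INFORMATION_SCHEMA.INNODB_BUFFER_PAGE row.

    enum class PageState { NOT_USED, FREED, UNFIXED, FIXED };

    bool report_in_buffer_page_view(PageState state) {
      return state != PageState::NOT_USED && state != PageState::FREED;
    }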
-
- 30 Jun, 2023 2 commits
-
-
Marko Mäkelä authored
btr_search_hash_table_validate(), btr_search_validate(): Add the parameter THD for checking if the statement has been killed. Any non-QUICK CHECK TABLE will validate the entire adaptive hash index for all InnoDB tables, which may be extremely slow when running multiple concurrent CHECK TABLE.
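
A sketch of the cooperative interruption pattern, with an invented Thd stand-in instead of the real THD and kill-check API: the long-running validation periodically checks whether the statement has been killed and bails out early.

    struct Thd { bool killed = false; };

    bool validate_hash_index(const Thd &thd, unsigned n_cells) {
      for (unsigned i = 0; i < n_cells; i++) {
        if ((i & 1023) == 0 && thd.killed)
          return false;               // abort the CHECK TABLE promptly
        // ... validate hash cell i ...
      }
      return true;
    }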
-
Oleg Smirnov authored
During the upgrade procedure on Windows, mysqld.exe is started with the named pipe connection protocol. mysqladmin.exe then pings the server to check if it is up and running. The command line looks like:

    mysqladmin.exe --protocol=pipe --socket=mysql_upgrade_service_xxx ping

But the "socket" parameter resets the "protocol", which was previously initialized with the "pipe" value, setting it to "socket". As a result, the connection cannot be established and the upgrade procedure fails.

On Windows, "socket" is used to pass the name of the pipe, so resetting the protocol is not valid in this case. This commit fixes the resetting of the "protocol" parameter by the "socket" parameter in the case when the protocol has been previously initialized to the "pipe" value.
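
A sketch of the fix's logic with hypothetical option-handling code (not the mysqladmin source): --socket still records the pipe name, but it no longer downgrades an explicitly requested named-pipe protocol.

    #include <string>

    enum class Protocol { DEFAULT, TCP, SOCKET, PIPE };

    void apply_socket_option(Protocol &protocol, std::string &socket_name,
                             const std::string &value) {
      socket_name = value;
      if (protocol != Protocol::PIPE)   // previously: unconditional reset
        protocol = Protocol::SOCKET;
    }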
-
- 29 Jun, 2023 2 commits
-
-
Alexander Barkov authored
When CURSOR parameters get parsed, their sp_assignment_lex instances (one instance per parameter) get collected into a List<sp_assignment_lex>. These instances get linked to sphead only at the end of the list. If a syntax error happened in the middle of the parameter list, these instances were not deleted, which caused memory leaks.

Fix: use a Bison %destructor to free rules of the <sp_assignment_lex_list> type (on syntax errors).

After the fix, the sp_assignment_lex instances from CURSOR parameters are deleted as follows:
- If the CURSOR statement was fully parsed, these instances get properly linked to sp_head structures, so they are deleted during ~sp_head (this did not change).
- If the CURSOR statement failed on a syntax error, they are deleted by Bison's %destructor (this is being added in the current patch).
-
Alexander Barkov authored
The parser works as follows: the rule expr_lex returns a pointer to a newly created sp_expr_lex instance which is not linked to any MariaDB structures yet; it is pointed to only from a Bison stack variable. The sp_expr_lex instance gets linked to other structures (such as sp_instr_jump_if_not) later, after scanning some following grammar.

Problem before the fix: if a parse error happened immediately after expr_lex (before it got linked), the created sp_expr_lex value got lost, causing a memory leak.

Fix:
- Use Bison's "destructor" directive to free the results of expr_lex on parse/oom errors.
- Move the call to LEX::cleanup_lex_after_parse_error() from MYSQL_YYABORT and yyerror inside parse_sql(). This is needed because Bison calls destructors after yyerror(), while it's important to delete the sp_expr_lex instance before LEX::cleanup_lex_after_parse_error(). The latter frees the memory root containing the sp_expr_lex instance. After this change the code blocks are executed in the following order:
  - yyerror() -- now only raises the error to DA (no cleanup done any more)
  - %destructor { delete $$; } <expr_lex> -- destructs the sp_expr_lex instance
  - LEX::cleanup_lex_after_parse_error() -- frees the memory root containing the sp_expr_lex instance
- Remove the "delete sublex" related code from restore_lex():
  - restore_lex() is called in most cases on success, when delete is not needed.
  - There is one place where restore_lex() is called on error: in sp_create_assignment_instr(). But in this case LEX::sp_lex_in_use is true anyway. The patch adds a new DBUG_ASSERT(lex->sp_lex_in_use) to guard this.
-
- 28 Jun, 2023 4 commits
-
-
Marko Mäkelä authored
-
Vlad Lesin authored
The fix replaces waiting for the whole purge to finish with waiting only for the purging of delete-marked records to finish.

Reviewed by: Marko Mäkelä
-
Vlad Lesin authored
The issue is caused by the MDEV-30400 fix. There are two cursors in btr_estimate_n_rows_in_range() - p1 and p2 - but both share the same mtr. Each cursor contains an mtr savepoint for the previously fetched block, to release it when the current block is fetched.

Before MDEV-30400, the block was released with mtr_t::release_block_at_savepoint(); it just unfixed a block and released its page latch. In MDEV-30400 it was replaced with mtr_t::rollback_to_savepoint(), which does the same as the former mtr_t::release_block_at_savepoint(ulint begin, ulint end) but also erases the corresponding slots from the mtr memo, which invalidates any stored mtr memo savepoints greater than or equal to "begin".

The idea of the fix is to get rid of savepoints altogether in btr_estimate_n_rows_in_range() and btr_estimate_n_rows_in_range_on_level(). As mtr_t::rollback_to_savepoint() erases elements from mtr_t::m_memo, we know which element of mtr_t::m_memo can be deleted in each particular case, so there is no need to store savepoints.

See also the following slides for details: https://docs.google.com/presentation/d/1RFYBo7EUhM22ab3GOYctv3j_3yC0vHtBY9auObZec8U

Reviewed by: Marko Mäkelä
-
Yuchen Pei authored
The server needs to have a unique name
-
- 27 Jun, 2023 2 commits
-
-
Marko Mäkelä authored
-
Marko Mäkelä authored
recv_sys_t::parse(): For undo tablespace truncation mini-transactions, remember the start_lsn instead of the end LSN. This is what we expect after commit 461402a5 (MDEV-30479).
-
- 26 Jun, 2023 1 commit
-
-
Marko Mäkelä authored
-