Commit de4030e4 authored by Marko Mäkelä

MDEV-30400 Assertion height == btr_page_get_level(...) on INSERT

This also fixes part of MDEV-29835 Partial server freeze
which is caused by violations of the latching order that was
defined in https://dev.mysql.com/worklog/task/?id=6326
(WL#6326: InnoDB: fix index->lock contention). Unless the
current thread is holding an exclusive dict_index_t::lock,
it must acquire page latches in a strict parent-to-child,
left-to-right order. Not all cases of MDEV-29835 are fixed yet.
Failure to follow the correct latching order will cause deadlocks
of threads due to lock order inversion.
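
The ordering rule above can be modelled with a small checker. This is only a sketch: PagePos, may_latch_after() and follows_latching_order() are hypothetical names that do not exist in InnoDB, and the model compares each latch request only with the one immediately before it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a page position in a B-tree; not an InnoDB type.
struct PagePos {
  unsigned level;         // 0 = leaf level, larger = closer to the root
  std::uint32_t ordinal;  // left-to-right position within the level
};

// Returns true if latching `next` right after `prev` respects the
// parent-to-child, left-to-right order.
inline bool may_latch_after(const PagePos &prev, const PagePos &next) {
  if (next.level < prev.level) return true;   // moving down towards the leaf
  if (next.level > prev.level) return false;  // moving back up is forbidden
  return next.ordinal > prev.ordinal;         // same level: strictly rightward
}

// Checks a whole acquisition sequence. Per the rule above, a thread that
// holds the exclusive index lock is exempt from the ordering.
inline bool follows_latching_order(const std::vector<PagePos> &seq,
                                   bool holds_exclusive_index_lock) {
  if (holds_exclusive_index_lock) return true;
  for (std::size_t i = 1; i < seq.size(); i++)
    if (!may_latch_after(seq[i - 1], seq[i])) return false;
  return true;
}
```

In this model, latching a left sibling after its right sibling is rejected unless the caller passes holds_exclusive_index_lock=true.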

As part of these changes, the BTR_MODIFY_TREE mode is modified
so that an Update latch (U a.k.a. SX) will be acquired on the
root page, and eXclusive latches (X) will be acquired on all pages
leading to the leaf page, as well as any left and right siblings
of the pages along the path. The DEBUG_SYNC test innodb.innodb_wl6326
will be removed, because at the time the DEBUG_SYNC point is hit,
the thread is actually holding several page latches that will be
blocking a concurrent SELECT statement.
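
As an illustration of the BTR_MODIFY_TREE latching described above, the following sketch computes which latch would be requested on which page along a root-to-leaf path. All names (PathNode, LatchRequest, plan_modify_tree(), kNoPage) are hypothetical and only model the description; kNoPage plays the role that FIL_NULL plays in the real code. The sketch assumes that the root always gets the U latch and ignores any later upgrade to X.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Latch { U, X };  // U is also known as SX

// Hypothetical "no such page" marker for this sketch.
const std::uint32_t kNoPage = 0xFFFFFFFF;

// Hypothetical description of one page on the root-to-leaf path.
struct PathNode {
  std::uint32_t page_no;
  std::uint32_t left;   // left sibling page number, or kNoPage
  std::uint32_t right;  // right sibling page number, or kNoPage
};

struct LatchRequest {
  std::uint32_t page_no;
  Latch mode;
  bool operator==(const LatchRequest &o) const {
    return page_no == o.page_no && mode == o.mode;
  }
};

// path[0] is the root and path.back() is the leaf.
inline std::vector<LatchRequest>
plan_modify_tree(const std::vector<PathNode> &path) {
  std::vector<LatchRequest> plan;
  for (std::size_t i = 0; i < path.size(); i++) {
    const PathNode &n = path[i];
    if (i == 0) {
      // The root page has no siblings; it gets the U latch.
      plan.push_back({n.page_no, Latch::U});
      continue;
    }
    // Left-to-right within a level: left sibling, the page, right sibling.
    if (n.left != kNoPage) plan.push_back({n.left, Latch::X});
    plan.push_back({n.page_no, Latch::X});
    if (n.right != kNoPage) plan.push_back({n.right, Latch::X});
  }
  return plan;
}
```

The resulting order is parent-to-child and, within each level, left-to-right, which matches the latching order stated earlier.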

We also remove double bookkeeping that was caused by excessive
information hiding in mtr_t::m_memo. We simply let mtr_t::m_memo
store information of latched pages, and ensure that
mtr_memo_slot_t::object is never a null pointer.
The tree_blocks[] and tree_savepoints[] were redundant.

buf_page_get_low(): If innodb_change_buffering_debug=1, to avoid
a hang, do not try to evict blocks if we are holding a latch on
a modified page. The test innodb.innodb-change-buffer-recovery
will be removed, because change buffering may no longer be forced
by debug injection when the change buffer comprises multiple pages.
Remove a debug assertion that could fail when
innodb_change_buffering_debug=1 fails to evict a page.
For other cases, the assertion is redundant, because we already
checked that right after the got_block: label. The test
innodb.innodb-change-buffering-recovery will be removed, because
due to this change, we will be unable to evict the desired page.

mtr_t::lock_register(): Register a change of a page latch
on an unmodified buffer-fixed block.

mtr_t::x_latch_at_savepoint(), mtr_t::sx_latch_at_savepoint():
Replaced by the use of mtr_t::upgrade_buffer_fix(), which now
also handles RW_S_LATCH.

mtr_t::set_modified(): For temporary tables, invoke
buf_page_t::set_modified() here and not in mtr_t::commit().
We will never set the MTR_MEMO_MODIFY flag on other than
persistent data pages, nor set mtr_t::m_modifications when
temporary data pages are modified.

mtr_t::commit(): Only invoke the buf_flush_note_modification() loop
if persistent data pages were modified.

mtr_t::get_already_latched(): Look up a latched page in mtr_t::m_memo.
This avoids many redundant entries in mtr_t::m_memo, as well as
redundant calls to buf_page_get_gen() for blocks that had already
been looked up in a mini-transaction.
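
The idea of looking up an already latched page can be sketched as follows. MiniMemo, MemoSlot, Block and get_page() are hypothetical stand-ins, not the real mtr_t or buf_block_t; the fetch callback stands in for a buffer pool lookup such as buf_page_get_gen().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the latch types recorded in the memo.
enum class MemoType { BufFix, S, SX, X };

struct Block { std::uint32_t page_no; };

struct MemoSlot {
  Block *object;  // never a null pointer in this model
  MemoType type;
};

class MiniMemo {
  std::vector<MemoSlot> slots_;
public:
  void push(Block *b, MemoType t) {
    assert(b != nullptr);
    slots_.push_back({b, t});
  }
  std::size_t size() const { return slots_.size(); }

  // Return the block if the page is already registered with the given
  // latch type, or nullptr if it is not.
  Block *get_already_latched(std::uint32_t page_no, MemoType type) const {
    for (auto it = slots_.rbegin(); it != slots_.rend(); ++it)
      if (it->object->page_no == page_no && it->type == type)
        return it->object;
    return nullptr;
  }
};

// Caller pattern: consult the memo first, and only fall back to the
// (expensive) fetch when the page has not been latched yet.
template <typename Fetch>
Block *get_page(MiniMemo &memo, std::uint32_t page_no, MemoType type,
                Fetch fetch) {
  if (Block *b = memo.get_already_latched(page_no, type)) return b;
  Block *b = fetch(page_no);
  if (b) memo.push(b, type);
  return b;
}
```

A repeated request for the same page then neither adds a memo entry nor triggers another fetch.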

btr_get_latched_root(): Return a pointer to an already latched root page.
This replaces btr_root_block_get() in cases where the mini-transaction
has already latched the root page.

btr_page_get_parent(): Fetch a parent page that was already latched
in BTR_MODIFY_TREE, by invoking mtr_t::get_already_latched().
If needed, upgrade the root page U latch to X.
This avoids bloating mtr_t::m_memo as well as performing redundant
buf_pool.page_hash lookups. For non-QUICK CHECK TABLE as well as for
B-tree defragmentation, we will invoke btr_cur_search_to_nth_level().

btr_cur_search_to_nth_level(): This will only be used for non-leaf
(level>0) B-tree searches that were formerly named BTR_CONT_SEARCH_TREE
or BTR_CONT_MODIFY_TREE. In MDEV-29835, this function could be
removed altogether, or retained for the case of
CHECK TABLE without QUICK.

btr_cur_t::left_block: Remove. btr_pcur_move_backward_from_page()
can retrieve the left sibling from the end of mtr_t::m_memo.
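
A simplified model of finding the left sibling at the end of the memo, loosely following the btr_pcur_move_backward_from_page() change in this commit. Page, kNoPage and left_sibling_from_memo() are hypothetical; the memo is reduced to a vector of latched pages in acquisition order.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

const std::uint32_t kNoPage = 0xFFFFFFFF;  // hypothetical "no page" marker

// Hypothetical page with sibling links.
struct Page {
  std::uint32_t page_no;
  std::uint32_t prev;  // left sibling page number, or kNoPage
  std::uint32_t next;  // right sibling page number, or kNoPage
};

// memo: latched pages in acquisition order; cur: the cursor's page, which
// must have a left sibling. Returns the left sibling of cur.
inline const Page *left_sibling_from_memo(const std::vector<const Page *> &memo,
                                          const Page &cur) {
  assert(cur.prev != kNoPage);
  assert(memo.size() >= 2);
  const Page *candidate = memo[memo.size() - 1];
  if (candidate->next != cur.page_no) {
    // The last entry is not the left sibling (for example, it is the
    // current page itself), so the left sibling is the entry before it.
    candidate = memo[memo.size() - 2];
    assert(candidate->next == cur.page_no);
  }
  return candidate;
}
```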

btr_cur_t::open_leaf(): Some clean-up.

btr_cur_t::search_leaf(): Replaces btr_cur_search_to_nth_level()
for searches to level=0 (the leaf level). We will never release
parent page latches before acquiring leaf page latches. If we need to
temporarily release the level=1 page latch in the BTR_SEARCH_PREV or
BTR_MODIFY_PREV latch_mode, we will reposition the cursor on the
child node pointer so that we will land on the correct leaf page.
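
One way to picture "never release the parent before latching the child" is the following event-log sketch of a coupled descent. It is an assumption-laden model, not the search_leaf() implementation: Event, descend_coupled() and coupling_respected() are hypothetical, and real code may keep parent latches longer than shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One latch event in the hypothetical log.
struct Event {
  bool acquire;  // true = latch acquired, false = latch released
  std::uint32_t page_no;
};

// Descend along `path` (root first, leaf last). Each child is latched
// before the latch on its parent is released; the leaf stays latched.
inline std::vector<Event>
descend_coupled(const std::vector<std::uint32_t> &path) {
  std::vector<Event> log;
  for (std::size_t i = 0; i < path.size(); i++) {
    log.push_back({true, path[i]});                  // latch the next page
    if (i > 0) log.push_back({false, path[i - 1]});  // then release its parent
  }
  return log;
}

// Verify that no parent was released before its child was latched.
inline bool coupling_respected(const std::vector<Event> &log,
                               const std::vector<std::uint32_t> &path) {
  auto index_of = [&log](bool acquire, std::uint32_t page_no) -> std::size_t {
    for (std::size_t i = 0; i < log.size(); i++)
      if (log[i].acquire == acquire && log[i].page_no == page_no) return i;
    return log.size();  // not found
  };
  for (std::size_t i = 1; i < path.size(); i++) {
    const std::size_t child_latched = index_of(true, path[i]);
    const std::size_t parent_released = index_of(false, path[i - 1]);
    if (child_latched == log.size() || child_latched > parent_released)
      return false;
  }
  return true;
}
```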

btr_cur_t::pessimistic_search_leaf(): Implement new BTR_MODIFY_TREE
latching logic in the case that page splits or merges will be needed.
The parent pages (and their siblings) should already be latched on
the first dive to the leaf and be present in mtr_t::m_memo; there
should be no need for BTR_CONT_MODIFY_TREE. This pre-latching almost
suffices; it must be revised in MDEV-29835 and work-arounds removed
for cases where mtr_t::get_already_latched() fails to find a block.

rtr_search_to_nth_level(): A SPATIAL INDEX version of
btr_cur_search_to_nth_level() that can search to any level
(including the leaf level).

rtr_search_leaf(), rtr_insert_leaf(): Wrappers for
rtr_search_to_nth_level().

rtr_search(): Replaces rtr_pcur_open().

rtr_latch_leaves(): Replaces btr_cur_latch_leaves(). Note that unlike
in the B-tree code, there is no error handling in case the sibling
pages are corrupted.

rtr_cur_restore_position(): Remove an unused constant parameter.

btr_pcur_open_on_user_rec(): Remove the constant parameter
mode=PAGE_CUR_GE.

row_ins_clust_index_entry_low(): Use a new
mode=BTR_MODIFY_ROOT_AND_LEAF to gain access to the root page
when mode!=BTR_MODIFY_TREE, to write the PAGE_ROOT_AUTO_INC.

BTR_SEARCH_TREE, BTR_CONT_SEARCH_TREE: Remove.

BTR_CONT_MODIFY_TREE: Note that this is only used by
rtr_search_to_nth_level().

btr_pcur_optimistic_latch_leaves(): Replaces
btr_cur_optimistic_latch_leaves().

ibuf_delete_rec(): Acquire exclusive ibuf.index->lock in order
to avoid a deadlock with ibuf_insert_low(BTR_MODIFY_PREV).

btr_blob_log_check_t(): Acquire a U latch on the root page,
so that btr_page_alloc() in btr_store_big_rec_extern_fields()
will avoid a deadlock.

btr_store_big_rec_extern_fields(): Assert that the root page latch
is being held.

Tested by: Matthias Leich
Reviewed by: Vladislav Lesin
parent 39f46745
#
# Bug#69122 - INNODB DOESN'T REDO-LOG INSERT BUFFER MERGE
# OPERATION IF IT IS DONE IN-PLACE
#
call mtr.add_suppression("InnoDB: innodb_read_only prevents crash recovery");
call mtr.add_suppression("Plugin initialization aborted at srv0start\\.cc");
call mtr.add_suppression("Plugin 'InnoDB'");
FLUSH TABLES;
CREATE TABLE t1(
a INT AUTO_INCREMENT PRIMARY KEY,
b CHAR(1),
c INT,
INDEX(b))
ENGINE=InnoDB STATS_PERSISTENT=0;
SET GLOBAL innodb_change_buffering_debug = 1;
SET GLOBAL innodb_change_buffering = all;
INSERT INTO t1 SELECT 0,'x',1 FROM seq_1_to_8192;
BEGIN;
SELECT b FROM t1 LIMIT 3;
b
x
x
x
connect con1,localhost,root,,;
BEGIN;
DELETE FROM t1 WHERE a=1;
INSERT INTO t1 VALUES(1,'X',1);
SET DEBUG_DBUG='+d,crash_after_log_ibuf_upd_inplace';
SELECT b FROM t1 LIMIT 3;
ERROR HY000: Lost connection to server during query
disconnect con1;
connection default;
FOUND 1 /Wrote log record for ibuf update in place operation/ in mysqld.1.err
# restart: --innodb-read-only
CHECK TABLE t1;
Table Op Msg_type Msg_text
test.t1 check Error Unknown storage engine 'InnoDB'
test.t1 check error Corrupt
FOUND 1 /innodb_read_only prevents crash recovery/ in mysqld.1.err
# restart: --innodb-force-recovery=5
SELECT * FROM t1 LIMIT 1;
a b c
1 X 1
SHOW ENGINE INNODB STATUS;
Type Name Status
InnoDB insert 0, delete mark 0
SET GLOBAL innodb_fast_shutdown=0;
# restart
CHECK TABLE t1;
Table Op Msg_type Msg_text
test.t1 check status OK
SHOW ENGINE INNODB STATUS;
Type Name Status
InnoDB
DROP TABLE t1;
This diff is collapsed.
--echo #
--echo # Bug#69122 - INNODB DOESN'T REDO-LOG INSERT BUFFER MERGE
--echo # OPERATION IF IT IS DONE IN-PLACE
--echo #
--source include/have_innodb.inc
# innodb_change_buffering_debug option is debug only
--source include/have_debug.inc
# Embedded server does not support crashing
--source include/not_embedded.inc
# DBUG_SUICIDE() hangs under valgrind
--source include/not_valgrind.inc
# This test is slow on buildbot.
--source include/big_test.inc
--source include/have_sequence.inc
call mtr.add_suppression("InnoDB: innodb_read_only prevents crash recovery");
call mtr.add_suppression("Plugin initialization aborted at srv0start\\.cc");
call mtr.add_suppression("Plugin 'InnoDB'");
FLUSH TABLES;
CREATE TABLE t1(
a INT AUTO_INCREMENT PRIMARY KEY,
b CHAR(1),
c INT,
INDEX(b))
ENGINE=InnoDB STATS_PERSISTENT=0;
--let $_server_id= `SELECT @@server_id`
--let $_expect_file_name= $MYSQLTEST_VARDIR/tmp/mysqld.$_server_id.expect
# The flag innodb_change_buffering_debug is only available in debug builds.
# It instructs InnoDB to try to evict pages from the buffer pool when
# change buffering is possible, so that the change buffer will be used
# whenever possible.
SET GLOBAL innodb_change_buffering_debug = 1;
SET GLOBAL innodb_change_buffering = all;
let SEARCH_FILE = $MYSQLTEST_VARDIR/log/mysqld.1.err;
# Create enough rows for the table, so that the change buffer will be
# used for modifying the secondary index page. There must be multiple
# index pages, because changes to the root page are never buffered.
INSERT INTO t1 SELECT 0,'x',1 FROM seq_1_to_8192;
BEGIN;
SELECT b FROM t1 LIMIT 3;
connect (con1,localhost,root,,);
BEGIN;
DELETE FROM t1 WHERE a=1;
# This should be buffered, if innodb_change_buffering_debug = 1 is in effect.
INSERT INTO t1 VALUES(1,'X',1);
SET DEBUG_DBUG='+d,crash_after_log_ibuf_upd_inplace';
--exec echo "wait" > $_expect_file_name
--error 2013
# This should force a change buffer merge
SELECT b FROM t1 LIMIT 3;
disconnect con1;
connection default;
let SEARCH_PATTERN=Wrote log record for ibuf update in place operation;
--source include/search_pattern_in_file.inc
--let $restart_parameters= --innodb-read-only
--source include/start_mysqld.inc
CHECK TABLE t1;
--source include/shutdown_mysqld.inc
let SEARCH_PATTERN=innodb_read_only prevents crash recovery;
--source include/search_pattern_in_file.inc
--let $restart_parameters= --innodb-force-recovery=5
--source include/start_mysqld.inc
SELECT * FROM t1 LIMIT 1;
replace_regex /.*operations:.* (insert.*), delete \d.*discarded .*/\1/;
SHOW ENGINE INNODB STATUS;
# Slow shutdown will not merge the changes due to innodb_force_recovery=5.
SET GLOBAL innodb_fast_shutdown=0;
--let $restart_parameters=
--source include/restart_mysqld.inc
CHECK TABLE t1;
replace_regex /.*operations:.* insert [1-9][0-9]*, delete mark [1-9][0-9]*, delete \d.*discarded .*//;
SHOW ENGINE INNODB STATUS;
DROP TABLE t1;
This diff is collapsed.
@@ -61,3 +61,15 @@ select count(*) from t1 where MBRWithin(t1.c2, @g1);
 count(*)
 57344
 drop table t1;
+#
+# MDEV-30400 Assertion height == btr_page_get_level ... on INSERT
+#
+CREATE TABLE t1 (c POINT NOT NULL,SPATIAL (c)) ENGINE=InnoDB;
+SET @save_limit=@@GLOBAL.innodb_limit_optimistic_insert_debug;
+SET GLOBAL innodb_limit_optimistic_insert_debug=2;
+BEGIN;
+INSERT INTO t1 SELECT POINTFROMTEXT ('POINT(0 0)') FROM seq_1_to_6;
+ROLLBACK;
+SET GLOBAL innodb_limit_optimistic_insert_debug=@save_limit;
+DROP TABLE t1;
+# End of 10.6 tests
@@ -73,3 +73,18 @@ select count(*) from t1 where MBRWithin(t1.c2, @g1);
 # Clean up.
 drop table t1;
+--echo #
+--echo # MDEV-30400 Assertion height == btr_page_get_level ... on INSERT
+--echo #
+CREATE TABLE t1 (c POINT NOT NULL,SPATIAL (c)) ENGINE=InnoDB;
+SET @save_limit=@@GLOBAL.innodb_limit_optimistic_insert_debug;
+SET GLOBAL innodb_limit_optimistic_insert_debug=2;
+BEGIN;
+INSERT INTO t1 SELECT POINTFROMTEXT ('POINT(0 0)') FROM seq_1_to_6;
+ROLLBACK;
+SET GLOBAL innodb_limit_optimistic_insert_debug=@save_limit;
+DROP TABLE t1;
+--echo # End of 10.6 tests
This diff is collapsed.
This diff is collapsed.
 /*****************************************************************************
 Copyright (C) 2012, 2014 Facebook, Inc. All Rights Reserved.
-Copyright (C) 2014, 2022, MariaDB Corporation.
+Copyright (C) 2014, 2023, MariaDB Corporation.
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -280,6 +280,70 @@ btr_defragment_calc_n_recs_for_size(
 return n_recs;
 }
+MY_ATTRIBUTE((nonnull(2,3,4), warn_unused_result))
+/************************************************************//**
+Returns the upper level node pointer to a page. It is assumed that mtr holds
+an sx-latch on the tree.
+@return rec_get_offsets() of the node pointer record */
+static
+rec_offs*
+btr_page_search_father_node_ptr(
+rec_offs* offsets,/*!< in: work area for the return value */
+mem_heap_t* heap, /*!< in: memory heap to use */
+btr_cur_t* cursor, /*!< in: cursor pointing to user record,
+out: cursor on node pointer record,
+its page x-latched */
+mtr_t* mtr) /*!< in: mtr */
+{
+const uint32_t page_no = btr_cur_get_block(cursor)->page.id().page_no();
+dict_index_t* index = btr_cur_get_index(cursor);
+ut_ad(!index->is_spatial());
+ut_ad(mtr->memo_contains_flagged(&index->lock, MTR_MEMO_X_LOCK
+| MTR_MEMO_SX_LOCK));
+ut_ad(dict_index_get_page(index) != page_no);
+const auto level = btr_page_get_level(btr_cur_get_page(cursor));
+const rec_t* user_rec = btr_cur_get_rec(cursor);
+ut_a(page_rec_is_user_rec(user_rec));
+if (btr_cur_search_to_nth_level(level + 1,
+dict_index_build_node_ptr(index,
+user_rec, 0,
+heap, level),
+RW_X_LATCH,
+cursor, mtr) != DB_SUCCESS) {
+return nullptr;
+}
+const rec_t* node_ptr = btr_cur_get_rec(cursor);
+ut_ad(!btr_cur_get_block(cursor)->page.lock.not_recursive()
+|| mtr->memo_contains(index->lock, MTR_MEMO_X_LOCK));
+offsets = rec_get_offsets(node_ptr, index, offsets, 0,
+ULINT_UNDEFINED, &heap);
+if (btr_node_ptr_get_child_page_no(node_ptr, offsets) != page_no) {
+offsets = nullptr;
+}
+return(offsets);
+}
+static bool btr_page_search_father(mtr_t *mtr, btr_cur_t *cursor)
+{
+rec_t *rec=
+page_rec_get_next(page_get_infimum_rec(cursor->block()->page.frame));
+if (UNIV_UNLIKELY(!rec))
+return false;
+cursor->page_cur.rec= rec;
+mem_heap_t *heap= mem_heap_create(100);
+const bool got= btr_page_search_father_node_ptr(nullptr, heap, cursor, mtr);
+mem_heap_free(heap);
+return got;
+}
 /*************************************************************//**
 Merge as many records from the from_block to the to_block. Delete
 the from_block if all records are successfully merged to to_block.
@@ -408,7 +472,7 @@ btr_defragment_merge_pages(
 parent.page_cur.index = index;
 parent.page_cur.block = from_block;
-if (!btr_page_get_father(mtr, &parent)) {
+if (!btr_page_search_father(mtr, &parent)) {
 to_block = nullptr;
 } else if (n_recs_to_move == n_recs) {
 /* The whole page is merged with the previous page,
@@ -699,10 +763,9 @@ static void btr_defragment_chunk(void*)
 acquire index->lock X-latch. This entitles us to
 acquire page latches in any order for the index. */
 mtr_x_lock_index(index, &mtr);
-/* This will acquire index->lock U latch, which is allowed
-when we are already holding the X-latch. */
 if (buf_block_t *last_block =
-item->pcur->restore_position(BTR_MODIFY_TREE, &mtr)
+item->pcur->restore_position(
+BTR_PURGE_TREE_ALREADY_LATCHED, &mtr)
 == btr_pcur_t::CORRUPTED
 ? nullptr
 : btr_defragment_n_pages(btr_pcur_get_block(item->pcur),
......
 /*****************************************************************************
 Copyright (c) 1996, 2016, Oracle and/or its affiliates. All Rights Reserved.
-Copyright (c) 2016, 2022, MariaDB Corporation.
+Copyright (c) 2016, 2023, MariaDB Corporation.
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -212,24 +212,98 @@ btr_pcur_copy_stored_position(
 pcur_receive->old_n_fields = pcur_donate->old_n_fields;
 }
+/** Optimistically latches the leaf page or pages requested.
+@param[in] block guessed buffer block
+@param[in,out] pcur cursor
+@param[in,out] latch_mode BTR_SEARCH_LEAF, ...
+@param[in,out] mtr mini-transaction
+@return true if success */
+TRANSACTIONAL_TARGET
+static bool btr_pcur_optimistic_latch_leaves(buf_block_t *block,
+btr_pcur_t *pcur,
+btr_latch_mode *latch_mode,
+mtr_t *mtr)
+{
+ut_ad(block->page.buf_fix_count());
+ut_ad(block->page.in_file());
+ut_ad(block->page.frame);
+static_assert(BTR_SEARCH_PREV & BTR_SEARCH_LEAF, "");
+static_assert(BTR_MODIFY_PREV & BTR_MODIFY_LEAF, "");
+static_assert((BTR_SEARCH_PREV ^ BTR_MODIFY_PREV) ==
+(RW_S_LATCH ^ RW_X_LATCH), "");
+const rw_lock_type_t mode=
+rw_lock_type_t(*latch_mode & (RW_X_LATCH | RW_S_LATCH));
+switch (*latch_mode) {
+default:
+ut_ad(*latch_mode == BTR_SEARCH_LEAF || *latch_mode == BTR_MODIFY_LEAF);
+return buf_page_optimistic_get(mode, block, pcur->modify_clock, mtr);
+case BTR_SEARCH_PREV:
+case BTR_MODIFY_PREV:
+page_id_t id{0};
+uint32_t left_page_no;
+ulint zip_size;
+buf_block_t *left_block= nullptr;
+{
+transactional_shared_lock_guard<block_lock> g{block->page.lock};
+if (block->modify_clock != pcur->modify_clock)
+return false;
+id= block->page.id();
+zip_size= block->zip_size();
+left_page_no= btr_page_get_prev(block->page.frame);
+}
+if (left_page_no != FIL_NULL)
+{
+left_block=
+buf_page_get_gen(page_id_t(id.space(), left_page_no), zip_size,
+mode, nullptr, BUF_GET_POSSIBLY_FREED, mtr);
+if (left_block &&
+btr_page_get_next(left_block->page.frame) != id.page_no())
+{
+release_left_block:
+mtr->release_last_page();
+return false;
+}
+}
+if (buf_page_optimistic_get(mode, block, pcur->modify_clock, mtr))
+{
+if (btr_page_get_prev(block->page.frame) == left_page_no)
+{
+/* block was already buffer-fixed while entering the function and
+buf_page_optimistic_get() buffer-fixes it again. */
+ut_ad(2 <= block->page.buf_fix_count());
+*latch_mode= btr_latch_mode(mode);
+return true;
+}
+mtr->release_last_page();
+}
+ut_ad(block->page.buf_fix_count());
+if (left_block)
+goto release_left_block;
+return false;
+}
+}
 /** Structure acts as functor to do the latching of leaf pages.
 It returns true if latching of leaf pages succeeded and false
 otherwise. */
 struct optimistic_latch_leaves
 {
 btr_pcur_t *const cursor;
-btr_latch_mode *latch_mode;
+btr_latch_mode *const latch_mode;
 mtr_t *const mtr;
-optimistic_latch_leaves(btr_pcur_t *cursor, btr_latch_mode *latch_mode,
-mtr_t *mtr)
-: cursor(cursor), latch_mode(latch_mode), mtr(mtr) {}
-bool operator() (buf_block_t *hint) const
+bool operator()(buf_block_t *hint) const
 {
-return hint && btr_cur_optimistic_latch_leaves(
-hint, cursor->modify_clock, latch_mode,
-btr_pcur_get_btr_cur(cursor), mtr);
+return hint &&
+btr_pcur_optimistic_latch_leaves(hint, cursor, latch_mode, mtr);
 }
 };
@@ -246,8 +320,8 @@ record GREATER than the user record which was the predecessor of the
 supremum.
 (4) cursor was positioned before the first or after the last in an
 empty tree: restores to before first or after the last in the tree.
-@param restore_latch_mode BTR_SEARCH_LEAF, ...
-@param mtr mtr
+@param latch_mode BTR_SEARCH_LEAF, ...
+@param mtr mini-transaction
 @return btr_pcur_t::SAME_ALL cursor position on user rec and points on
 the record with the same field values as in the stored record,
 btr_pcur_t::SAME_UNIQ cursor position is on user rec and points on the
@@ -301,10 +375,9 @@ btr_pcur_t::restore_position(btr_latch_mode restore_latch_mode, mtr_t *mtr)
 case BTR_SEARCH_PREV:
 case BTR_MODIFY_PREV:
 /* Try optimistic restoration. */
 if (block_when_stored.run_with_hint(
-optimistic_latch_leaves(this, &restore_latch_mode,
-mtr))) {
+optimistic_latch_leaves{this, &restore_latch_mode,
+mtr})) {
 pos_state = BTR_PCUR_IS_POSITIONED;
 latch_mode = restore_latch_mode;
@@ -465,18 +538,9 @@ btr_pcur_move_to_next_page(
 return DB_CORRUPTION;
 }
-ulint mode = cursor->latch_mode;
-switch (mode) {
-case BTR_SEARCH_TREE:
-mode = BTR_SEARCH_LEAF;
-break;
-case BTR_MODIFY_TREE:
-mode = BTR_MODIFY_LEAF;
-}
 dberr_t err;
 buf_block_t* next_block = btr_block_get(
-*cursor->index(), next_page_no, mode,
+*cursor->index(), next_page_no, cursor->latch_mode & ~12,
 page_is_leaf(page), mtr, &err);
 if (UNIV_UNLIKELY(!next_block)) {
@@ -538,26 +602,42 @@ btr_pcur_move_backward_from_page(
 return true;
 }
-buf_block_t* release_block = nullptr;
-if (!page_has_prev(btr_pcur_get_page(cursor))) {
-} else if (btr_pcur_is_before_first_on_page(cursor)) {
-release_block = btr_pcur_get_block(cursor);
-page_cur_set_after_last(cursor->btr_cur.left_block,
-btr_pcur_get_page_cur(cursor));
-} else {
-/* The repositioned cursor did not end on an infimum
-record on a page. Cursor repositioning acquired a latch
-also on the previous page, but we do not need the latch:
-release it. */
-release_block = cursor->btr_cur.left_block;
+buf_block_t* block = btr_pcur_get_block(cursor);
+if (page_has_prev(block->page.frame)) {
+buf_block_t* left_block
+= mtr->at_savepoint(mtr->get_savepoint() - 1);
+const page_t* const left = left_block->page.frame;
+if (memcmp_aligned<4>(left + FIL_PAGE_NEXT,
+block->page.frame
++ FIL_PAGE_OFFSET, 4)) {
+/* This should be the right sibling page, or
+if there is none, the current block. */
+ut_ad(left_block == block
+|| !memcmp_aligned<4>(left + FIL_PAGE_PREV,
+block->page.frame
++ FIL_PAGE_OFFSET, 4));
+/* The previous one must be the left sibling. */
+left_block
+= mtr->at_savepoint(mtr->get_savepoint() - 2);
+ut_ad(!memcmp_aligned<4>(left_block->page.frame
++ FIL_PAGE_NEXT,
+block->page.frame
++ FIL_PAGE_OFFSET, 4));
+}
+if (btr_pcur_is_before_first_on_page(cursor)) {
+page_cur_set_after_last(left_block,
+&cursor->btr_cur.page_cur);
+/* Release the right sibling. */
+} else {
+/* Release the left sibling. */
+block = left_block;
+}
+mtr->release(*block);
 }
 cursor->latch_mode = latch_mode;
 cursor->old_rec = nullptr;
-if (release_block) {
-mtr->release(*release_block);
-}
 return false;
 }
......
@@ -1055,26 +1055,24 @@ btr_search_guess_on_hash(
 index_id_t index_id;
 ut_ad(mtr->is_active());
+ut_ad(index->is_btree() || index->is_ibuf());
-if (!btr_search_enabled) {
+/* Note that, for efficiency, the struct info may not be protected by
+any latch here! */
+if (latch_mode > BTR_MODIFY_LEAF
+|| !info->last_hash_succ || !info->n_hash_potential
+|| (tuple->info_bits & REC_INFO_MIN_REC_FLAG)) {
 return false;
 }
-ut_ad(!index->is_ibuf());
+ut_ad(index->is_btree());
+ut_ad(!index->table->is_temporary());
 ut_ad(latch_mode == BTR_SEARCH_LEAF || latch_mode == BTR_MODIFY_LEAF);
 compile_time_assert(ulint{BTR_SEARCH_LEAF} == ulint{RW_S_LATCH});
 compile_time_assert(ulint{BTR_MODIFY_LEAF} == ulint{RW_X_LATCH});
-/* Not supported for spatial index */
-ut_ad(!dict_index_is_spatial(index));
-/* Note that, for efficiency, the struct info may not be protected by
-any latch here! */
-if (info->n_hash_potential == 0) {
-return false;
-}
 cursor->n_fields = info->n_fields;
 cursor->n_bytes = info->n_bytes;
......
@@ -2700,6 +2700,18 @@ buf_page_get_low(
 && mode != BUF_GET_IF_IN_POOL_OR_WATCH) {
 } else if (!ibuf_debug || recv_recovery_is_on()) {
 } else if (fil_space_t* space = fil_space_t::get(page_id.space())) {
+for (ulint i = 0; i < mtr->get_savepoint(); i++) {
+if (buf_block_t* b = mtr->block_at_savepoint(i)) {
+if (b->page.oldest_modification() > 2
+&& b->page.lock.have_any()) {
+/* We are holding a dirty page latch
+that would hang buf_flush_sync(). */
+space->release();
+goto re_evict_fail;
+}
+}
+}
 /* Try to evict the block from the buffer pool, to use the
 insert buffer (change buffer) as much as possible. */
@@ -2741,9 +2753,9 @@ buf_page_get_low(
 /* Failed to evict the page; change it directly */
 }
+re_evict_fail:
 #endif /* UNIV_DEBUG || UNIV_IBUF_DEBUG */
-ut_ad(state > buf_page_t::FREED);
 if (UNIV_UNLIKELY(state < buf_page_t::UNFIXED)) {
 goto ignore_block;
 }
@@ -2799,8 +2811,7 @@ buf_page_get_low(
 }
 if (rw_latch == RW_X_LATCH) {
-mtr->memo_push(block, MTR_MEMO_PAGE_X_FIX);
-goto got_latch;
+goto get_latch_valid;
 } else {
 block->page.lock.x_unlock();
 goto get_latch;
@@ -2808,12 +2819,10 @@ buf_page_get_low(
 } else {
 get_latch:
 switch (rw_latch) {
-mtr_memo_type_t fix_type;
 case RW_NO_LATCH:
 mtr->memo_push(block, MTR_MEMO_BUF_FIX);
 return block;
 case RW_S_LATCH:
-fix_type = MTR_MEMO_PAGE_S_FIX;
 block->page.lock.s_lock();
 ut_ad(!block->page.is_read_fixed());
 if (UNIV_UNLIKELY(block->page.id() != page_id)) {
@@ -2822,13 +2831,12 @@ buf_page_get_low(
 goto page_id_mismatch;
 }
 get_latch_valid:
-mtr->memo_push(block, fix_type);
+mtr->memo_push(block, mtr_memo_type_t(rw_latch));
 #ifdef BTR_CUR_HASH_ADAPT
 btr_search_drop_page_hash_index(block, true);
 #endif /* BTR_CUR_HASH_ADAPT */
 break;
 case RW_SX_LATCH:
-fix_type = MTR_MEMO_PAGE_SX_FIX;
 block->page.lock.u_lock();
 ut_ad(!block->page.is_io_fixed());
 if (UNIV_UNLIKELY(block->page.id() != page_id)) {
@@ -2838,7 +2846,6 @@ buf_page_get_low(
 goto get_latch_valid;
 default:
 ut_ad(rw_latch == RW_X_LATCH);
-fix_type = MTR_MEMO_PAGE_X_FIX;
 if (block->page.lock.x_lock_upgraded()) {
 ut_ad(block->page.id() == page_id);
 block->unfix();
@@ -2851,7 +2858,6 @@ buf_page_get_low(
 goto get_latch_valid;
 }
-got_latch:
 ut_ad(page_id_t(page_get_space_id(block->page.frame),
 page_get_page_no(block->page.frame))
 == page_id);
@@ -3040,8 +3046,7 @@ bool buf_page_optimistic_get(ulint rw_latch, buf_block_t *block,
 ut_ad(!block->page.is_read_fixed());
 block->page.set_accessed();
 buf_page_make_young_if_needed(&block->page);
-mtr->memo_push(block, rw_latch == RW_S_LATCH
-? MTR_MEMO_PAGE_S_FIX : MTR_MEMO_PAGE_X_FIX);
+mtr->memo_push(block, mtr_memo_type_t(rw_latch));
 }
 ut_d(if (!(++buf_dbg_counter % 5771)) buf_pool.validate());
......
This diff is collapsed.
@@ -2,7 +2,7 @@
 Copyright (c) 1996, 2016, Oracle and/or its affiliates. All Rights Reserved.
 Copyright (c) 2012, Facebook Inc.
-Copyright (c) 2013, 2022, MariaDB Corporation.
+Copyright (c) 2013, 2023, MariaDB Corporation.
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -4143,8 +4143,7 @@ void dict_set_corrupted(dict_index_t *index, const char *ctx)
 dict_index_copy_types(tuple, sys_index, 2);
 cursor.page_cur.index = sys_index;
-if (btr_cur_search_to_nth_level(0, tuple, PAGE_CUR_LE,
-BTR_MODIFY_LEAF, &cursor, &mtr)
+if (cursor.search_leaf(tuple, PAGE_CUR_LE, BTR_MODIFY_LEAF, &mtr)
 != DB_SUCCESS) {
 goto fail;
 }
@@ -4219,8 +4218,7 @@ dict_index_set_merge_threshold(
 dict_index_copy_types(tuple, sys_index, 2);
 cursor.page_cur.index = sys_index;
-if (btr_cur_search_to_nth_level(0, tuple, PAGE_CUR_GE,
-BTR_MODIFY_LEAF, &cursor, &mtr)
+if (cursor.search_leaf(tuple, PAGE_CUR_GE, BTR_MODIFY_LEAF, &mtr)
 != DB_SUCCESS) {
 goto func_exit;
 }
......
This diff is collapsed.
 /*****************************************************************************
 Copyright (c) 2009, 2019, Oracle and/or its affiliates. All Rights Reserved.
-Copyright (c) 2015, 2022, MariaDB Corporation.
+Copyright (c) 2015, 2023, MariaDB Corporation.
 This program is free software; you can redistribute it and/or modify it under
 the terms of the GNU General Public License as published by the Free Software
@@ -1697,7 +1697,7 @@ static dberr_t page_cur_open_level(page_cur_t *page_cur, ulint level,
 static dberr_t btr_pcur_open_level(btr_pcur_t *pcur, ulint level, mtr_t *mtr,
 dict_index_t *index)
 {
-pcur->latch_mode= BTR_SEARCH_TREE;
+pcur->latch_mode= BTR_SEARCH_LEAF;
 pcur->search_mode= PAGE_CUR_G;
 pcur->pos_state= BTR_PCUR_IS_POSITIONED;
 pcur->btr_cur.page_cur.index= index;
......
@@ -1474,7 +1474,7 @@ inline void mtr_t::log_file_op(mfile_type_t type, ulint space_id,
 ut_ad(strchr(path, '/'));
 ut_ad(!strcmp(&path[strlen(path) - strlen(DOT_IBD)], DOT_IBD));
-flag_modified();
+m_modifications= true;
 if (!is_logged())
 return;
 m_last= nullptr;
......
This diff is collapsed.