Commit 03c9a4ef authored by Haidong Ji's avatar Haidong Ji Committed by Andrew Hutchings

MDEV-29091: Correct event_name in PFS for wait caused by FOR UPDATE

When one session SELECT ... FOR UPDATE and holds the lock, subsequent
sessions that SELECT ... FOR UPDATE will wait to get the lock.
Currently, that event is labeled as `wait/io/table/sql/handler`, which
is incorrect. Instead, it should have been
`wait/lock/table/sql/handler`.

Two factors contribute to this bug:
1. Instrumentation interface and the heavy usage of `TABLE_IO_WAIT` in
   `sql/handler.cc` file. See interface [^1] for better understanding;
2. The balancing act [^2] of doing instrumentation aggregration _AND_
   having good performance. For example, EVENTS_WAITS_SUMMARY... is
   aggregated using EVENTS_WAITS_CURRENT. Aggregration needs to be based
   on the same wait class, and the code was overly aggressive in label a
   LOCK operation as an IO operation in this case.

The proposed fix is pretty simple, but understanding the bug took a
while. Hence the footnotes below.  For future improvement and
refactoring, we may want to consider renaming `TABLE_IO_WAIT` and making
it less coarse and more targeted.

Note that newly added test case, events_waits_current_MDEV-29091,
initially didn't pass Buildbot CI for embedded build tests.  Further
research showed that other impacted tests all included not_embedded.inc.
This oversight was fixed later.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license.  I am contributing on behalf of my employer Amazon Web
Services, Inc.

[^1]: To understand `performance_schema` instrumentation interface, I
found this URL is the most helpful:
https://dev.mysql.com/doc/dev/mysql-server/latest/PAGE_PFS_PSI.html
[^2]: The best place to understand instrumentation projection,
composition, and aggregration is through the source file. Although I
prefer reading Doxygen produced html file, but for whatever reason, the
rendering is not ideal. Here is link to 10.6's pfs.cc:
https://github.com/MariaDB/server/blob/10.6/storage/perfschema/pfs.cc
parent fab16653
SET default_storage_engine=InnoDB;
SELECT @save_instrument_enabled := ENABLED
, @save_instrument_timed := TIMED
FROM performance_schema.setup_instruments
WHERE NAME = 'wait/lock/table/sql/handler';
@save_instrument_enabled := ENABLED @save_instrument_timed := TIMED
YES YES
SELECT @save_consumer_enabled := ENABLED
FROM performance_schema.setup_consumers
WHERE NAME = 'events_waits_current';
@save_consumer_enabled := ENABLED
YES
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME = 'wait/lock/table/sql/handler';
UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME = 'events_waits_current';
CREATE TABLE t1 (id1 INT(11), col1 VARCHAR (200));
INSERT INTO t1 VALUES (1, 'aa');
INSERT INTO t1 VALUES (2, 'bb');
connect con1,localhost,root,,test;
connect con2,localhost,root,,test;
connection con1;
START TRANSACTION;
connection con2;
START TRANSACTION;
SELECT id1 FROM t1 WHERE id1=1 FOR UPDATE;
connection default;
SELECT event_name FROM performance_schema.events_waits_current
WHERE event_name LIKE '%wait/lock/table/sql/handler%';
event_name
UPDATE performance_schema.setup_instruments
SET ENABLED = @save_instrument_enabled, TIMED = @save_instrument_timed
WHERE NAME = 'wait/lock/table/sql/handler';
UPDATE performance_schema.setup_consumers
SET ENABLED = @save_consumer_enabled
WHERE NAME = 'events_waits_current';
disconnect con1;
disconnect con2;
DROP TABLE t1;
#
# proper event name wait/lock/table/sql/handler recorded in
# PERFORMANCE_SCHEMA.EVENTS_WAITS_CURRENT. Before this fix, it was
# labeled as wait/io/table/sql/handler.
#
--source include/have_innodb.inc
--source include/have_perfschema.inc
--source include/not_embedded.inc
SET default_storage_engine=InnoDB;
SELECT @save_instrument_enabled := ENABLED
, @save_instrument_timed := TIMED
FROM performance_schema.setup_instruments
WHERE NAME = 'wait/lock/table/sql/handler';
SELECT @save_consumer_enabled := ENABLED
FROM performance_schema.setup_consumers
WHERE NAME = 'events_waits_current';
UPDATE performance_schema.setup_instruments
SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME = 'wait/lock/table/sql/handler';
UPDATE performance_schema.setup_consumers
SET ENABLED = 'YES'
WHERE NAME = 'events_waits_current';
CREATE TABLE t1 (id1 INT(11), col1 VARCHAR (200));
INSERT INTO t1 VALUES (1, 'aa');
INSERT INTO t1 VALUES (2, 'bb');
connect (con1,localhost,root,,test);
connect (con2,localhost,root,,test);
connection con1;
START TRANSACTION;
let $wait_condition=
SELECT id1 FROM t1 WHERE id1=1 FOR UPDATE;
--source include/wait_condition.inc
connection con2;
START TRANSACTION;
send SELECT id1 FROM t1 WHERE id1=1 FOR UPDATE;
connection default;
SELECT event_name FROM performance_schema.events_waits_current
WHERE event_name LIKE '%wait/lock/table/sql/handler%';
# clean up
UPDATE performance_schema.setup_instruments
SET ENABLED = @save_instrument_enabled, TIMED = @save_instrument_timed
WHERE NAME = 'wait/lock/table/sql/handler';
UPDATE performance_schema.setup_consumers
SET ENABLED = @save_consumer_enabled
WHERE NAME = 'events_waits_current';
disconnect con1;
disconnect con2;
DROP TABLE t1;
......@@ -2610,7 +2610,8 @@ start_table_io_wait_v1(PSI_table_locker_state *state,
PFS_table_share *share= pfs_table->m_share;
wait->m_thread= pfs_thread;
wait->m_class= &global_table_io_class;
if (wait->m_class == NULL || wait->m_class->m_type != PFS_CLASS_TABLE_LOCK)
wait->m_class= &global_table_io_class;
wait->m_timer_start= timer_start;
wait->m_timer_end= 0;
wait->m_object_instance_addr= pfs_table->m_identity;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment