Commit 79e3ee00 authored by Brandon Nesterenko

MDEV-4989: Support for GTID in mysqlbinlog

New Feature:
===========
This commit extends the mariadb-binlog capabilities to allow events
to be filtered by GTID ranges. More specifically, the
--start-position and --stop-position arguments have been extended to
accept values formatted as a list of GTID positions, e.g.
--start-position=0-1-0,1-2-55. The following specific capabilities
are addressed:
   1) GTIDs can be used to filter results on local binlog files
   2) GTIDs can be used to filter results from remote servers
   3) Implemented --gtid-strict-mode that ensures the GTID event
      stream in each domain is monotonically increasing
   4) Added new level of verbosity in mysqlbinlog -vvv to print
      additional diagnostic information/warnings about invalid GTID
      states
   5) For a given GTID range, its start and stop position parameters
      aim to mimic the behaviors of
      CHANGE MASTER TO MASTER_USE_GTID=slave_pos and
      START SLAVE UNTIL master_gtid_pos=<GTID>, respectively. In
      particular, the start-position list expresses a gtid state of
      the server, similarly to how @@global.gtid_slave_pos expresses
      the gtid state of a slave server when connecting to a master
      with MASTER_USE_GTID=slave_pos.
      The GTID start-position list is exclusive and the
      stop-position list is inclusive. This allows users to receive
      events strictly after those that they already have, and is
      useful for point-in-(logical)-time recovery, for example when
      1) events were received out of order and should be re-sent, or
      2) the gtid state of a slave is specified to get events newer
      than its current state. If a seq_no is 0 for start-position,
      the entirety of the domain is included. If a seq_no is
      0 for stop-position, all events from that domain are
      excluded. The GTIDs provided in a start position argument must
      match the GTID state of the first processed log (i.e.
      those listed in the Gtid_list event). If a stop position is
      provided, the events that are output are limited to only those
      with domain ids listed in the argument. When specifying
      combinations of start and stop positions, the following
      behaviors are expected:

[--start-position without --stop-position]: Events that have domain
ids in the start position are output if their seq_no occurs after
the respective start position. Events with domain ids that are
unspecified in the start position list are also output. Note that if
the Gtid_list event of the first binary log is populated (i.e.
non-empty), each domain in the Gtid_list must be present in the
start-position list with a seq_no at or after the listed value.
This behavior mimics how a slave only processes events after the
state provided by @@global.gtid_slave_pos when connecting to a
master with CHANGE MASTER TO MASTER_USE_GTID=slave_pos.
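
The Gtid_list requirement above can be sketched as follows. This is only an illustration of the rule as worded in this message, not the code from the commit: the GtidState alias and the function name start_position_matches_gtid_list are hypothetical, and server ids are ignored for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical type for illustration: maps domain id -> seq_no.
using GtidState = std::map<uint32_t, uint64_t>;

// Returns true if `start` (the --start-position list) is acceptable for a
// first binary log whose Gtid_list event holds `gtid_list`: every listed
// domain must appear in `start` with a seq_no at or after the listed value.
bool start_position_matches_gtid_list(const GtidState &start,
                                      const GtidState &gtid_list)
{
  for (const auto &entry : gtid_list)
  {
    auto it = start.find(entry.first);
    if (it == start.end())
      return false;            // domain missing from --start-position
    if (it->second < entry.second)
      return false;            // start seq_no lies before the log's state
  }
  return true;                 // an empty Gtid_list accepts any start list
}
```

Under these assumptions, a start list of domains {0: 5, 1: 2} is accepted for a log whose Gtid_list holds {0: 5}, while {1: 2} alone or {0: 4} is rejected.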

[--stop-position without --start-position]: Output is limited to
only events with both 1) domain ids that are present in the given
stop position list and 2) seq_nos that are less than or equal to
their respective stop GTID. Once all GTIDs in the stop position
list have been processed, the program will stop processing log
files. This behavior mimics how
START SLAVE UNTIL master_gtid_pos=<G>
has a slave only process events with domain ids present in G with
their seq_nos at or before the respective gtid.

[--start-position and --stop-position]: Output consists of the
intersection between the events permitted by both the start and stop
position rules. More concretely, the output can be defined by a
union of the following rules:

  1. For domains which exist in both the start and stop position
     lists, the events which exist in-between these positions
     (exclusive start, inclusive stop) are output
  2. For events in all other domains, the rules of
     [--stop-position without --start-position] are followed

This is due to the implicit filtering within each individual rule.
Even though the start position rule always includes events from
unspecified domains, the stop position rule takes precedence because
it always excludes events from unspecified domains. In other words,
events which the start position rule would have included would then
always be excluded by the stop position rule.
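
The combined start/stop rules can be sketched as a single predicate. This is a minimal illustration under stated assumptions, not the implementation in this commit: GtidWindow and should_output are hypothetical names, positions are keyed by domain id only, and the first-log Gtid_list validation is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical types for illustration: domain id -> seq_no.
using GtidPositions = std::map<uint32_t, uint64_t>;

struct GtidWindow
{
  bool has_start = false;   // --start-position given as a GTID list
  bool has_stop = false;    // --stop-position given as a GTID list
  GtidPositions start;
  GtidPositions stop;
};

// Decides whether an event group with the given domain and seq_no is output.
bool should_output(const GtidWindow &w, uint32_t domain, uint64_t seq_no)
{
  if (w.has_start)
  {
    auto it = w.start.find(domain);
    // Start is exclusive; domains missing from the start list pass through.
    if (it != w.start.end() && seq_no <= it->second)
      return false;
  }
  if (w.has_stop)
  {
    auto it = w.stop.find(domain);
    // Stop is inclusive; domains missing from the stop list are excluded.
    if (it == w.stop.end() || seq_no > it->second)
      return false;
  }
  return true;
}
```

With this shape, a seq_no of 0 needs no special case: a start seq_no of 0 admits every event of the domain, and a stop seq_no of 0 rejects every event of the domain. Equal start and stop GTIDs in a domain yield no output for that domain.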

[neither --start-position nor --stop-position]: Events are not
omitted based on GTID positioning; however, --gtid-strict-mode and
-vvv can still analyze gtid correctness for warning and error
reporting.
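
The per-domain ordering check that --gtid-strict-mode and -vvv rely on can be sketched as below. The class GtidOrderChecker is hypothetical, and keeping the highest seq_no seen after an out-of-order event is an assumption of this sketch, not something the commit message specifies.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical helper: tracks the highest seq_no seen in each domain.
class GtidOrderChecker
{
public:
  // Returns true if seq_no is strictly greater than every seq_no already
  // seen in `domain` (gaps are allowed). Returns false for a repeated or
  // decreasing seq_no; the recorded maximum is left unchanged in that case.
  bool observe(uint32_t domain, uint64_t seq_no)
  {
    auto it = last_seen_.find(domain);
    if (it == last_seen_.end())
    {
      last_seen_[domain] = seq_no;  // first GTID of this domain
      return true;
    }
    if (seq_no <= it->second)
      return false;                 // out of order or repeated
    it->second = seq_no;
    return true;
  }

private:
  std::map<uint32_t, uint64_t> last_seen_;
};
```

A caller in strict mode would treat a false return as an error and quit, while a caller in -vvv mode would record a warning and continue.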

[repeated specification of --start-position or --stop-position]:
Subsequent specifications of start and stop positions completely
override previous ones. E.g., if invoked as
mysqlbinlog --start-position=<G1> --start-position=<G2> ...
then all GTIDs specified in G1 are ignored and only those specified
in G2 are used for the start position.
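
For reference, the GTID list form accepted by --start-position and --stop-position (for example 0-1-0,1-2-55) can be parsed as sketched here. The Gtid struct and parse_gtid_list are hypothetical names, plain integer positions are not handled, and numeric overflow is not checked.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical GTID representation: domain_id-server_id-seq_no.
struct Gtid
{
  uint32_t domain_id;
  uint32_t server_id;
  uint64_t seq_no;
};

// Reads one or more decimal digits starting at pos (no sign allowed).
static bool read_number(const std::string &s, std::size_t &pos,
                        uint64_t &value)
{
  std::size_t begin = pos;
  value = 0;
  while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
    value = value * 10 + static_cast<uint64_t>(s[pos++] - '0');
  return pos > begin;
}

// Parses "d-s-n[,d-s-n...]" into `out`, keyed by domain id. Returns false
// for malformed input or a repeated domain id. `out` is cleared first, so
// parsing a later option value replaces an earlier one.
bool parse_gtid_list(const std::string &text, std::map<uint32_t, Gtid> &out)
{
  out.clear();
  std::size_t pos = 0;
  while (true)
  {
    uint64_t domain, server, seq;
    if (!read_number(text, pos, domain) ||
        pos >= text.size() || text[pos++] != '-' ||
        !read_number(text, pos, server) ||
        pos >= text.size() || text[pos++] != '-' ||
        !read_number(text, pos, seq))
      return false;
    Gtid g{static_cast<uint32_t>(domain), static_cast<uint32_t>(server), seq};
    if (!out.emplace(g.domain_id, g).second)
      return false;                       // repeated domain id
    if (pos == text.size())
      return true;
    if (text[pos++] != ',')
      return false;
  }
}
```

Because the output map is cleared on every call, parsing G1 and then G2 into the same map leaves only G2, which matches the override behavior described above.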

A few additional notes:
 1) This commit squashes together the commits
    f4319661120e-78a9d49907ba

 2) Changed the rpl.rpl_blackhole_row_annotate test to pass
    --skip-gtid-strict-mode because it has out of order GTIDs in
    its binlog

 3) After all binlog events have been written, the session server
    id and domain id are reset to their values in the global state
Reviewed By:
===========
Andrei Elkin: <andrei.elkin@mariadb.com>
parent 343134fc
......@@ -992,8 +992,15 @@ This option is useful for point\-in\-time recovery\&.
\fB\-\-start\-position=\fR\fB\fIN\fR\fR,
\fB\-j \fR\fB\fIN\fR\fR
.sp
Start reading the binary log at the first event having a position equal to or greater than
\fIN\fR\&. This option applies to the first log file named on the command line\&.
Start reading the binary log at \fIN\fR\&. Type can either be a positive
integer or a GTID\& list\&. When using a positive integer, the value only
applies to the first binlog passed on the command line, and the first event
that has a position equal to or greater than \fIN\fR is printed\&. In GTID mode,
multiple GTIDs can be passed as a comma separated list, where each must have a
unique domain id\&. The list represents the gtid binlog state that the client
(another "replica" server) is aware of\&. Therefore, each GTID is exclusive; only
events after a given sequence number will be printed to allow users to receive
events after their current state\&.
.sp
This option is useful for point\-in\-time recovery\&.
.RE
......@@ -1006,6 +1013,23 @@ This option is useful for point\-in\-time recovery\&.
.sp -1
.IP \(bu 2.3
.\}
.\" mysqlbinlog: gtid-strict-mode
.\" gtid-strict-mode option: mysqlbinlog
\fB\-\-gtid\-strict\-mode
.sp
Process the binlog according to the gtid-strict-mode specification\&. The start
and stop positions are verified to satisfy the condition start < stop\&. The
sequence numbers within each gtid domain must be monotonically increasing\&.
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
.\" mysqlbinlog: stop-datetime option
.\" stop-datetime option: mysqlbinlog
\fB\-\-stop\-datetime=\fR\fB\fIdatetime\fR\fR
......@@ -1063,8 +1087,13 @@ The slave server_id used for \fB--read-from-remote-server --stop-never\fR\&.
.\" stop-position option: mysqlbinlog
\fB\-\-stop\-position=\fR\fB\fIN\fR\fR
.sp
Stop reading the binary log at the first event having a position equal to or greater than
\fIN\fR\&. This option applies to the last log file named on the command line\&.
Stop reading the binary log at the first event having a position equal to or
greater than \fIN\fR\&. Type can either be a positive integer or a GTID
list\&. When using a positive integer, the value only applies to the last log
file named on the command line\&. When in GTID mode, multiple GTIDs can be
passed as a comma separated list, where each must have a unique domain id\&.
Each GTID is inclusive; only events up to the given sequence numbers are
printed.
.sp
This option is useful for point\-in\-time recovery\&.
.RE
......@@ -1133,6 +1162,7 @@ The MariaDB username to use when connecting to a remote server\&.
\fB\-v\fR
.sp
Reconstruct row events and display them as commented SQL statements\&. If this option is given twice, the output includes comments to indicate column data types and some metadata\&.
If this option is given three times, the output includes diagnostic warnings about event integrity before program exit\&.
.sp
For examples that show the effect of
\fB\-\-base64\-output\fR
......
#
# Purpose:
#
# This test ensures that the mariadb-binlog CLI tool properly displays errors
# and warnings for out of order GTIDs.
#
#
# Methodology:
#
# We simulate invalid sequence numberings by manually changing gtid_seq_no to
# differ from its expected linear sequence. Specifically, we test the following
# cases:
# Test Case 1) Sequential sequence numbers results in no warnings
# Test Case 2) A skipped sequence number results in no warnings if all numbers
# are monotonic (i.e. gaps in sequence number are allowed
# provided they never decrease)
# Test Case 3) A sequence number lower than the last processed value results
# in a warning
# Test Case 4) Skipping a GTID and later receiving it results in a warning
# Test Case 5) Repeat sequence numbers result in a warning
# Test Case 6) Warnings from different domains are all displayed
# Test Case 7) A decreasing seq_no before a start-position is ignored
# Test Case 8) A decreasing seq_no inside of a --start/--stop position window
# is displayed
# Test Case 9) Error if --stop-position is not greater than or equal to
# --start-position
# Test Case 10) Strict mode warnings should be independent of --offset option
# specification
# Test Case 11) Strict mode warnings should be independent of
# --start-timestamp option specification
# Test Case 12) Specifying multiple binary logs with a log-position start
# should skip GTID state verification
# Test Case 13) If multiple binary logs should be specified but a middle log
# is missing, we should detect that and warn when using -vvv
# Test Case 14) If a --stop-position GTID occurs before the first specified
# binlog's GLLE, error
#
# Note that test cases are tested under three scenarios:
# 1) --gtid-strict-mode should error and immediately quit with error on out of
# order GTIDs
# 2) --skip-gtid-strict-mode -vvv should not quit early or with error when
# encountering out of order GTIDs; however should produce warnings after
# binlog processing
# 3) --skip-gtid-strict-mode should neither produce errors nor warnings when
# encountering out of order GTIDs
#
# References:
# MDEV-4989: Support for GTID in mysqlbinlog
#
--source include/have_log_bin.inc
--echo ###############################
--echo # Test Setup
--echo ###############################
## Save old state
#
let orig_gtid_domain_id = `select @@session.gtid_domain_id`;
let orig_server_id = `select @@session.server_id`;
RESET MASTER;
--echo ####################################################
--echo # Test Case Group 1
--echo #
--echo # Tests with --gtid-strict-mode should error and
--echo # immediately quit with error on out of order GTIDs
--echo ####################################################
--let is_strict_mode= 1
--let is_verbose= 0
--let DEFAULT_ERROR_PREFIX=ERROR
--source include/mysqlbinlog_gtid_strict_mode.inc
--echo ####################################################
--echo # Test Case Group 2
--echo #
--echo # Test cases with --skip-gtid-strict-mode -vvv
--echo # should not quit early or with error when
--echo # encountering out of order GTIDs; however should
--echo # produce warnings after binlog processing
--echo ####################################################
--let is_strict_mode= 0
--let is_verbose= 1
--let DEFAULT_ERROR_PREFIX=WARNING
--source include/mysqlbinlog_gtid_strict_mode.inc
--echo ####################################################
--echo # Test Case Group 3
--echo #
--echo # Run test cases with --skip-gtid-strict-mode should
--echo # neither produce errors nor warnings when
--echo # encountering out of order GTIDs
--echo ####################################################
--let is_strict_mode= 0
--let is_verbose= 0
--let DEFAULT_ERROR_PREFIX=(ERROR|WARNING)
--source include/mysqlbinlog_gtid_strict_mode.inc
--echo ##############################
--echo # Cleanup
--echo ##############################
--eval SET @@global.gtid_domain_id= $orig_gtid_domain_id
--eval SET @@global.server_id= $orig_server_id
--echo End of the tests
#
# Purpose:
#
# This test ensures that the mariadb-binlog CLI tool can filter log events
# using GTID ranges. More specifically, this test ensures the following
# capabilities:
# 1) GTIDs can be used to filter results on local binlog files
# 2) GTIDs can be used to filter results from remote servers
# 3) For a given GTID range, its start-position is exclusive and its
# stop-position is inclusive. This allows users to receive events strictly
# after what they already have.
# 4) After the events have been written, the session server id and domain id
# are reset to their former values
#
#
# Methodology:
#
# This test validates the expected capabilities using the following test cases
# on both a local binlog file and remote server for all binlog formats.
# Test Case 1) The end of the binlog file resets the server and domain id of
# the session
# Test Case 2) Single GTID range specified
# Test Case 3) Single GTID range with different server_ids
# Test Case 4) Multiple GTID ranges specified
# Test Case 5) Multiple GTID ranges specified where the domain ids are
# listed in different orders between start/stop position
# Test Case 6) Only start position specified
# Test Case 7) Only stop position specified
# Test Case 8) Seq_no=0 in --start-position includes all events for a domain
# Test Case 9) Seq_no=0 in --stop-position excludes all events for a domain
# Test Case 10) Output stops for all domain ids when all --stop-position GTID
# values have been hit.
# Test Case 11) All GTID events from other domains are printed until
# the --stop-position values are hit
# Test Case 12) Scalar and GTID values can be used together for stop or start
# position
# Test Case 13) If the start position is delayed within the binlog, events
# occurring before that position are ignored
# Test Case 14) If start position is repeated, the last specification
# overrides all previous ones
# Test Case 15) If stop position is repeated, the last specification
# overrides all previous ones
# Test Case 16) Start position with --offset=<n> skips n events after the
# first GTID is found
# Test Case 17) Start position with --start-datetime=<T> where T occurs
# after the specified GTID results in no events before T
# Test Case 18) If --stop-position is specified, domains which are not present
# in its list should be excluded from output
# Test Case 19) If the start and stop GTIDs in any domain are equal, the
# domain should not have any output
# Test Case 20) If --start-position and --stop-position have different domain
# ids, only events from GTIDs in the --stop-position list are
# output
# Test Case 21) Successive binary logs (e.g. logs with previous logs that
# have been purged) will write events when the --start-position
# matches their Gtid_list_log_event state
# Test Case 22) Successive binary logs can be called with --stop-position and
# without --start-position
#
# To validate for data consistency, each test case compares a checksum of
# correct data against a variant created after replaying the binlog using
# --(start|stop)-position. If the checksums are identical, the test passes.
# If the checksums differ, data has been changed and the test fails.
#
# Additionally, this test validates the following error scenarios:
# Error Case 1) A GTID --start-position that does not mention all domains
# that make up the binary log state should error
# Error Case 2) A GTID --start-position with any sequence numbers which
# occur before the binary log state should result in error
# Error Case 3) A GTID --start-position with any sequence numbers that are not
# eventually processed results in error
# Error Case 4) User provides invalid positions
# Error Case 5) User provides GTID ranges with repeated domain ids
#
# References:
# MDEV-4989: Support for GTID in mysqlbinlog
#
--source include/have_log_bin.inc
--echo ###############################
--echo # Test Setup
--echo ###############################
## Save old state
#
let orig_gtid_domain_id = `select @@session.gtid_domain_id`;
let orig_server_id = `select @@session.server_id`;
RESET MASTER;
--echo ######################################
--echo # Test Group 1
--echo # Run test cases on local log file
--echo ######################################
--let is_remote= 0
--source include/mysqlbinlog_gtid_window_test_cases.inc
--echo ######################################
--echo # Test Group 2
--echo # Run test cases on remote host
--echo ######################################
--let is_remote= 1
--source include/mysqlbinlog_gtid_window_test_cases.inc
# Note that error cases 1-3 are in mysqlbinlog_gtid_window_test_cases.inc
# because we validate for error consistency of GTID state between
# mariadb-binlog working on local files and receiving errors from a server
--let err_out_= $MYSQLTEST_VARDIR/tmp/err.out
--let tmp_out_= $MYSQLTEST_VARDIR/tmp/std.out
--let $MYSQLD_DATADIR=`select @@datadir`
--echo #
--echo # Error Case 4:
--echo # User provides invalid positions
--echo # MYSQL_BINLOG MYSQLD_DATADIR/master-bin.000001 --start-position=z
--error 1
--exec $MYSQL_BINLOG $MYSQLD_DATADIR/master-bin.000001 --start-position=z
--echo # MYSQL_BINLOG MYSQLD_DATADIR/master-bin.000001 --start-position=1-
--error 1
--exec $MYSQL_BINLOG $MYSQLD_DATADIR/master-bin.000001 --start-position=1-
--echo # MYSQL_BINLOG MYSQLD_DATADIR/master-bin.000001 --start-position=1-2
--error 1
--exec $MYSQL_BINLOG $MYSQLD_DATADIR/master-bin.000001 --start-position=1-2
--echo # MYSQL_BINLOG MYSQLD_DATADIR/master-bin.000001 --start-position=1-2-
--error 1
--exec $MYSQL_BINLOG $MYSQLD_DATADIR/master-bin.000001 --start-position=1-2-
--echo # MYSQL_BINLOG MYSQLD_DATADIR/master-bin.000001 --start-position=-1
--error 1
--exec $MYSQL_BINLOG $MYSQLD_DATADIR/master-bin.000001 --start-position=-1
--echo #
--echo # Error Case 5:
--echo # User provides GTID ranges with repeated domain ids
--echo # MYSQL_BINLOG MYSQLD_DATADIR/master-bin.000001 --start-position=0-1-1,0-1-8 --stop-position=0-1-4,0-1-12
--error 1
--exec $MYSQL_BINLOG $MYSQLD_DATADIR/master-bin.000001 --start-position=0-1-1,0-1-8 --stop-position=0-1-4,0-1-12
--echo ##############################
--echo # Cleanup
--echo ##############################
--eval SET @@global.gtid_domain_id= $orig_gtid_domain_id
--eval SET @@global.server_id= $orig_server_id
--echo # End of the tests
......@@ -44,6 +44,6 @@ FLUSH LOGS;
let $MYSQLD_DATADIR= `select @@datadir`;
--replace_regex /server id [0-9]*/server id #/ /server v [^ ]*/server v #.##.##/ /exec_time=[0-9]*/exec_time=#/ /thread_id=[0-9]*/thread_id=#/ /table id [0-9]*/table id #/ /mapped to number [0-9]*/mapped to number #/ /end_log_pos [0-9]*/end_log_pos #/ /# at [0-9]*/# at #/ /CRC32 0x[0-9a-f]*/CRC32 XXX/ /xid=\d*/xid=<xid>/
--exec $MYSQL_BINLOG --base64-output=decode-rows $MYSQLD_DATADIR/slave-bin.000001
--exec $MYSQL_BINLOG --base64-output=decode-rows --skip-gtid-strict-mode $MYSQLD_DATADIR/slave-bin.000001
source include/rpl_end.inc;
......@@ -920,6 +920,20 @@ typedef struct st_print_event_info
  IO_CACHE review_sql_cache;
#endif
  FILE *file;
  /*
    Used to include the events within a GTID start/stop boundary
  */
  my_bool m_is_event_group_active;
  /*
    Tracks whether or not output events must be explicitly activated in order
    to be printed
  */
  my_bool m_is_event_group_filtering_enabled;
  st_print_event_info();
  ~st_print_event_info() {
......@@ -942,6 +956,40 @@ typedef struct st_print_event_info
    copy_event_cache_to_file_and_reinit(&body_cache, file);
    fflush(file);
  }
  /*
    Notify that all events part of the current group should be printed
  */
  void activate_current_event_group()
  {
    m_is_event_group_active= TRUE;
  }
  void deactivate_current_event_group()
  {
    m_is_event_group_active= FALSE;
  }
  /*
    Used for displaying events part of an event group.
    Returns TRUE when both event group filtering is enabled and the current
    event group should be displayed, OR if event group filtering is
    disabled. More specifically, if filtering is disabled, all events
    should be shown.
    Returns FALSE when event group filtering is enabled and the current event
    group is filtered out.
  */
  my_bool is_event_group_active()
  {
    return m_is_event_group_filtering_enabled ? m_is_event_group_active : TRUE;
  }
  /*
    Notify that events must be explicitly activated in order to be printed
  */
  void enable_event_group_filtering()
  {
    m_is_event_group_filtering_enabled= TRUE;
  }
} PRINT_EVENT_INFO;
#endif
......
......@@ -3791,6 +3791,8 @@ st_print_event_info::st_print_event_info()
  printed_fd_event=FALSE;
  file= 0;
  base64_output_mode=BASE64_OUTPUT_UNSPEC;
  m_is_event_group_active= TRUE;
  m_is_event_group_filtering_enabled= FALSE;
  open_cached_file(&head_cache, NULL, NULL, 0, flags);
  open_cached_file(&body_cache, NULL, NULL, 0, flags);
  open_cached_file(&tail_cache, NULL, NULL, 0, flags);
......
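
How the two flags added to st_print_event_info interact can be illustrated with a standalone sketch. PrintInfoSketch mirrors only the new members shown in the hunks above; EventSketch and print_filtered are hypothetical stand-ins for the calling code, which is in the collapsed part of this diff and is not reproduced here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Mirrors only the flag logic added to st_print_event_info.
struct PrintInfoSketch
{
  bool m_is_event_group_active = true;
  bool m_is_event_group_filtering_enabled = false;

  void activate_current_event_group() { m_is_event_group_active = true; }
  void deactivate_current_event_group() { m_is_event_group_active = false; }
  void enable_event_group_filtering()
  {
    m_is_event_group_filtering_enabled = true;
  }
  bool is_event_group_active() const
  {
    return m_is_event_group_filtering_enabled ? m_is_event_group_active : true;
  }
};

// Hypothetical event record: a GTID event starts a group, and the caller has
// already decided whether that group falls inside the start/stop window.
struct EventSketch
{
  bool starts_group;
  bool group_in_window;
  std::string text;
};

// Hypothetical driver: toggles the group state at each group start and
// collects the text of every event that would be printed.
std::vector<std::string> print_filtered(PrintInfoSketch &info,
                                        const std::vector<EventSketch> &events)
{
  std::vector<std::string> printed;
  for (const auto &ev : events)
  {
    if (ev.starts_group)
    {
      if (ev.group_in_window)
        info.activate_current_event_group();
      else
        info.deactivate_current_event_group();
    }
    if (info.is_event_group_active())
      printed.push_back(ev.text);
  }
  return printed;
}
```

When enable_event_group_filtering() has not been called, the activate and deactivate calls have no visible effect and every event is printed, which matches the constructor defaults in the hunk above.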