    MDEV-9501: rpl.rpl_binlog_index, rpl.rpl_gtid_crash, rpl.rpl_stm_multi_query... · a8f6bbb7
    Sujatha authored
    MDEV-9501: rpl.rpl_binlog_index, rpl.rpl_gtid_crash, rpl.rpl_stm_multi_query fail sporadically in buildbot with Master command COM_REGISTER_SLAVE failed
    
    Analysis:
    ========
    The slave server sends the COM_REGISTER_SLAVE command when it establishes
    a connection to the master. If the master is down, the command fails and a
    'COM_REGISTER_SLAVE failed' warning is reported.
    
    'rpl_binlog_index.test' shuts down the master, relocates the binary logs to a
    new location, and then attempts to start the master with 'log-bin' pointing
    to the new location. The slave threads remain active during this process.
    The IO thread actively checks for the presence of the master; when it finds
    that the connection is lost, it attempts to reconnect, and because the master
    is down the COM_REGISTER_SLAVE command fails.
    
    As part of the fix, stop the slave threads, then shut down the master and do
    the binlog relocation. Once the master is restarted, start the slave threads
    and sync them with the master. In the test, the binary logs and index files
    on the master are relocated to /tmpdir, but only the --log-bin option is
    provided during master restart. This is incorrect: --log-bin-index must also
    point to /tmpdir, otherwise the master restart leaves two index files, one
    master-bin.index in /tmpdir and a new master-bin.index created in the datadir
    as per log_basename. As a result, the slave fails to connect to the master.
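
    The ordering described above could look roughly like the following
    mysqltest fragment. This is a sketch, not the actual patch: the $tmpdir and
    $datadir variables and the binlog file name are illustrative, and it assumes
    the rpl_stop_server.inc / rpl_start_server.inc helpers are driven through
    $rpl_server_number and $rpl_server_parameters.

    ```
    # Sketch only: $tmpdir, $datadir and file names are hypothetical.
    # 1. Stop the slave threads so the IO thread cannot attempt a reconnect.
    --connection slave
    --source include/stop_slave.inc

    # 2. Shut down the master.
    --connection master
    --let $datadir= `SELECT @@datadir`
    --let $rpl_server_number= 1
    --source include/rpl_stop_server.inc

    # 3. Relocate the binary log and its index file.
    --move_file $datadir/master-bin.000001 $tmpdir/master-bin.000001
    --move_file $datadir/master-bin.index $tmpdir/master-bin.index

    # 4. Restart the master with BOTH options pointing at the new location.
    --let $rpl_server_parameters= --log-bin=$tmpdir/master-bin --log-bin-index=$tmpdir/master-bin.index
    --source include/rpl_start_server.inc

    # 5. Start the slave threads again and sync them with the master.
    --connection slave
    --source include/start_slave.inc
    --connection master
    --sync_slave_with_master
    ```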
    
    'rpl_gtid_crash.test' tests the following scenario: "crashing master, causing
    slave IO thread to reconnect while SQL thread is running". When the IO thread
    tries to connect to the crashed master on slow platforms, the
    COM_REGISTER_SLAVE command fails. This is expected, hence the warning should
    be added to the suppression list.
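
    A suppression of this kind is typically registered with
    mtr.add_suppression near the top of the test. The pattern below is an
    assumption based on the warning text quoted in the subject line, not a copy
    of the actual change:

    ```
    # Sketch only: the exact pattern string is an assumption.
    --connection slave
    call mtr.add_suppression("Master command COM_REGISTER_SLAVE failed");
    ```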
rpl_gtid_crash.test 18.6 KB