    MDEV-21117: refine the server binlog-based recovery for semisync · 6c39eaeb
    Sujatha authored
    Problem:
    =======
    When a semisync master crashes and is restarted as a slave, it could
    recover transactions that the former slaves may never have seen.
    The known method of clearing out all prepared transactions,
    --tc-heuristic-recover=rollback, does not adjust the binlog
    accordingly.
    
    Fix:
    ===
    The binlog-based recovery now takes into account the semisync slave role
    of the server restarted after a crash.
    No change in behavior is made for the "normal" binlogging server
    and the semisync master.
    
    When the restarted server is configured with
      --rpl-semi-sync-slave-enabled=1
    the refined recovery attempts to roll back prepared transactions
    and truncate the binlog accordingly.
    A partially committed transaction (that is, one committed in at least
    one of the engine participants) gets committed.
    It is guaranteed that no committed (including partially committed)
    transactions exist beyond the truncate position.
    If a non-transactional replication event
    (which is, in a way, a committed transaction) exists past the
    computed truncate position, the recovery ends with an error.
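
The rules above can be sketched as a small planning function. This is only an illustration under stated assumptions, not the server's code: `GroupState`, `BinlogGroup`, `RecoveryPlan` and `plan_semisync_slave_recovery` are hypothetical names, each binlog event group is reduced to a start offset plus one state, and in-doubt groups located before the last committed one are assumed to be committed rather than rolled back.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical, simplified view of one binlog event group during recovery.
enum class GroupState {
  Prepared,            // prepared in every engine, committed in none
  PartiallyCommitted,  // committed in at least one engine participant
  Committed,           // committed in all engine participants
  NonTransactional     // non-transactional replication event
};

struct BinlogGroup {
  std::uint64_t start_pos;  // binlog offset where the group begins
  GroupState state;
};

struct RecoveryPlan {
  std::optional<std::uint64_t> truncate_pos;  // empty: nothing to truncate
  std::vector<std::uint64_t> commit;          // groups to complete
  std::vector<std::uint64_t> rollback;        // groups to roll back
};

RecoveryPlan plan_semisync_slave_recovery(
    const std::vector<BinlogGroup>& groups) {
  // Everything up to the last (partially) committed transaction must stay,
  // so the truncate position can only lie in the tail after it.
  std::size_t tail_begin = 0;
  for (std::size_t i = 0; i < groups.size(); ++i) {
    if (groups[i].state == GroupState::Committed ||
        groups[i].state == GroupState::PartiallyCommitted)
      tail_begin = i + 1;
  }

  RecoveryPlan plan;
  for (std::size_t i = 0; i < tail_begin; ++i) {
    // Assumption of this sketch: in-doubt groups before the tail are
    // completed, since the binlog cannot be truncated underneath them.
    if (groups[i].state == GroupState::Prepared ||
        groups[i].state == GroupState::PartiallyCommitted)
      plan.commit.push_back(groups[i].start_pos);
  }
  for (std::size_t i = tail_begin; i < groups.size(); ++i) {
    if (groups[i].state == GroupState::Prepared) {
      // The first prepared group in the tail fixes the truncate position.
      if (!plan.truncate_pos) plan.truncate_pos = groups[i].start_pos;
      plan.rollback.push_back(groups[i].start_pos);
    } else if (plan.truncate_pos) {
      // Only non-transactional groups can reach this branch in the tail.
      throw std::runtime_error(
          "non-transactional event past the truncate position");
    }
  }
  return plan;
}
```

With groups at offsets 100 (committed), 200 (partially committed), 300 and 400 (both prepared), the sketch commits the group at 200, rolls back 300 and 400, and picks 300 as the truncate position.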
    
    After a master crash and failover to a slave, the demoted-to-slave
    ex-master must be ready to receive and accept events that it generated
    itself, without the generally required --replicate-same-server-id.
    So the acceptance conditions are relaxed for the semisync slave
    to accept its own events without that option.
    While gtid_strict_mode ON ensures that no duplicate transaction can be
    (re-)executed, a master_use_gtid=none slave still has to be
    configured with --replicate-same-server-id.
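
The relaxed acceptance condition could be modelled roughly as follows. This is a sketch under assumptions, not the check in slave.cc: `SlaveConfig` and `accepts_event` are hypothetical names, and the GTID requirement is reduced to a single boolean that is false for master_use_gtid=none.

```cpp
#include <cstdint>

// Hypothetical subset of the slave configuration relevant to the decision.
struct SlaveConfig {
  std::uint32_t server_id;
  bool replicate_same_server_id;  // --replicate-same-server-id
  bool semi_sync_slave_enabled;   // --rpl-semi-sync-slave-enabled
  bool uses_gtid;                 // false models master_use_gtid=none
};

// Returns true when an event with the given originating server id may be
// applied by the slave.
bool accepts_event(const SlaveConfig& cfg, std::uint32_t event_server_id) {
  // Events generated by another server are always acceptable.
  if (event_server_id != cfg.server_id) return true;
  // The explicit option keeps its usual meaning.
  if (cfg.replicate_same_server_id) return true;
  // Relaxed case: a semisync slave using GTID may take back its own events.
  return cfg.semi_sync_slave_enabled && cfg.uses_gtid;
}
```

In this model a semisync slave with master_use_gtid=none still rejects its own events unless --replicate-same-server-id is set, which matches the requirement stated above.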
    
    *NOTE* for reviewers.
    
    This patch does not handle user XA transactions; that is done
    in the next git commit.