Andrei authored
XA-Prepare group of events

  XA START xid
  ...
  XA END xid
  XA PREPARE xid

and its XA-"complete" terminator

  XA COMMIT or
  XA ROLLBACK

are now distributed Round-Robin across slave parallel workers.
The former hash-based policy was proven to contribute to execution
latency by creating a big queue of binlog-ordered transactions to
commit, many times larger than the size of the worker pool.

Acronyms and notations used below:

  XAP := XA-Prepare event or the whole prepared XA group of events
  XAC := XA-"complete", which is a solitary group of events
  |W| := the size of the slave worker pool
  Subscripts like `_k' denote order in a corresponding sequence
  (e.g. binlog file).

KEY CHANGES:

The parallel slave
------------------
driver thread now maintains a list of XAP:s currently in processing.
Its purpose is to avoid "wild" parallel execution of XA:s with
duplicate xids (unlikely, but that's the user's right).
The list is arranged as a sliding window of size 2*|W| to account
for the possibility of XAP_k -> XAP_k+2|W|-1 being the largest
(in the group-of-events count sense) dependency.
Say k=1, and |W|, the number of workers, is 4. As transactions are
distributed Round-Robin, it's possible to have T^*_1 -> T^*_8 as the
largest dependency ('*' marks the dependents) at runtime.
It can be seen from the worker queues in the picture below.
Let the Q_i worker queues develop downward:

  Q1   ...   Q4

  1^*  2  3  4
  5    6  7  8^*

Worker #1 is assigned T_1 and T_5.
Worker #4 can take on its T_8 when T_1 is still at the beginning
of its processing, so even before XA START of that XAP.

XA related
----------
XID_cache_element is extended with two pointers to resolve
two types of dependencies: the duplicate xid XAP_k -> XAP_k+i
and the ordinary completion on the prepare XAP_k -> XAC_k+j.
The former is handled by a wait-for-xid protocol conducted by
xid_cache_delete() and xid_cache_insert_maybe_wait().
The latter is done analogously by xid_cache_search_maybe_wait() and
slave_applier_reset_xa_trans().
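
The duplicate-xid wait can be pictured with a toy model. The sketch
below is illustrative Python, not the server's C++ code: the XidCache
class, its method names and the event strings are hypothetical, and it
only assumes that a later XAP with the same xid blocks until the
earlier owner's entry is deleted.

```python
import threading


class XidCache:
    """Toy stand-in for the xid cache (hypothetical, not MariaDB code)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = set()  # xids currently owned by some transaction

    def insert_maybe_wait(self, xid):
        # Block while an earlier transaction still owns the same xid.
        with self._cond:
            while xid in self._active:
                self._cond.wait()
            self._active.add(xid)

    def delete(self, xid):
        # Release the xid and wake every waiter so it can re-check.
        with self._cond:
            self._active.discard(xid)
            self._cond.notify_all()


def demo():
    cache = XidCache()
    events = []

    cache.insert_maybe_wait("x1")      # XAP_k takes the xid
    events.append("XAP_k inserted")

    def later_worker():
        cache.insert_maybe_wait("x1")  # XAP_k+i with the duplicate xid
        events.append("XAP_k+i inserted")

    t = threading.Thread(target=later_worker)
    t.start()
    events.append("XAP_k deleting")    # logged before the release below
    cache.delete("x1")
    t.join()
    return events


print(demo())
```

Whether or not the second worker reaches the wait before the release,
it can only record its insert after the first owner has deleted the
xid, so the order of the three events is always the same.
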
XA-"complete" is allowed to go forward before its XAP parent has
released the xid (all recovery concerns are covered in MDEV-21496,
MDEV-21777). Yet XAC is going to wait for it at a critical point of
execution, which is when it is about to "complete" the work in the
Engine.

CAVEAT: storage/innobase/trx/trx0undo.cc changes are due to
        the possibly fixed MDEV-32144.
TODO: to be verified.

Thanks to Brandon Nesterenko at mariadb.com for the initial review and
a lot of creative efforts to advance this work!
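
As an illustration of the sliding-window bound from the parallel slave
part, the Round-Robin picture for |W| = 4 can be reproduced with a few
lines of Python. This is a sketch only: the function names are
hypothetical, and largest_dependency() simply restates the bound
XAP_k -> XAP_k+2|W|-1 given in this message rather than deriving it.

```python
def round_robin_queues(n_groups, n_workers):
    """Assign groups 1..n_groups to worker queues in Round-Robin order."""
    queues = [[] for _ in range(n_workers)]
    for k in range(1, n_groups + 1):
        queues[(k - 1) % n_workers].append(k)
    return queues


def largest_dependency(k, n_workers):
    """Index of the farthest dependent of group k, per the stated bound."""
    return k + 2 * n_workers - 1


queues = round_robin_queues(8, 4)
# Print the queues side by side, developing downward as in the picture.
for row in zip(*queues):
    print(row)
print(largest_dependency(1, 4))  # 8, i.e. T_1 -> T_8
```

Worker #1 ends up with T_1 and T_5 and worker #4 with T_4 and T_8,
which matches the queues drawn above.
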
96bd9e6b