    MDEV-25502: rpl.rpl_perfschema_applier_status_by_worker failed in bb with: Test assertion failed · fe945067
    Sujatha authored
    Problem:
    =======
    The test fails with 3 different symptoms:
    connection slave;
    Assertion text: 'Last_Seen_Transaction should show .'
    Assertion condition: '"0-1-1" = ""'
    Assertion condition, interpolated: '"0-1-1" = ""'
    Assertion result: '0'
    
    connection slave;
    Assertion text: 'Value returned by SSS and PS table for Last_Error_Number
                     should be same.'
    Assertion condition: '"1146" = "0"'
    Assertion condition, interpolated: '"1146" = "0"'
    Assertion result: '0'
    
    connection slave;
    Assertion text: 'Value returned by PS table for worker_idle_time should be
                    >= 1'
    Assertion condition: '"0" >= "1"'
    Assertion condition, interpolated: '"0" >= "1"'
    Assertion result: '0'
    
    Fix1:
    ====
    The performance schema table's Last_Seen_Transaction is compared with
    'SELECT gtid_slave_pos'. Since DDLs are not transactional, changes to the
    user table and to the gtid_slave_pos table are not guaranteed to be
    synchronous. To fix the issue, the Gtid_IO_Pos value from the SHOW SLAVE
    STATUS command is used to verify the correctness of the performance
    schema's Last_Seen_Transaction.
    
    Fix2:
    ====
    On error, worker thread information is stored in a backup pool. Access
    to this backup pool should be protected by the 'LOCK_rpl_thread_pool' mutex
    so that a simultaneous START SLAVE cannot destroy the backup pool while it
    is being queried by the performance schema.
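
    The locking rule above can be sketched roughly as follows. This is only an
    illustration: std::mutex stands in for 'LOCK_rpl_thread_pool', and
    ThreadPoolSketch, WorkerInfo and all member names are hypothetical, not
    the server's actual types.

```cpp
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical per-worker record; the real server keeps more state.
struct WorkerInfo {
  int last_error_number = 0;
};

// Hypothetical pool: every access to the backup copy takes the same mutex.
class ThreadPoolSketch {
public:
  // Error path: keep the worker information in the backup pool.
  void save_backup(std::vector<WorkerInfo> workers) {
    std::lock_guard<std::mutex> guard(lock_);
    backup_.reset(new std::vector<WorkerInfo>(std::move(workers)));
  }

  // START SLAVE path: destroy the backup pool.
  void destroy_backup() {
    std::lock_guard<std::mutex> guard(lock_);
    backup_.reset();
  }

  // Performance schema path: copy the data out while holding the mutex,
  // so a concurrent destroy_backup() cannot free it mid-read.
  std::vector<WorkerInfo> snapshot_backup() const {
    std::lock_guard<std::mutex> guard(lock_);
    return backup_ ? *backup_ : std::vector<WorkerInfo>{};
  }

private:
  mutable std::mutex lock_;  // stand-in for LOCK_rpl_thread_pool
  std::unique_ptr<std::vector<WorkerInfo>> backup_;
};
```

    The reader returns a copy rather than a pointer, so nothing refers to the
    backup pool once the mutex is released.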
    
    Fix3:
    ====
    If the performance schema table is queried while a worker is waiting for
    events, at present it just returns the difference between current_time and
    start_time. This is incorrect. It should return worker_idle_time +
    (current_time - start_time).
    
    For example, a worker thread was idle for 10 seconds and then got events
    to process. Upon completion it goes back to the idle state. If the pfs
    table is queried now, it should return the current idle time
    (current_time - start_time) + worker_idle_time.
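
    The idle time arithmetic can be sketched as below. The struct and its
    member names are hypothetical and are not taken from rpl_parallel.h;
    times are plain integer seconds to keep the example small.

```cpp
#include <cstdint>

// Hypothetical bookkeeping for one worker thread.
struct WorkerIdleStats {
  std::int64_t worker_idle_time = 0;  // sum of completed idle periods
  std::int64_t start_time = 0;        // start of the current idle period
  bool is_idle = false;

  // Worker starts waiting for events.
  void begin_idle(std::int64_t now) {
    start_time = now;
    is_idle = true;
  }

  // Worker received events: fold the finished period into the total.
  void end_idle(std::int64_t now) {
    if (is_idle) {
      worker_idle_time += now - start_time;
      is_idle = false;
    }
  }

  // Value reported to a query: accumulated idle time, plus the
  // still-running idle period when the worker is currently waiting.
  std::int64_t idle_time(std::int64_t current_time) const {
    return is_idle ? worker_idle_time + (current_time - start_time)
                   : worker_idle_time;
  }
};
```

    With the example above: after a 10 second idle period and 5 more seconds
    in a new idle period, idle_time() gives 15, not 5.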
rpl_parallel.h 15.1 KB