• Sujatha's avatar
    MDEV-19076: rpl_parallel_temptable result mismatch '-33 optimistic' · 5a2110e7
    Sujatha authored
    Problem:
    ========
    The test now fails with the following trace:
    
    CURRENT_TEST: rpl.rpl_parallel_temptable
    --- /mariadb/10.4/mysql-test/suite/rpl/r/rpl_parallel_temptable.result
    +++ /mariadb/10.4/mysql-test/suite/rpl/r/rpl_parallel_temptable.reject
    @@ -194,7 +194,6 @@
     30    conservative
     31    conservative
     32    optimistic
    -33    optimistic
    
    Analysis:
    =========
    The part of test which fails with result content mismatch is given below.
    
    CREATE TEMPORARY TABLE t4 (a INT PRIMARY KEY) ENGINE=InnoDB;
    INSERT INTO t4 VALUES (32);
    INSERT INTO t4 VALUES (33);
    INSERT INTO t1 SELECT a, "optimistic" FROM t4;
    
    slave_parallel_mode=optimistic
    
    The expectation of the above test script is, INSERT FROM SELECT should read both
    32, 33 and populate table 't1'. But this expectation fails occasionally.
    
    All three INSERT statements are handed over to three different slave parallel
    workers. Temporary tables are not safe for parallel replication. They were
    designed to be visible to one thread only, so have no table locking.  Thus there
    is no protection against two conflicting transactions committing in parallel and
    things like that.
    
    So anything that uses temporary tables will be serialized with anything before
    it, when using parallel replication by using a "wait_for_prior_commit" function
    call. This will ensure that the each transaction is executed sequentially.
    
    But there exists a code path in which the above wait doesn't happen.  Because of
    this at times INSERT from SELECT doesn't wait for the INSERT (33) to complete
    and it completes its executes and enters commit stage.  Hence only row 32 is
    found in those cases resulting in test failure.
    
    The wait needs to be added within "open_temporary_table" call. The code looks
    like this within "open_temporary_table".
    
    Each thread tries to open temporary table in 3 different ways:
    
    case 1: Find a temporary table which is already in use by using
             find_temporary_table(tl) && wait_for_prior_commit()
    case 2: If above failed then try to look for temporary table which is marked for
            free for reuse. This internally calls "wait_for_prior_commit()" if table
            is found.
             find_and_use_tmp_table(tl, &table)
    case 3: If none of the above open a new table handle from table share.
             if (!table && (share= find_tmp_table_share(tl)))
             { table= open_temporary_table(share, tl->get_table_name(), true); }
    
    At present the "wait_for_prior_commit" happens only in case 1 & 2.
    
    Fix:
    ====
    On slave add a call for "wait_for_prior_commit" for case 3.
    
    The above wait on slave will solve the issue. A more detailed fix would be to
    mark temporary tables as not safe for parallel execution on the master side.
    In order to do that, on the master side, mark the Gtid_log_event specific flag
    FL_TRANSACTIONAL to be false all the time. So that they are not scheduled
    parallely.
    5a2110e7
temporary_tables.cc 42.3 KB