    MDEV-10100 main.pool_of_threads fails sporadically in buildbot · 3871477c
    Elena Stepanova authored
    The patch fixes two test failures:
    - on slow builders, a connection attempt that should fail because
      the thread_pool_max_threads limit is exceeded sometimes
      succeeds;
    - on even slower builders, MTR sometimes cannot establish the
      initial connection, and check-testcase fails before the test
      starts.
    
    The problem with check-testcase was caused by connect-timeout=2,
    which was set for all clients in the test config file. On slow
    builders it might not be enough.
    There is no way to override it for the pre-test check, so it needed
    to be substantially increased or removed.
    
    The other problem was caused by a race condition between the sleeps
    that the test performs in existing connections and the connect
    timeout for the connection attempt which was expected to fail.
    If the sleeps finished before the connect timeout expired, a
    thread became free and the connection succeeded.
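The race described above can be modeled with a small timing sketch. It is an illustration only, not code from the test: the function name, the `start_delay` parameter, and all numbers other than the 2-second connect timeout are assumptions made for the example.

```python
def attempt_times_out(sleep_seconds: float,
                      connect_timeout: float,
                      start_delay: float = 0.0) -> bool:
    """Return True if the extra connection attempt fails as the test expects.

    Simplified model (hypothetical): all pool threads start a SLEEP at
    time 0 and stay busy for sleep_seconds. The client starts connecting
    at start_delay (larger on slow builders) and gives up after
    connect_timeout. If a thread becomes free first, the attempt succeeds.
    """
    thread_free_at = sleep_seconds
    client_gives_up_at = start_delay + connect_timeout
    return client_gives_up_at < thread_free_at


# Fast builder: the client gives up at t=2, threads are busy until t=5.
print(attempt_times_out(sleep_seconds=5, connect_timeout=2))
# Slow builder: the client starts at t=4 and would wait until t=6,
# but a thread is free at t=5, so the attempt unexpectedly succeeds.
print(attempt_times_out(sleep_seconds=5, connect_timeout=2, start_delay=4))
# A very long SLEEP leaves a wide margin even with a large delay.
print(attempt_times_out(sleep_seconds=3600, connect_timeout=2, start_delay=60))
```

In this model a longer SLEEP widens the margin, while a longer connect timeout narrows it, which is why a single static timeout cannot serve both the pre-test check and the expected-failure connections.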
    
    To solve each problem without making the other one worse,
    connect-timeout should be configured dynamically during the test.
    Due to the nature of the test (all connections must be busy
    at the moment when we need to change the timeout, and cannot execute
    SET GLOBAL ...), it needs to be done independently from the server.
    
    The solution:
    - recognize 'connect_timeout' as a connection option in mysqltest's
      "connect" command;
    - remove connect-timeout from the test configuration file;
    - use the new connect_timeout option for those connections which
      are expected to fail;
    - re-arrange the test flow to allow running a huge SLEEP
      without affecting the test execution time (because it would be
      interrupted after the main test flow is finished).
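The first item of the solution can be sketched as follows. This is a hypothetical Python model of option recognition, not the actual mysqltest code (which is not written in Python). It assumes that the options field of "connect" is a whitespace-separated list and that the new option has the form connect_timeout=N; the function name and return shape are invented for the example.

```python
from typing import List, Optional, Tuple


def parse_connect_options(field: str) -> Tuple[List[str], Optional[int]]:
    """Split a connect options field into plain flags and a timeout.

    Assumption: options are separated by whitespace, and the timeout is
    written as connect_timeout=<seconds>. Anything else is passed through
    unchanged as a flag.
    """
    flags: List[str] = []
    connect_timeout: Optional[int] = None
    for opt in field.split():
        name, sep, value = opt.partition("=")
        if sep and name.lower() == "connect_timeout":
            connect_timeout = int(value)  # raises ValueError if not a number
        else:
            flags.append(opt)
    return flags, connect_timeout


print(parse_connect_options("SSL connect_timeout=2"))
print(parse_connect_options(""))
```

With a per-connection option like this, only the connections that are expected to fail need a short timeout, and every other client keeps the default.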
    
    The test is still subject to false negatives, e.g. if the connection
    fails due to timeout rather than due to the exceeded number of
    allowed threads, or if the connection on the extra port succeeds due
    to a race condition and not because of the special logic for the
    extra port. But those false negatives have always been possible
    there on slow builders. They should not be critical, because faster
    builders should catch such failures if they appear.
pool_of_threads.result 56.5 KB