    Fix for BUG#24415: Instance manager test im_daemon_life_cycle fails randomly. · 76f813a5
    anozdrin/alik@alik.opbmk authored
    The random failures of im_daemon_life_cycle.imtest were caused by the
    behaviour of some LinuxThreads implementations. Suppose a process has
    several threads (in LinuxThreads, each thread is a separate process). When
    the main process is killed, the parent receives SIGCHLD before all threads
    (child processes) have died. In other words, the parent receives SIGCHLD
    while its child is not yet completely dead.
    
    In terms of IM, this means that IM-angel receives SIGCHLD while IM-main is
    not yet dead and still holds some resources. After receiving SIGCHLD,
    IM-angel restarts IM-main, but the new IM-main fails to initialize because
    the previous instance (copy) of IM-main still holds the server socket
    (TCP port).
    
    Another problem was that IM-angel restarted IM-main only if it had been
    killed by a signal. If IM-main exited with an error, IM-angel treated it as
    an intended (graceful) shutdown and exited as well.
    
    So, when the second instance of IM-main failed to initialize, IM-angel
    treated it as an intended shutdown and quit.
    
    The fix is:
      1. to change IM-angel so that it restarts IM-main if IM-main exits with
         an error code;
      2. to change IM-main so that it returns a proper exit code on failure.
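
The restart rule described in the fix can be sketched roughly as follows. This is an illustration only, not the actual IM-angel code: `should_restart` and `run_child` are hypothetical names, and the sketch assumes that an exit code of 0 means a graceful shutdown while any non-zero code means failure.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <signal.h>
#include <cassert>
#include <cstdlib>

// Hypothetical helper: given a status filled in by waitpid(), decide whether
// the supervising ("angel") process should restart its child.
static bool should_restart(int status) {
  if (WIFSIGNALED(status))
    return true;                       // child was killed by a signal
  if (WIFEXITED(status))
    return WEXITSTATUS(status) != 0;   // child exited with an error code
  return false;                        // anything else: do not restart
}

// Hypothetical helper for experimenting: fork a child that exits with `code`,
// or kills itself with SIGKILL when `code` is negative, and return the
// status reported by waitpid().
static int run_child(int code) {
  pid_t pid = fork();
  if (pid < 0)
    std::abort();                      // fork failed; nothing to wait for
  if (pid == 0) {
    if (code < 0)
      raise(SIGKILL);                  // simulate death by signal
    _exit(code);                       // simulate normal exit or init failure
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid)
    std::abort();
  assert(WIFEXITED(status) || WIFSIGNALED(status));
  return status;
}
```

With this rule, `should_restart(run_child(0))` is false (graceful shutdown), while a child that exits with a non-zero code or dies from a signal is restarted.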
thread_registry.cc 7.36 KB