• Guido van Rossum's avatar
    In wait(), when there's no asyncore main loop, we called · 8cba5055
    Guido van Rossum authored
    asyncore.poll() with a timeout of 10 seconds.  Change this to a
    variable timeout starting at 1 msec and doubling until 1 second.
    
    While debugging Win2k crashes in the check4ExtStorageThread test from
    ZODB/tests/MTStorage.py, Tim noticed that there were frequent 10
    second gaps in the log file where *nothing* happens.  These were caused
    by the following scenario.
    
    Suppose a ZEO client process has two threads using the same connection
    to the ZEO server, and there's no asyncore loop active.  T1 makes a
    synchronous call, and enters the wait() function.  Then T2 makes
    another synchronous call, and enters the wait() function.  At this
    point, both are blocked in the select() call in asyncore.poll(), with
    a timeout of 10 seconds (in the old version).  Now the replies for
    both calls arrive.  Say T1 wakes up.  The handle_read() method in
    smac.py calls self.recv(8096), so it gets both replies in its buffer,
    decodes both, and calls self.message_input() for both, which sticks
    both replies in the self.replies dict.  Now T1 finds its response, its
    wait() call returns with it.  But T2 is still stuck in
    asyncore.poll(): its select() call never woke up, and has to "sit out"
    the whole timeout of 10 seconds.  (Good thing I added timeouts to
    everything!  Or perhaps not, since it masked the problem.)
    
    One other condition must be satisfied before this becomes a disaster:
    T2 must have started a transaction, and all other threads must be
    waiting to start another transaction.  This is what I saw in the log.
    (Hmm, maybe a message should be logged when a thread is waiting to
    start a transaction this way.)
    
    In a real Zope application, this won't happen, because there's a
    centralized asyncore loop in a separate thread (probably the client's
    main thread) and the various threads would be waiting on the condition
    variable; whenever a reply is inserted in the replies dict, all
    threads are notified.  But in the test suite there's no asyncore loop,
    and I don't feel like adding one.  So the exponential backoff seems
    the easiest "solution".
    8cba5055
connection.py 16.7 KB