Commit 899b967f authored by Tim Peters

Port from ZODB 3.2.

Fixed several thread and asyncore races in ZEO's connection dance.

ZEO/tests/ConnectionTests.py
    The pollUp() and pollDown() methods were pure busy loops whenever
    the asyncore socket map was empty, and at least on some flavors of
    Linux that starved the other thread(s) trying to do real work.
    This grossly increased the time needed to run tests using these, and
    sometimes caused bogus "timed out" test failures.
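
A minimal sketch of the cure described above, written for modern Python 3 rather than the 2005 code base. The helper name `poll_until` and its parameters are hypothetical; the real tests call `asyncore.poll()` inside the loop, which is left out here. The point is only that a short sleep per iteration keeps the loop from monopolizing the CPU while another thread does the real work.

```python
import threading
import time


def poll_until(predicate, timeout=30.0, interval=0.1):
    """Return True once predicate() is true, False if the timeout expires.

    Hypothetical helper, not part of ZEO.
    """
    giveup = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() > giveup:
            return False
        # Without this sleep the loop is a pure busy loop whenever the
        # check returns immediately, which can starve other threads.
        time.sleep(interval)
    return True


# Simulate "another thread" that establishes the connection a bit later.
connected = threading.Event()
threading.Timer(0.2, connected.set).start()
ok = poll_until(connected.is_set, timeout=5.0)
```

The 0.1 second interval mirrors the `time.sleep(0.1)` added by this commit; any small positive value would serve the same purpose.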

ZEO/zrpc/client.py
ZEO/zrpc/connection.py
    Renamed class ManagedConnection to ManagedClientConnection, for clarity.

    Moved the comment block about protocol negotiation from the guts of
    ManagedClientConnection to before the Connection base class -- the
    Connection constructor can't be understood without this context.  Added
    more words about the delicate protocol negotiation dance.

    Connection class:  made this an abstract base class.  Derived classes
    _must_ implement the handshake() method.  There was really nothing in
    common between server and client wrt what handshake() needs to do, and
    it was confusing for one of them to use the base class handshake() while
    the other replaced handshake() completely.
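
The shape of that refactoring can be sketched as follows. This uses the modern `abc` module, which is an assumption made for illustration: the commit does not say how the base class enforces the requirement. The class names other than `Connection` and the return strings are hypothetical stand-ins, not the actual zrpc code.

```python
from abc import ABC, abstractmethod


class Connection(ABC):
    """Hypothetical stand-in for the zrpc Connection base class."""

    @abstractmethod
    def handshake(self):
        """Each side must supply its own protocol negotiation."""


class ServerSide(Connection):
    def handshake(self):
        # Illustrative only: the real server-side logic is not shown here.
        return "server handshake"


class ClientSide(Connection):
    def handshake(self):
        # Illustrative only: the real client-side logic is not shown here.
        return "client handshake"


results = [conn.handshake() for conn in (ServerSide(), ClientSide())]
```

With this layout, instantiating `Connection` directly raises `TypeError`, so neither side can silently inherit a handshake that was written for the other.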

    Connection.__init__:  It isn't safe to register with asyncore's socket
    map before special-casing for the first (protocol handshake) message is
    set up.  Repaired that.  Also removed the pointless "optionalness" of
    the optional arguments.

    ManagedClientConnection.__init__:  Added machinery to set up correct
    (thread-safe) message queueing.  There was an unrepairable hole before,
    in the transition between "I'm queueing msgs waiting for the server
    handshake" and "I'm done queueing messages":  it was impossible to know
    whether any calls to the client's "queue a message" method were in
    progress (in other threads), so impossible to make the transition safely
    in all cases.  The client had to grow its own message_output() method,
    with a mutex protecting the transition from thread races.
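
A rough sketch of the idea, assuming a single lock guards both the "am I still queueing?" flag and the queue itself. Only the method name `message_output` comes from the commit message; the class name, attribute names, and `handshake_done` are hypothetical, and the real ManagedClientConnection may be structured differently.

```python
import threading


class QueueingClient:
    """Hypothetical sketch; not the actual ManagedClientConnection."""

    def __init__(self):
        self._lock = threading.Lock()
        self._queueing = True   # True until the server handshake arrives
        self._pending = []      # messages held back during the handshake
        self.sent = []          # stands in for writing to the socket

    def message_output(self, msg):
        # The flag is checked and acted on under the lock, so no call can
        # observe "queueing" and then append after the flush has happened.
        with self._lock:
            if self._queueing:
                self._pending.append(msg)
            else:
                self.sent.append(msg)

    def handshake_done(self):
        # Flush and flip the flag atomically with respect to message_output.
        with self._lock:
            self.sent.extend(self._pending)
            self._pending = []
            self._queueing = False


client = QueueingClient()
client.message_output("a")
client.message_output("b")
client.handshake_done()
client.message_output("c")
```

Because the flush and the flag change happen under the same lock that every sender takes, a message is either queued before the flush or sent after it, never lost in between.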

    Changed zrpc-conn log messages to include "(S)" for server-side or
    "(C)" for client-side.  This is especially helpful for figuring out
    logs produced while running the test suite (the server and client
    log messages end up in the same file then).
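
One way to get such a tag onto every message is a `logging.LoggerAdapter`, sketched below. This is an assumption for illustration: the commit does not show how zrpc builds its log lines, and the logger name, adapter class, and exact message layout here are hypothetical.

```python
import logging


class SideTaggedAdapter(logging.LoggerAdapter):
    """Prefix each message with a side tag such as "(S)" or "(C)"."""

    def process(self, msg, kwargs):
        return "%s %s" % (self.extra["side"], msg), kwargs


base = logging.getLogger("zrpc-conn-sketch")  # hypothetical logger name
server_log = SideTaggedAdapter(base, {"side": "(S)"})
client_log = SideTaggedAdapter(base, {"side": "(C)"})

# Both adapters write to the same underlying logger, as happens when the
# test suite runs server and client in one process.
server_log.info("handshake sent")
client_log.info("handshake received")
```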
parent 0dcb109d
What's new in ZODB3 3.3.1a2?
============================
Release date: DD-MMM-2005
ZEO
---
Repaired subtle race conditions in establishing ZEO connections, both client-
and server-side. These account for intermittent cases where ZEO failed
to make a connection (or reconnection), accompanied by a log message showing
an error caught in ``asyncore`` and having a traceback ending with:
``UnpicklingError: invalid load key, 'Z'.``
or:
``ZRPCError: bad handshake '(K\x00K\x00U\x0fgetAuthProtocol)t.'``
or:
``error: (9, 'Bad file descriptor')``
or an ``AttributeError``.
These were exacerbated when running the test suite, because of an unintended
busy loop in the test scaffolding, which could starve the thread trying to
make a connection. The ZEO reconnection tests may run much faster now,
depending on platform, and should suffer far fewer (if any) intermittent
"timed out waiting for storage to connect" failures.
What's new in ZODB3 3.3.1a1?
============================
Release date: 11-Jan-2005
ZEO/tests/ConnectionTests.py
@@ -216,7 +216,7 @@ class CommonSetupTearDown(StorageTestBase):
     def pollUp(self, timeout=30.0, storage=None):
         if storage is None:
             storage = self._storage
-        # Poll until we're connected
+        # Poll until we're connected.
         now = time.time()
         giveup = now + timeout
         while not storage.is_connected():
@@ -224,9 +224,15 @@ class CommonSetupTearDown(StorageTestBase):
             now = time.time()
             if now > giveup:
                 self.fail("timed out waiting for storage to connect")
+            # When the socket map is empty, poll() returns immediately,
+            # and this is a pure busy-loop then. At least on some Linux
+            # flavors, that can starve the thread trying to connect,
+            # leading to grossly increased runtime (typical) or bogus
+            # "timed out" failures. A little sleep here cures both.
+            time.sleep(0.1)
     def pollDown(self, timeout=30.0):
-        # Poll until we're disconnected
+        # Poll until we're disconnected.
         now = time.time()
         giveup = now + timeout
         while self._storage.is_connected():
@@ -234,6 +240,8 @@ class CommonSetupTearDown(StorageTestBase):
             now = time.time()
             if now > giveup:
                 self.fail("timed out waiting for storage to disconnect")
+            # See pollUp() for why we sleep a little here.
+            time.sleep(0.1)
 class ConnectionTests(CommonSetupTearDown):
ZEO/zrpc/client.py
@@ -27,7 +27,7 @@ from ZODB.loglevels import BLATHER
 from ZEO.zrpc.log import log
 from ZEO.zrpc.trigger import trigger
-from ZEO.zrpc.connection import ManagedConnection
+from ZEO.zrpc.connection import ManagedClientConnection
 class ConnectionManager(object):
     """Keeps a connection up over time"""
@@ -476,8 +476,8 @@ class ConnectWrapper:
         Call the client's testConnection(), giving the client a chance
         to do app-level check of the connection.
         """
-        self.conn = ManagedConnection(self.sock, self.addr,
-                                      self.client, self.mgr)
+        self.conn = ManagedClientConnection(self.sock, self.addr,
+                                            self.client, self.mgr)
         self.sock = None # The socket is now owned by the connection
         try:
             self.preferred = self.client.testConnection(self.conn)