CMFActivity.Activity.SQLBase: Reduce the number of deadlocks

MariaDB seems to use an inconsistent lock acquisition order when executing
the activity reservation queries. As a consequence, it produces internal
deadlocks, which it detects. Upon detection, it kills one of the involved
queries, which causes message reservation to fail despite the presence of
executable activities.

To avoid depending on MariaDB's internal lock acquisition order, acquire an
explicit table-scoped lock before running the activity reservation queries.

On an otherwise-idle cluster of 31 processing nodes, with the following
activities spawned, designed to stress the activity reservation queries
(many ultra-short activities being executed one at a time):

  active_getTitle = context.getPortalObject().portal_catalog.activate(
    activity='SQLQueue',
    priority=5,
    tag='foo',
  ).getTitle
  for _ in xrange(40000):
    active_getTitle()

the results are:
- a 26% shorter activity execution time: from 206s with the original code
  to 152s
- a 100% reduction in reported deadlocks: from 300 with the original code
  to 0

There is room for further improvements at a later time:
- tweaking the amount of time spent waiting for this new lock to become
  available, set for now at 1s
- possibly bypassing this lock altogether when there are too few processing
  nodes simultaneously enabled, or even in an adaptive reaction to deadlock
  errors actually happening
- covering more write accesses to these tables with the same lock

From a production environment, it appears that the getReservedMessageList
method alone is involved in 95% of these deadlocks, so for now this change
only targets this method.
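
The calling pattern described in this message (try to take a named, table-scoped lock; skip the table if the lock is not obtained in time) can be tried out without a database. The following is only a minimal sketch under stated assumptions: `FakeDB`, `named_lock`, `reserve` and the placeholder message names are hypothetical, and the in-memory stand-in only mimics the 1/0 results of `GET_LOCK` without ever waiting. It is not the code from this commit.

```python
from contextlib import contextmanager


class FakeDB(object):
    """Hypothetical in-memory stand-in for a database connection.

    It only mimics the 1 (acquired) / 0 (timeout) results of a named lock;
    it never actually waits for the timeout to elapse.
    """
    def __init__(self):
        self.held = set()

    def get_lock(self, name, timeout):
        if name in self.held:
            return 0  # someone else holds it: report a timeout
        self.held.add(name)
        return 1

    def release_lock(self, name):
        self.held.discard(name)


@contextmanager
def named_lock(db, name, timeout):
    acquired = db.get_lock(name, timeout)
    try:
        # The outcome is handed to the caller, who must check it.
        yield acquired
    finally:
        # Only release a lock we actually obtained.
        if acquired:
            db.release_lock(name)


def reserve(db, table):
    with named_lock(db, table, timeout=1) as acquired:
        if not acquired:
            # This table is busy: let the caller look for work elsewhere.
            return ()
        # Placeholder for the real reservation queries.
        return ('message-1', 'message-2')
```

Returning an empty result on timeout, rather than raising, matches the stated goal: the lock exists to avoid wasted time, not to guarantee consistency, so a busy table is simply skipped.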

Commit 18b5e4ed authored Sep 17, 2021 by Vincent Pelletier

Showing with 72 additions and 33 deletions

product/CMFActivity/Activity/SQLBase.py +72 -33

--- a/product/CMFActivity/Activity/SQLBase.py
+++ b/product/CMFActivity/Activity/SQLBase.py
@@ -28,6 +28,7 @@ from __future__ import absolute_import
 ##############################################################################

 from collections import defaultdict
+from contextlib import contextmanager
 from itertools import product
 import operator
 import sys
@@ -83,6 +84,24 @@ def render_datetime(x):
 _SQLTEST_NO_QUOTE_TYPE_SET = int, float, long
 _SQLTEST_NON_SEQUENCE_TYPE_SET = _SQLTEST_NO_QUOTE_TYPE_SET + (DateTime, basestring)

+@contextmanager
+def SQLLock(db, lock_name, timeout):
+  """
+  Attempt to acquire a named SQL lock. The outcome of this acquisition is
+  returned to the context statement and MUST be checked:
+  1: lock acquired
+  0: timeout
+  """
+  lock_name = db.string_literal(lock_name)
+  query = db.query
+  (_, ((acquired, ), )) = query('SELECT GET_LOCK(%s, %f)' % (lock_name, timeout))
+  if acquired is None:
+    raise ValueError('Error acquiring lock')
+  try:
+    yield acquired
+  finally:
+    if acquired:
+      query('SELECT RELEASE_LOCK(%s)' % (lock_name, ))
 # sqltest_dict ({'condition_name': <render_function>}) defines how to render
 # condition statements in the SQL query used by SQLBase.getMessageList
 def sqltest_dict():
@@ -648,6 +667,26 @@ CREATE TABLE %s (
            ' AND group_method_id=' + quote(group_method_id)
            if group_method_id else '' , limit)

+    # Note: Not all write accesses to our table are protected by this lock.
+    # This lock is not here for data consistency reasons, but to avoid wasting
+    # time on SQL deadlocks caused by the varied lock ordering chosen by the
+    # database. These queries specifically seem to be extremely prone to such
+    # deadlocks, so prevent them from attempting to run in parallel on a given
+    # activity table.
+    # If more accesses are found to cause a significant waste of time because
+    # of deadlocks, then they should acquire such lock as well. But
+    # preemptively applying such lock everywhere without checking the amount
+    # of waste is unlikely to produce a net gain.
+    # XXX: timeout may benefit from being tweaked, but one second seems like a
+    # reasonable starting point.
+    # XXX: locking could probably be skipped altogether on clusters with few
+    # enough processing nodes, as there should be few deadlocks and the
+    # tradeoff becomes unfavorable to explicit locks. What threshold to
+    # choose ?
+    with SQLLock(db, self.sql_table, timeout=1) as acquired:
+      if not acquired:
+        # This table is busy, check for work to do elsewhere
+        return ()
      # Get reservable messages.
      # During normal operation, sorting by date (as last criteria) is fairer
      # for users and reduce the probability to do the same work several times