Commit b41dc8e0 authored by Vincent Pelletier

SQLCatalog_deferFullTextIndex{,Activity}: Use serialization_tag.

Just like regular indexations, fulltext indexations are subject to last-commit-wins,
which means that it is possible to reach a state where the fulltext table is
persistently desynchronised from ZODB:
- start fulltext indexation activity on many documents (typically: 100)
- modify one of the documents being indexed
- start fulltext indexation activity caused by this edit, and assume indexation only
happens for this object
- commit the single-object indexation (because it is very fast to retrieve fulltext
data from just one document)
- commit the many-objects indexation later (because it is much slower to
retrieve 100 fulltext representations)
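
The sequence above can be replayed with a toy model. This is only an illustrative sketch in plain Python 3: the `zodb` and `fulltext` dictionaries and the `snapshot` and `commit` helpers are hypothetical stand-ins, not ERP5 or ZODB APIs.

```python
# Toy model: "zodb" holds the authoritative text, "fulltext" the index table.
zodb = {"doc/1": "old text", "doc/2": "other"}
fulltext = {}

def snapshot(paths):
    """Read the fulltext representation of each document (activity start)."""
    return {path: zodb[path] for path in paths}

def commit(rows):
    """Write rows to the index table; whichever commit lands last wins."""
    fulltext.update(rows)

# 1. The many-documents activity starts and reads its documents.
batch_rows = snapshot(["doc/1", "doc/2"])
# 2. doc/1 is modified while the batch is still running.
zodb["doc/1"] = "new text"
# 3-4. The single-document activity starts and commits first (it is fast).
commit(snapshot(["doc/1"]))
# 5. The slow batch commits later and overwrites the fresh row.
commit(batch_rows)

stale = [path for path in zodb if fulltext[path] != zodb[path]]
print(stale)  # ['doc/1']
```

Nothing re-triggers indexation of `doc/1` after the last step, which is why the desynchronisation persists until the document is edited again.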

As a consequence, code must spawn one fulltext indexation activity per
document, each with the appropriate serialisation tag. This serialisation tag
must not conflict with regular indexation, so use a fixed prefix.
As a consequence of having to spawn one activity per document, use a
grouping method to still index by batches to amortise transaction overhead.
Keep the same method_id as before for backward-compatibility (maybe
dependencies on this value exist, even though it is bad practice).
Rewrite SQLCatalog_deferFullTextIndexActivity so it works as a grouping
method, simplifying it in the process:
- build parameter_dict with all entries, as we already know all needed keys
- None is not callable, so test "not None" in just one expression
- remove whitespace at end of line
- use GroupedMessage API
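
The combination described in this message (one message per document, a prefixed serialisation tag, batches built by a grouping method) can be pictured with the simplified model below. It is a sketch in plain Python 3 under stated assumptions: `Message`, `make_messages` and `next_batch` are hypothetical names, and the batching rule (at most one message per tag in a batch, tags already running are skipped) is a simplification, not CMFActivity's actual scheduler.

```python
from collections import namedtuple

# Hypothetical message record; real activity messages carry much more state.
Message = namedtuple("Message", "path serialization_tag")

FULL_TEXT_TAG_PREFIX = "full_text_"  # same prefix as in the diff

def make_messages(document_paths, root_path_of):
    """One message per document, tagged with its root document path."""
    return [
        Message(path, FULL_TEXT_TAG_PREFIX + root_path_of[path])
        for path in document_paths
    ]

def next_batch(queue, running_tags, batch_size):
    """Pick up to batch_size messages, at most one per serialisation tag,
    skipping tags that are already being processed elsewhere."""
    batch = []
    seen = set(running_tags)
    for message in queue:
        if message.serialization_tag in seen:
            continue
        seen.add(message.serialization_tag)
        batch.append(message)
        if len(batch) == batch_size:
            break
    return batch

roots = {"doc/1": "doc/1", "doc/1/sub": "doc/1", "doc/2": "doc/2"}
queue = make_messages(["doc/1", "doc/1/sub", "doc/2"], roots)
batch = next_batch(queue, running_tags=(), batch_size=100)
print([m.path for m in batch])  # ['doc/1', 'doc/2']
```

Because the tag carries a fixed prefix, it cannot collide with a tag used by regular indexation of the same document, while two fulltext indexations of the same root document still exclude each other.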
parent 43d16f61
 # This script is called to defer fulltext indexing in a lower priority.
-context.activate(activity='SQLQueue', priority=4, group_method_id=None).SQLCatalog_deferFullTextIndexActivity(path_list=list(getPath))
+GROUP_METHOD_ID = context.getPath() + '/SQLCatalog_deferFullTextIndexActivity'
+for document_value, root_document_path in zip(getObject, getRootDocumentPath):
+  document_value.activate(
+    activity='SQLQueue',
+    priority=4,
+    group_method_id=GROUP_METHOD_ID,
+    serialization_tag='full_text_' + root_document_path,
+  ).SQLCatalog_deferFullTextIndexActivity()
@@ -50,7 +50,13 @@
 </item>
 <item>
 <key> <string>_params</string> </key>
-<value> <string>getPath</string> </value>
+<value> <string>getObject, getRootDocumentPath</string> </value>
+</item>
+<item>
+<key> <string>description</string> </key>
+<value>
+<none/>
+</value>
 </item>
 <item>
 <key> <string>id</string> </key>
......
@@ -4,42 +4,28 @@ from zExceptions import Unauthorized
 method = context.z_catalog_fulltext_list
 property_list = method.arguments_src.split()
-parameter_dict = {}
-failed_path_list = []
-restrictedTraverse = context.getPortalObject().restrictedTraverse
-for path in path_list:
-  if not path: # should happen in tricky testERP5Catalog tests only
-    continue
-  obj = restrictedTraverse(path, None)
-  if obj is None:
-    continue
+parameter_dict = {x: [] for x in property_list}
+for group_object in object_list:
+  obj = group_object.object
+  tmp_dict = {}
   try:
-    tmp_dict = {}
     for property in property_list:
       getter = getattr(obj, property, None)
-      if getter is not None and callable(getter):
+      if callable(getter):
         value = getter()
       else:
         value = getattr(obj, 'get%s' % UpperCase(property))()
       tmp_dict[property] = value
   except ConflictError:
     raise
   except Unauthorized: # should happen in tricky testERP5Catalog tests only
     continue
   except Exception, e:
-    exception = e
-    failed_path_list.append(path)
-  else:
-    for property, value in tmp_dict.items():
-      parameter_dict.setdefault(property, []).append(value)
-if len(failed_path_list):
-  if len(parameter_dict):
-    # reregister activity for failed objects only
-    context.activate(activity='SQLQueue', priority=5).SQLCatalog_deferFullTextIndexActivity(path_list=failed_path_list)
+    group_object.raised()
   else:
-    # if all objects are failed one, just raise an exception to avoid infinite loop.
-    raise AttributeError, 'exception %r raised in indexing %r' % (exception, failed_path_list)
+    for property, value in tmp_dict.iteritems():
+      parameter_dict[property].append(value)
+    group_object.result = None
 if parameter_dict:
   return method(**parameter_dict)
@@ -50,7 +50,7 @@
 </item>
 <item>
 <key> <string>_params</string> </key>
-<value> <string>path_list</string> </value>
+<value> <string>object_list</string> </value>
 </item>
 <item>
 <key> <string>_proxy_roles</string> </key>
@@ -60,6 +60,12 @@
 </tuple>
 </value>
 </item>
+<item>
+<key> <string>description</string> </key>
+<value>
+<none/>
+</value>
+</item>
 <item>
 <key> <string>id</string> </key>
 <value> <string>SQLCatalog_deferFullTextIndexActivity</string> </value>
......
  • mentioned in commit fc2d07de

  • This only fixes erp5_full_text_mroonga_catalog. What about the other full_text BTs?

    /cc @kazuhiko

  • Other fulltext BTs should rather be deleted. They are inferior to mroonga in at least CJK support, so I see supporting them as a waste of time.
