Folder: Make recursiveReindexObject scalable by calling _recurseCallMethod.
Should make Folder_reindexAll and most custom indexation methods obsolete.

Remaining valid reindexation methods are:
- reindexObject: for a single document, which may contain subdocuments whose indexation is not necessary
- recursiveReindexObject: for any subtree of documents
- ERP5Site_reindexAll: for site-wide reindexations, as there is a semantic-dependent indexation order.

Also, uniformise and factorise spawning immediateReindexObject.

Also:
- testSupply: Drop check for the previous magic threshold. _recurseCallMethod takes care of it all now.
- testXMLMatrix: Let activities execute before changing cell id. This worked only because recursiveReindexObject on the matrix spawned a single recursiveImmediateReindexObject activity on that context. Now, up to 1k immediateReindexObject activities (for the first 1k sub-objects) are spawned immediately, preventing their renaming immediately after commit. So let the test wait for indexation before trying to rename.
- testERP5Security: More activities are now spawned immediately, adapt.
-
Owner
With this commit, some objects that were not indexed are now indexed, and this fails when these objects can't be indexed (maybe for bad reasons).
We found this for a PythonScript that's stored inside a Web Site:
    Traceback (innermost last):
      Module Products.CMFActivity.ActivityTool, line 1360, in invokeGroup
        traverse(method_id)(expanded_object_list)
      Module Products.ERP5Catalog.CatalogTool, line 939, in catalogObjectList
        super(CatalogTool, self).catalogObjectList(tmp_object_list, **m.kw)
      Module Products.ZSQLCatalog.ZSQLCatalog, line 828, in catalogObjectList
        catalog_value=catalog,
      Module Products.ERP5Catalog.CatalogTool, line 875, in wrapObjectList
       - __traceback_info__: (<PythonScript at /tristan_test/web_site_module/nexedi_global/WebSite_setSkin>,)
        and document_object._getAcquireLocalRoles():
    AttributeError: _getAcquireLocalRoles
Traceback of the activity creation:
    ...
      File "product/ERP5Type/Core/Folder.py", line 1244, in recursiveReindexObject
        skip_method_id='_isDocumentNonIndexable',
      File "product/ERP5Type/Core/Folder.py", line 520, in _recurseCallMethod
        recurse(self, 0)
      File "product/ERP5Type/Core/Folder.py", line 506, in recurse
        recurse(ob, depth + 1)
      File "product/ERP5Type/Core/Folder.py", line 517, in recurse
        method_id)(*method_args, **method_kw)
      File "product/CMFActivity/ActivityTool.py", line 546, in __call__
        portal_activities=portal_activities,
What happens is that before, the recursion was done with:
    if getattr(aq_base(c), 'recursiveReindexObject', None) is not None:
      c.recursiveReindexObject(**kw)
(the object is skipped because it does not have a recursiveReindexObject method) and now the recursion is done within _recurseCallMethod: skip_method_id is only considered for the root object. Well, not exactly: I guess that if the recursion is split at such an object, we'll get an AttributeError because PythonScript does not have _isDocumentNonIndexable.
So there's at least one bug in the recent changes of _recurseCallMethod: it must ignore self except for placeless things, or maybe also for the initial call (e.g. skip_method_id = kw.pop('skip_method_id', None)).
Do we want such objects indexed?
/cc @tc
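To make the behavioural change above concrete, here is a minimal standalone sketch. It does not use Zope or ERP5: the Document and Script classes and the recurse_call function are hypothetical stand-ins, and the real _recurseCallMethod does considerably more (activities, splitting, skip methods).

```python
class Document:
    """Hypothetical stand-in for an ERP5 document (not the real class)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def recursiveReindexObject(self, indexed):
        # Old style: each object drives the recursion into its own children.
        indexed.append(self.name)
        for child in self.children:
            # Children lacking the method (e.g. a script) are silently skipped.
            if getattr(child, 'recursiveReindexObject', None) is not None:
                child.recursiveReindexObject(indexed)


class Script:
    """Hypothetical stand-in for a PythonScript: no reindexation API at all."""

    def __init__(self, name):
        self.name = name
        self.children = []


def recurse_call(node, indexed):
    # New style: one external walker visits every sub-object,
    # whether or not that object knows anything about indexation.
    indexed.append(node.name)
    for child in node.children:
        recurse_call(child, indexed)


site = Document('web_site', [Document('page'), Script('WebSite_setSkin')])

old_style = []
site.recursiveReindexObject(old_style)

new_style = []
recurse_call(site, new_style)

print(old_style)
print(new_style)
```

In this toy model the script only shows up in the second list, which mirrors how an object that used to be skipped now reaches the catalog and fails there.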
-
Owner
> skip_method_id is only considered for the root object
But it acts on its argument (document, and not self), which is the current recursion level. So I am not sure what you mean here. This was made this way to be consistent with how _recurseCallMethod works: it is always called on the same context, and recursion depth & location are conveyed by its arguments.
> (e.g. skip_method_id = kw.pop('skip_method_id', None))
The intent is really to check on each level whether recursion should happen, which the current code does AFAIK. So popping (so the check does not happen on the recursed subtree) is not what I intend for this method.
For example, I want to be able to call portal_trash.recursiveReindexObject(), which will only reindex portal_trash/* but not portal_trash/*/**.
EDIT: The reason why portal_trash/*/** is not indexed is that TrashBin.isSubtreeIndexable returns False: the TrashBins themselves are indexable, but none of their content is.
> Do we want such object indexed?
I have no fundamental objection to indexing these: scripts in websites have a canonical path which can be reconstructed anywhere (even if it has to involve the same acquisition workarounds as when editing web pages), unlike (famously) everything under portal_skins.
So to me the fix is to exclude the whole subtree when encountering an object which cannot be indexed (as is visibly the case here).
The rationale for skipping the entire subtree here is that a non-indexable object does not have a (meaningful) uid, so parent_uid of their immediate children will be meaningless, so the subtree cannot be correctly indexed.
The next level of the fix, which is optional to me but I have nothing against it either, is to make these Python Scripts indexable.
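The two pruning rules discussed in this comment (index a trash bin but not its content; drop a non-indexable object together with its whole subtree) can be sketched as follows. This is only an illustration under stated assumptions: Node, its indexable and subtree_indexable flags, and paths_to_index are hypothetical names, not ERP5 code.

```python
class Node:
    """Hypothetical tree node; not an ERP5 class."""

    def __init__(self, path, children=(), indexable=True,
                 subtree_indexable=True):
        self.path = path
        self.children = list(children)
        self.indexable = indexable                  # can this object be catalogued?
        self.subtree_indexable = subtree_indexable  # may its content be catalogued?


def paths_to_index(root):
    """Collect paths to index, pruning subtrees that must not be indexed."""
    found = []

    def recurse(node):
        if not node.indexable:
            # No meaningful uid here, so children would get a meaningless
            # parent_uid: drop the node together with its whole subtree.
            return
        found.append(node.path)
        if node.subtree_indexable:
            for child in node.children:
                recurse(child)

    recurse(root)
    return found


trash = Node('portal_trash', [
    Node('portal_trash/bin_1', [Node('portal_trash/bin_1/item')],
         subtree_indexable=False),
])
site = Node('web_site', [
    Node('web_site/script', [Node('web_site/script/child')], indexable=False),
    Node('web_site/page'),
])

trash_paths = paths_to_index(trash)
site_paths = paths_to_index(site)
print(trash_paths)
print(site_paths)
```

Here the bin is listed but its item is not, and neither the script nor anything below it is listed.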
-
Owner
About skip_method_id, I misread the code and kw.pop('skip_method_id', None) would be wrong. _isDocumentNonIndexable is really called for the script, and isSubtreeIndexable is acquired and returns True. I also made a mistake about what happens in case of split: contrary to expand (in simulation), _recurseCallMethod is always resumed on the root object.
> So to me the fix is to exclude the whole subtree when encountering an object which cannot be indexed (as is visibly the case here).
Yes, and that can be done by just changing _isDocumentNonIndexable. Maybe:
    def _isDocumentNonIndexable(self, document):
      return (getattr(aq_base(document), 'isSubtreeIndexable', None) is None
        or not document.isSubtreeIndexable())
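Why the extra aq_base check matters can be shown with a toy model of acquisition. Everything below is an assumption made for illustration: Wrapper and this aq_base are simplified mock-ups, not the real Acquisition package, and the two helper functions are plain functions rather than Folder methods.

```python
class Wrapper:
    """Toy acquisition wrapper (NOT the real Acquisition package)."""

    def __init__(self, obj, parent):
        self._obj = obj
        self._parent = parent

    def __getattr__(self, name):
        # Called only when normal lookup fails: try the wrapped object
        # first, then fall back to the parent, as implicit acquisition does.
        try:
            return getattr(self._obj, name)
        except AttributeError:
            return getattr(self._parent, name)


def aq_base(obj):
    """Mock of aq_base: strip the toy wrapper if there is one."""
    return obj._obj if isinstance(obj, Wrapper) else obj


class Folder:
    def isSubtreeIndexable(self):
        return True


class Script:
    """Has no indexation API of its own."""


def is_non_indexable_naive(document):
    # Relies on whatever attribute lookup finds, including acquired methods.
    return not document.isSubtreeIndexable()


def is_non_indexable_fixed(document):
    # Same shape as the proposal above: the method must exist on the
    # unwrapped object itself, otherwise the document is non-indexable.
    return (getattr(aq_base(document), 'isSubtreeIndexable', None) is None
            or not document.isSubtreeIndexable())


folder = Folder()
script = Wrapper(Script(), folder)

naive_result = is_non_indexable_naive(script)
fixed_result = is_non_indexable_fixed(script)
folder_result = is_non_indexable_fixed(Wrapper(Folder(), folder))
print(naive_result, fixed_result, folder_result)
```

In this model the naive check sees the folder's method through the wrapper and treats the script as indexable, while the aq_base variant reports it as non-indexable and still accepts a wrapped folder.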
-
Owner
Looks good to me, yes.
I gave a shot at factorising such code pattern (get attribute on aq_base, then get attribute again on wrapped object), but I did not get very convincing results. I mostly found a heap of useless aq_base imports in the process.
-
Owner
> I gave a shot at factorising such code pattern (get attribute on aq_base, then get attribute again on wrapped object), but I did not get very convincing results.
There's aq_explicit but it does not exist as a function so the object must be in an acquisition wrapper. Here, we could do:
    def _isDocumentNonIndexable(self, document):
      try:
        isSubtreeIndexable = document.aq_explicit.isSubtreeIndexable
      except AttributeError:
        return True
      return not isSubtreeIndexable()
-
Owner
> There's aq_explicit
But then doesn't self (from inside isSubtreeIndexable) also become an explicit acquisition wrapper?
I think we do not care for this specific instance, as such methods should be simple enough without relying on implicit acquisition. For other uses (either other uses of this pattern, although I found fewer than I expected, or for possible future evolution of this use) it could be inappropriate.
-
Owner
You're right. That would be wrong to get methods.
    ipdb> self
    <Person Module at /erp5/person_module>
    ipdb> self.portal_catalog
    <Catalog Tool at /erp5/portal_catalog used for /erp5/person_module>
    ipdb> self.aq_explicit.portal_catalog
    *** AttributeError: portal_catalog
    ipdb> self.aq_explicit.getObject.__self__
    <Person Module at /erp5/person_module>
    ipdb> self.aq_explicit.getObject.__self__.portal_catalog
    *** AttributeError: portal_catalog