• Kirill Smelkov's avatar
    loadAt · eb02632c
    Kirill Smelkov authored
    loadAt is new optional storage interface that is intended to replace loadBefore
    with more clean and uniform semantic. Compared to loadBefore, loadAt:
    
    1) returns data=None and serial of the removal, when loaded object was found to
       be deleted. loadBefore is returning only data=None in such case. This loadAt
       property allows to fix DemoStorage data corruption when whiteouts in overlay
       part were not previously correctly taken into account.
    
       https://github.com/zopefoundation/ZODB/issues/318
    
    2) for regular data records, does not require storages to return next_serial,
       in addition to (data, serial). loadBefore requirement to return both
       serial and next_serial is constraining storages unnecessarily, and,
       while for FileStorage it is free to implement, for other storages it is
       not - for example for NEO and RelStorage, finding out next_serial, after
       looking up oid@at data record, costs one more SQL query:
    
       https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508
       https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482
    
       https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264
       https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199
    
       next_serial is not only about execution overhead - it is semantically
       redundant to be there and can be removed from load return. The reason
       I say that next_serial can be removed is that in ZODB/py the only place,
       that I could find, where next_serial is used on client side is in client
       cache (e.g. in NEO client cache), and that cache can be remade to
       work without using that next_serial at all. In simple words whenever
       after
    
         loadAt(oid, at)  ->  (data, serial)
    
       query, the cache can remember data for oid in [serial, at] range.
    
       Next, when invalidation message from server is received, cache entries,
       that had at == client_head, are extended (at -> new_head) for oids that
       are not present in invalidation message, while for oids that are present
       in invalidation message no such extension is done. This allows to
       maintain cache in correct state, invalidate it when there is a need to
       invalidate, and not to throw away cache entries that should remain live.
       This of course requires ZODB server to include both modified and
       just-created objects into invalidation messages
    
         ( https://github.com/zopefoundation/ZEO/pull/160 ,
           https://github.com/zopefoundation/ZODB/pull/319 ).
    
       Switching to loadAt should thus allow storages like NEO and, maybe,
       RelStorage, to do 2x less SQL queries on every object access.
    
       https://github.com/zopefoundation/ZODB/issues/318#issuecomment-657685745
    
    In other words loadAt unifies return signature to always be
    
       (data, serial)
    
    instead of
    
       POSKeyError				object does not exist at all
       None					object was removed
       (data, serial, next_serial)		regular data record
    
    used by loadBefore.
    
    This patch:
    
    - introduces new interface.
    - introduces ZODB.utils.loadAt helper, that uses either storage.loadAt,
      or, if the storage does not implement loadAt interface, tries to mimic
      loadAt semantic via storage.loadBefore to possible extent + emits
      corresponding warning.
    - converts MVCCAdapter to use loadAt instead of loadBefore.
    - changes DemoStorage to use loadAt, and this way fixes above-mentioned
      data corruption issue; adds corresponding test; converts
      DemoStorage.loadBefore to be a wrapper around DemoStorage.loadAt.
    - adds loadAt implementation to FileStorage and MappingStorage.
    - adapts other tests/code correspondingly.
    
    /cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch
    eb02632c
interfaces.py 55.2 KB