    Make TransactionMetaData in charge of (de)serializing extension data · 2f8cc67a
    Julien Muchembled authored
    IStorage implementations used to do this task themselves, which led to code
    duplication and sometimes bugs (one was fixed recently in NEO). As with object
    serialization, this should be done by the upper layer (Connection).
    
    This commit also provides a way to get raw extension data while iterating
    over transactions (this is actually the original purpose [2]). So far, extension
    data could only be retrieved unpickled, which caused several problems:
    
    - tools like `zodb dump` [1] cannot dump data exactly as it is stored in a
      storage. As a result, a database restored from such a dump is potentially
      not bit-to-bit identical to the original.
    
    - `zodb dump` output could change from run to run on the same
      database. This comes from the fact that e.g. Python dictionaries are
      unordered, so when pickling a dict back to bytes the result may
      differ from the original.

      (This problem can be partly worked around, e.g. for dicts with str
       keys, by always emitting items in sorted key order, but it is hard
       to make that work reliably for arbitrary types.)
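
    The round-trip problem described above can be reproduced with the standard
    pickle module alone. The sketch below is illustrative only: `dumps_sorted`
    and `PROTOCOL` are hypothetical names, not part of ZODB or zodbtools, and it
    assumes a Python version where dicts keep insertion order.

```python
import pickle

PROTOCOL = 2  # assumption for this sketch; any fixed protocol works


def dumps_sorted(ext):
    """Pickle a dict with its items emitted in sorted key order (hypothetical helper)."""
    return pickle.dumps(dict(sorted(ext.items())), PROTOCOL)


# Two equal dicts whose items were inserted in a different order.
a = {"user": "alice", "note": "x"}
b = {"note": "x", "user": "alice"}

raw_a = pickle.dumps(a, PROTOCOL)
raw_b = pickle.dumps(b, PROTOCOL)

# Equal values, different bytes: unpickling and repickling cannot be
# relied on to give back the bytes that were originally stored.
differs = raw_a != raw_b

# Sorting str keys before pickling makes the output stable...
stable = dumps_sorted(a) == dumps_sorted(b)

# ...but keys of mixed types cannot be ordered, so the workaround breaks.
try:
    dumps_sorted({1: "int key", "a": "str key"})
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

    Keeping the stored bytes untouched avoids the question of ordering
    altogether, which is the approach this commit takes.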
    
    Both issues make it hard to verify the integrity of a database at the lowest
    possible level after restoration, and to verify bit-to-bit compatibility
    with non-Python ZODB implementations.
    
    For this, TransactionMetaData gets a new 'extension_bytes' attribute and
    common usage becomes:
    
    * Application committing a transaction:
    
      - 'extension' is set with a dictionary
      - the storage gets the bytes via 'extension_bytes'
    
    * Iteration:
    
      - the storage passes bytes as the 'extension' parameter of TransactionMetaData
      - the application can get extension data either as bytes ('extension_bytes')
        or deserialized ('extension'): in the former case, no deserialization
        happens and the returned value is exactly what was passed by the storage
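
    The two usage patterns above can be sketched with a small stand-in class.
    `MetaDataSketch` is hypothetical and is not the TransactionMetaData code
    from this commit; it only mirrors the described behaviour of 'extension'
    and 'extension_bytes', and the pickle protocols used are assumptions.

```python
import pickle


class MetaDataSketch:
    """Hypothetical stand-in mirroring the described attribute behaviour."""

    def __init__(self, extension=None):
        if isinstance(extension, bytes):
            # Iteration: the storage hands over raw bytes, kept untouched.
            self.extension_bytes = extension
        else:
            # Commit: the application provides a dictionary (or nothing).
            self.extension = extension or {}

    @property
    def extension(self):
        # Deserialize lazily, only when the application asks for the dict.
        raw = self.extension_bytes
        return pickle.loads(raw) if raw else {}

    @extension.setter
    def extension(self, value):
        # Serialize once, in the upper layer; empty data is stored as b"".
        self.extension_bytes = pickle.dumps(value, 2) if value else b""


# Application committing a transaction: set a dict, the storage reads bytes.
txn = MetaDataSketch()
txn.extension = {"reason": "import"}
stored = txn.extension_bytes

# Iteration: the storage passes whatever bytes it holds, however they
# were produced; here a pickle written with a different protocol.
raw = pickle.dumps({"b": 1, "a": 2}, 1)
seen = MetaDataSketch(raw)
same_bytes = seen.extension_bytes == raw   # no deserialization involved
as_dict = seen.extension                   # unpickled on demand
```

    With this shape a dump tool would read only 'extension_bytes' and never
    trigger unpickling, while existing applications keep using 'extension'.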
    
    [1] https://lab.nexedi.com/nexedi/zodbtools
    [2] https://github.com/zopefoundation/ZODB/pull/183

    Co-Authored-By: Kirill Smelkov <kirr@nexedi.com>