lib/tests/test_zodb.py · 542917d1382a14415914bc22877d6571a04b6f3c · Klaus Wölfel / wendelin.core

bigfile/zodb/ZBlk1: Don't miss to deactivate/free internal .chunktab buckets in loadblkdata() · 542917d1

Kirill Smelkov authored Aug 14, 2016

Commit 13c0c17c (bigfile/zodb: Format #1 which is optimized for small
changes) used a BTree to organize a ZBlk1 block's chunks and left this
note in loadblkdata(): "TODO we are missing to free internal BTree
structures on data load".

nexedi/wendelin.core#3 showed, among other things, that even when we
deactivate ZData objects they stay in memory as ghosts, and that the
same holds for IOBucket objects.

This happens because there is no proper way to deactivate a whole
btree, including its internal bucket objects. Since the internal
buckets are not deactivated, they stay in picklecache and keep
references to ZData objects, so the ZData objects stay in memory even
when explicitly deactivated.

We can fix all of this by implementing a whole-btree deactivation
procedure.

To do so we need to iterate over all btree buckets recursively, but
unfortunately there is no BTree API to access or iterate over a btree's
buckets. We can, however, still get references to the first, top-level
buckets via gc.get_referents(btree) and then scan the buckets further
without hacks.

gc.get_referents(btree) is a hack, but

- it works in O(1) (we only read the pointers held by the btree,
  instead of scanning all gc-tracked objects and deducing them)
- it works reliably if we filter out non-interesting objects.

So in the end it works.
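
The traversal described above can be sketched roughly as follows. This
is only an illustration, not the code from the patch: FakeTree and
FakeBucket are hypothetical stand-ins so the example runs without ZODB,
and deactivate_tree() is a made-up helper name. It also simplifies by
using gc.get_referents() at every level. With real objects one would
presumably pass (IOBTree, IOBucket) as node_types and rely on the
_p_deactivate() method that persistent objects provide.

```python
import gc


class FakeBucket:
    """Hypothetical stand-in for a leaf bucket (for example IOBucket)."""
    __slots__ = ("payload", "next", "ghost")

    def __init__(self, payload, next_bucket=None):
        self.payload = payload      # data the bucket keeps alive
        self.next = next_bucket     # link to the following bucket, if any
        self.ghost = False

    def _p_deactivate(self):
        # Drop the state so nothing is referenced from here any more.
        self.payload = None
        self.next = None
        self.ghost = True


class FakeTree:
    """Hypothetical stand-in for an internal tree node (for example IOBTree)."""
    __slots__ = ("left", "right", "ghost")

    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.ghost = False

    def _p_deactivate(self):
        self.left = None
        self.right = None
        self.ghost = True


def deactivate_tree(root, node_types):
    """Deactivate root and every node reachable from it; return the node count."""
    seen = {}                       # id -> node; also keeps nodes alive meanwhile
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue                # buckets are reachable via parent and sibling
        seen[id(node)] = node
        # Ask the gc which objects this node points to and keep only tree
        # nodes. This must happen before deactivation drops the pointers.
        stack.extend(o for o in gc.get_referents(node) if type(o) in node_types)
        node._p_deactivate()
    return len(seen)


# Two-level tree: root -> 2 inner nodes -> 4 chained leaf buckets.
b4 = FakeBucket("d")
b3 = FakeBucket("c", b4)
b2 = FakeBucket("b", b3)
b1 = FakeBucket("a", b2)
inner1 = FakeTree(b1, b2)
inner2 = FakeTree(b3, b4)
root = FakeTree(inner1, inner2)
nodes = [root, inner1, inner2, b1, b2, b3, b4]

count = deactivate_tree(root, (FakeTree, FakeBucket))
```

Filtering by exact type is what makes the hack safe here:
gc.get_referents() also returns unrelated objects such as the node's
class or stored values, and those are skipped.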

Before the patch, loading more and more ZBlk1 data under objgraph
instrumentation showed growth like

    #                                    Nobj        δ
    wendelin.bigfile.file_zodb.ZData     7168      +512
    BTrees.IOBTree.IOBucket               238       +17
    BTrees.IOBTree.IOBTree                 14        +1

and after this patch we now have

    BTrees.IOBTree.IOBTree                 14        +1
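
For readers without objgraph at hand, a per-type growth report of this
shape can be approximated with the standard gc module alone. The sketch
below is an assumption-laden stand-in, not the instrumentation used for
the numbers above: type_counts(), growth() and the Chunk class are
hypothetical names introduced only for the example.

```python
import gc
from collections import Counter


def type_counts():
    """Count live gc-tracked objects by fully qualified type name."""
    counts = Counter()
    for obj in gc.get_objects():
        t = type(obj)
        counts[f"{t.__module__}.{t.__qualname__}"] += 1
    return counts


def growth(before, after):
    """Return {type name: (count now, delta)} for types whose count grew."""
    return {name: (n, n - before.get(name, 0))
            for name, n in after.items()
            if n > before.get(name, 0)}


class Chunk:
    """Hypothetical object type that leaks, playing the role of ZData."""


before = type_counts()
held = [Chunk() for _ in range(512)]   # simulate loaded data that stays referenced
after = type_counts()

chunk_name = f"{Chunk.__module__}.{Chunk.__qualname__}"
report = growth(before, after)
for name, (nobj, delta) in sorted(report.items()):
    print(f"{name:40s} {nobj:8d} {delta:+8d}")
```

Taking one snapshot per load iteration and diffing consecutive
snapshots gives the Nobj and δ columns.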

We cannot remove that "IOBTree +1", since ZBlk1 holds a direct
reference to it (via .chunktab) and we have to keep ZBlk1 live with
._v_zfile and ._v_zblk set for invalidation to work. The "+1 IOBTree" is
small, however: 144 bytes per 2M (about 0.007%), so we can neglect it
the same way we neglect ZBlk1 itself staying live for each block.
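
The size of that residue can be checked with quick arithmetic. The
sketch assumes "2M" means 2 MiB of block data; the constant names are
made up for the example.

```python
IOBTREE_BYTES = 144               # per-block residue quoted above
BLOCK_BYTES = 2 * 1024 * 1024     # assumption: "2M" is 2 MiB

overhead_percent = 100 * IOBTREE_BYTES / BLOCK_BYTES
print(f"{overhead_percent:.4f}%")  # prints 0.0069%
```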
